UTF8MB4
Introduced in MySQL 5.5.3, utf8mb4 is MySQL's full implementation of the UTF-8 character encoding scheme. While MySQL's older utf8 character set can encode only characters in the Basic Multilingual Plane (BMP), utf8mb4 can encode the full range of Unicode characters, including emojis and other characters outside the BMP.
utf8mb4
: A UTF-8 encoding of the Unicode character set using one to four bytes per character.
utf8mb3
: A UTF-8 encoding of the Unicode character set using one to three bytes per character.
In MySQL, utf8 is currently an alias for utf8mb3, which is deprecated and will be removed in a future MySQL release. At that point, utf8 will become a reference to utf8mb4.
So regardless of this alias, you can explicitly set the utf8mb4 encoding yourself.
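As a rough sketch of what setting utf8mb4 explicitly can look like, the Python helper below only builds the MySQL statements as strings; it does not connect to a server. The function name utf8mb4_statements, the database name wordpress, the table names, and the choice of the utf8mb4_unicode_ci collation are illustrative assumptions, not something prescribed by MySQL or by this post.

```python
def utf8mb4_statements(database, tables, collation="utf8mb4_unicode_ci"):
    """Build MySQL statements that switch a database and its tables to utf8mb4.

    The collation default is an assumption; pick whichever utf8mb4 collation
    suits your data.
    """
    def quote(identifier):
        # Backtick-quote an identifier, doubling any embedded backticks.
        return "`" + identifier.replace("`", "``") + "`"

    # Change the default character set used for new tables in the database.
    statements = [
        f"ALTER DATABASE {quote(database)} CHARACTER SET utf8mb4 COLLATE {collation};"
    ]
    # Convert each existing table, including its text columns.
    for table in tables:
        statements.append(
            f"ALTER TABLE {quote(table)} CONVERT TO CHARACTER SET utf8mb4 COLLATE {collation};"
        )
    return statements


if __name__ == "__main__":
    # Hypothetical database and table names, used only for illustration.
    for statement in utf8mb4_statements("wordpress", ["wp_posts", "wp_comments"]):
        print(statement)
```

The printed statements can then be run in a MySQL client. Take a backup first, since converting a table rewrites its data.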
UTF-8 is a variable-length encoding: storing one code point requires one to four bytes. However, MySQL's character set called "utf8" (an alias of "utf8mb3") stores a maximum of three bytes per code point.
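The byte counts are easy to check outside MySQL. The short Python sketch below measures how many bytes a few sample characters take in UTF-8 and tests whether a string would fit within a three-byte-per-code-point limit. The helper names utf8_length and fits_in_utf8mb3 are made up for this illustration.

```python
def utf8_length(char):
    """Return the number of bytes one character occupies in UTF-8."""
    return len(char.encode("utf-8"))


def fits_in_utf8mb3(text):
    """Return True if every code point in text needs at most three UTF-8 bytes."""
    return all(utf8_length(ch) <= 3 for ch in text)


if __name__ == "__main__":
    # Latin letter, accented letter, euro sign, emoji (outside the BMP).
    samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]
    for ch in samples:
        print(f"U+{ord(ch):04X} takes {utf8_length(ch)} byte(s)")

    print(fits_in_utf8mb3("caf\u00e9"))         # only BMP characters
    print(fits_in_utf8mb3("hi \U0001F600"))     # contains an emoji
```

Only the emoji needs four bytes, which is exactly the case a three-byte character set cannot store.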
The utf8mb4 character set is useful because nowadays we need to store not only language characters but also symbols, newly introduced emojis, and so on.
Cloud SQL – GCP
Google Cloud Platform (GCP) by default enables and uses utf8 when creating a database. That is an alias for utf8mb3, which is okay for most cases.
I was trying this out with WordPress, and it was not enough for my needs.
So I searched for a better option and found that utf8mb4 was the go-to solution.