Convert text to Unicode code points and back. Enter any string to get its Unicode escape sequences, or paste Unicode codes to decode them back to readable text.
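As a rough illustration of this round trip, here is a minimal Python sketch. The function names `to_code_points` and `from_code_points` are hypothetical, and the `U+XXXX` notation is an assumed output format; the tool itself may use a different escape style such as `\uXXXX`.

```python
import re

def to_code_points(text: str) -> str:
    """Return space-separated U+XXXX notation, one token per code point."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)

def from_code_points(codes: str) -> str:
    """Decode every U+XXXX token (4 to 6 hex digits) found in the input."""
    tokens = re.findall(r"U\+([0-9A-Fa-f]{4,6})", codes)
    return "".join(chr(int(token, 16)) for token in tokens)

encoded = to_code_points("Hi \U0001F600")
print(encoded)                    # U+0048 U+0069 U+0020 U+1F600
print(from_code_points(encoded))  # prints the original string back
```

Anything in the input that does not match a `U+` token is ignored by this sketch, so separators such as spaces or commas between codes are both accepted.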
Unicode is a universal character encoding standard that assigns a unique code point to each character it covers, spanning the scripts of most of the world's written languages along with symbols and emoji.
Unicode assigns code points (e.g. U+0041 for A). UTF-8 is an encoding that represents those code points as 1-4 bytes.
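The difference between a code point and its UTF-8 bytes can be seen with a short Python sketch. The helper name `utf8_bytes` is made up for this example; `ord` and `str.encode` are standard Python.

```python
def utf8_bytes(ch: str) -> str:
    """Return the UTF-8 encoding of ch as space-separated hex bytes."""
    return " ".join(f"{b:02X}" for b in ch.encode("utf-8"))

# One sample character for each UTF-8 length, written as escapes
# so the example does not depend on the source file's own encoding.
for ch in ["A", "\u00e9", "\u4e2d", "\U0001F600"]:
    print(f"U+{ord(ch):04X} -> {utf8_bytes(ch)}")

# U+0041 -> 41
# U+00E9 -> C3 A9
# U+4E2D -> E4 B8 AD
# U+1F600 -> F0 9F 98 80
```

The code point is the same number everywhere; only the byte sequence depends on the encoding chosen.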
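Emoji and special characters are drawn by the operating system's fonts and rendering engine, so the same Unicode code point can look noticeably different on Apple, Google, and Microsoft platforms. That is a difference in glyph design, not in encoding. Garbled text (mojibake) is a separate problem caused by mismatched encodings, and using UTF-8 consistently resolves most of these cross-platform issues: always specify UTF-8 as your document and database encoding.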
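Unicode is a character set standard whose code space runs from U+0000 to U+10FFFF. That leaves room for 1,114,112 code points, of which only a fraction are currently assigned to characters. UTF-8, UTF-16, and UTF-32 are different encoding schemes that implement Unicode. UTF-8: variable-length, 1 byte for ASCII, 2 bytes for most other European and Middle Eastern scripts, 3 bytes for most CJK characters, 4 bytes for emoji and other characters outside the basic plane; backward compatible with ASCII and the de facto web standard. UTF-16: variable-length, 2 bytes for the basic plane, 4 bytes (a surrogate pair) for everything else; used for strings in JavaScript and Java. UTF-32: fixed 4 bytes per code point, simple but space-wasteful.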
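To make the size trade-offs concrete, the following Python sketch counts the bytes each encoding needs for a single character. `encoded_sizes` is a hypothetical helper; the little-endian codec variants are used only so that Python does not prepend a byte order mark, which would inflate the counts.

```python
def encoded_sizes(ch: str) -> dict:
    """Return the number of bytes ch occupies in UTF-8, UTF-16, and UTF-32."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-le")),  # -le: no byte order mark
        "utf-32": len(ch.encode("utf-32-le")),
    }

print(encoded_sizes("A"))           # {'utf-8': 1, 'utf-16': 2, 'utf-32': 4}
print(encoded_sizes("\u4e2d"))      # {'utf-8': 3, 'utf-16': 2, 'utf-32': 4}
print(encoded_sizes("\U0001F600"))  # {'utf-8': 4, 'utf-16': 4, 'utf-32': 4}
```

The last line shows the surrogate-pair case: a character outside the basic plane costs 4 bytes in all three encodings.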