Character encoding conversion

Batch display of Unicode, UTF-8, UTF-16, Shift_JIS, and decimal code points of characters. Reverse conversion is also supported.

Usage and Application Examples

  • Character → Code: List Unicode, UTF-8, UTF-16, Shift_JIS, and decimal codes one character at a time as you type text
  • Codes → Characters: Recover original characters from codes such as U+3042 and E3 81 82
  • Garbled-text investigation: Compare the byte sequences in each encoding to find the cause of garbled characters (mojibake)
  • Development: Useful for checking escape sequences and byte sequences in programs
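
The character-to-code direction can be sketched in a few lines of Python. This is only an illustration of the idea, not the tool's actual implementation: the function name `describe`, the big-endian UTF-16 output without a byte order mark, and the space-separated uppercase hex layout are all assumptions.

```python
def describe(ch: str) -> dict:
    """Collect several representations of a single character."""
    cp = ord(ch)  # the Unicode code point as an integer
    try:
        shift_jis = ch.encode("shift_jis").hex(" ").upper()
    except UnicodeEncodeError:
        shift_jis = "N/A"  # the character has no Shift_JIS mapping
    return {
        "unicode": f"U+{cp:04X}",
        "utf8": ch.encode("utf-8").hex(" ").upper(),
        # Assumption: big-endian byte order, no byte order mark.
        "utf16": ch.encode("utf-16-be").hex(" ").upper(),
        "shift_jis": shift_jis,
        "decimal": cp,
    }


# "A", HIRAGANA LETTER A (U+3042), and an emoji (U+1F600)
for ch in "A\u3042\U0001F600":
    print(describe(ch))
```

For U+3042 this yields `E3 81 82` for UTF-8, `30 42` for UTF-16, and `82 A0` for Shift_JIS, while the emoji falls back to `N/A` in the Shift_JIS column.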

What is the Character Encoding Converter?

Character encoding conversion is a critical skill for developers, content creators, and language professionals working with international text. This free tool instantly converts individual characters or entire text blocks into multiple encoding formats including Unicode (U+XXXX notation), UTF-8 byte sequences, UTF-16 representations, Shift_JIS values, and decimal code points. Whether you're debugging mysterious character display issues, preparing text for legacy systems, or simply learning how different platforms represent the same character, this converter eliminates manual calculation errors and saves time.

How to Use

Simply paste your text or individual characters into the input field and the tool displays conversions across all major encoding formats simultaneously. For single characters, you'll see the Unicode code point (for example, U+0041 for the letter 'A'), the UTF-8 byte sequence in hexadecimal, UTF-16 representation, Shift_JIS value for Japanese systems, and the decimal code point. Copy any format directly from the results with a single click. The tool intelligently handles both basic ASCII characters and complex Unicode including emojis, CJK characters, mathematical symbols, and special punctuation marks.

Use Cases

Web developers debugging character encoding issues in HTML, CSS, or JavaScript files can quickly identify if a character's UTF-8 representation matches their source file encoding. Content creators preparing multilingual documents for different publishing platforms need to verify character codes are compatible with each target system. Database administrators migrating legacy data from Shift_JIS systems to modern UTF-8 databases use this tool to validate that character transformations maintain data integrity. Linguists, font designers, and Unicode researchers studying character sets reference this tool to confirm code points for specific glyphs, ensuring visual and technical consistency across platforms and applications.

Tips & Insights

UTF-8 is the dominant character encoding on the web; it represents every Unicode character with a variable-length sequence of 1 to 4 bytes. Shift_JIS remains a legacy encoding for Japanese text in older systems and some databases. Understanding character encoding helps prevent mojibake, the garbled text that appears when bytes are decoded with the wrong encoding. Unicode's U+ notation is the standard way to refer to a character regardless of language. Emoji and other characters outside the Basic Multilingual Plane take 4 bytes in UTF-8, so verify that older systems accept 4-byte sequences, since some fail to store or display them.
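
To make the "1 to 4 bytes" rule concrete, here is a hand-written UTF-8 encoder for a single code point. It is a teaching sketch only (Python's built-in `str.encode("utf-8")` does the same job); the function name `utf8_encode` is hypothetical.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one Unicode code point as UTF-8 by hand."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a valid Unicode scalar value")
    if cp < 0x80:        # 1 byte: 0xxxxxxx (ASCII)
        return bytes([cp])
    if cp < 0x800:       # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:     # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([
            0xE0 | (cp >> 12),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F),
        ])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([
        0xF0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3F),
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F),
    ])


print(utf8_encode(0x3042).hex(" ").upper())  # E3 81 82
```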

Frequently Asked Questions

What is character encoding?

Character encoding is a set of rules for representing characters as numbers (byte sequences) on a computer. There are various schemes, such as UTF-8, UTF-16, and Shift_JIS, and each converts characters into bytes in a different way.

What is the difference between UTF-8 and UTF-16?

UTF-8 is a variable-length encoding that uses 1 to 4 bytes per character, with ASCII characters taking 1 byte. UTF-16 is a variable-length encoding that uses 2 or 4 bytes: characters in the BMP (Basic Multilingual Plane) take 2 bytes, and all other characters take 4 bytes as a surrogate pair. UTF-8 is the standard on the Web.
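
The 2-or-4-byte behaviour of UTF-16 comes from surrogate pairs. The sketch below computes the UTF-16 code units by hand and compares byte lengths with UTF-8; the helper name `utf16_code_units` is made up for this example.

```python
def utf16_code_units(ch: str) -> list[int]:
    """Return the UTF-16 code units for one character."""
    cp = ord(ch)
    if cp < 0x10000:
        return [cp]                      # BMP: one 16-bit unit
    offset = cp - 0x10000                # 20 bits, split 10 + 10
    high = 0xD800 + (offset >> 10)       # high (lead) surrogate
    low = 0xDC00 + (offset & 0x3FF)      # low (trail) surrogate
    return [high, low]


# "A", HIRAGANA LETTER A (U+3042), and an emoji (U+1F600)
for ch in "A\u3042\U0001F600":
    units = " ".join(f"{u:04X}" for u in utf16_code_units(ch))
    print(
        f"U+{ord(ch):04X}: "
        f"UTF-8 {len(ch.encode('utf-8'))} byte(s), "
        f"UTF-16 {len(ch.encode('utf-16-be'))} byte(s), "
        f"code units {units}"
    )
```

U+1F600 lies outside the BMP, so it becomes the pair `D83D DE00` and occupies 4 bytes in both encodings, while U+3042 needs 3 bytes in UTF-8 but only 2 in UTF-16.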

What is Shift_JIS?

Shift_JIS is one of the character encodings for Japanese, capable of representing JIS X 0201 and JIS X 0208 characters. It was once widely used in Windows and on the Web, but is now being replaced by UTF-8.

What is a Unicode code point?

Unicode code points are unique numbers assigned to each character in the Unicode standard, represented as U+ followed by a hexadecimal number, such as U+0041 (A) or U+3042 (あ). The code point itself is not an encoding, but a character identification number.

What causes garbled text?

Garbled characters occur when the encoding used to write the text is different from the encoding used to read it. For example, a file saved in UTF-8 will be garbled when opened as Shift_JIS. You can check the byte sequence for each encoding with this tool.
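
The UTF-8-read-as-Shift_JIS case can be reproduced in a few lines. This is a minimal sketch; `errors="replace"` is used so that bytes with no valid Shift_JIS interpretation become U+FFFD instead of raising an exception.

```python
original = "\u3042\u3044"  # two hiragana letters, U+3042 and U+3044

# Writer side: the text is saved as UTF-8.
utf8_bytes = original.encode("utf-8")
print(utf8_bytes.hex(" ").upper())  # E3 81 82 E3 81 84

# Reader side (wrong): the same bytes are interpreted as Shift_JIS.
garbled = utf8_bytes.decode("shift_jis", errors="replace")
print(garbled == original)  # False

# Reader side (right): decoding with the encoding used to write.
recovered = utf8_bytes.decode("utf-8")
print(recovered == original)  # True
```

The bytes never change; only the decoding rule does, which is why comparing byte sequences across encodings helps pinpoint where the mismatch happened.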

Can I check the Shift_JIS code with this tool?

Yes. The Shift_JIS code (in hexadecimal) of each entered character is displayed. However, characters that do not exist in Shift_JIS (such as emoji and some kanji) are displayed as "N/A". All conversions are performed in the browser.

How can I convert text with special characters or emoji?

Paste any text, including special characters, emoji, and symbols. The tool will display each character's Unicode code point, UTF-8 byte sequence, UTF-16 encoding, Shift_JIS code (if available), and decimal code point.

What's the difference between decimal and hexadecimal notation for character codes?

Decimal notation (e.g., 65 for 'A') uses base-10, while hexadecimal (U+0041) uses base-16. Unicode code points are typically written in hexadecimal format, but this tool also shows decimal equivalents for reference.
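
The two notations are the same number written in different bases; for example, 0x3042 = 3×4096 + 0×256 + 4×16 + 2 = 12354. A small sketch of the conversion in both directions (the function names are hypothetical):

```python
def to_hex_notation(decimal_code: int) -> str:
    """Format a decimal code point in U+XXXX notation (at least 4 hex digits)."""
    return f"U+{decimal_code:04X}"


def to_decimal(hex_notation: str) -> int:
    """Parse 'U+3042' (or a bare '3042') into a decimal code point."""
    has_prefix = hex_notation.upper().startswith("U+")
    digits = hex_notation[2:] if has_prefix else hex_notation
    return int(digits, 16)


print(to_hex_notation(65))   # U+0041
print(to_decimal("U+3042"))  # 12354
```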

Can I convert in reverse (from code to character)?

Yes, you can paste Unicode code points, HTML entities, or UTF-8 byte sequences and the tool will convert them back to readable characters. Just input the codes in any supported format.
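
As a rough illustration of the reverse direction, the sketch below handles the three input styles mentioned above. The exact input syntax the tool accepts is not documented here, so the parsing rules and function names are assumptions.

```python
import html
import re


def from_code_points(text: str) -> str:
    """Turn 'U+3042 U+0041' into the characters those code points name."""
    hex_values = re.findall(r"[Uu]\+([0-9A-Fa-f]{4,6})", text)
    return "".join(chr(int(h, 16)) for h in hex_values)


def from_utf8_hex(text: str) -> str:
    """Turn 'E3 81 82' into the character those UTF-8 bytes encode."""
    return bytes.fromhex(text).decode("utf-8")


def from_html_entities(text: str) -> str:
    """Turn '&#x3042;' or '&amp;' into the characters they stand for."""
    return html.unescape(text)


print(from_code_points("U+0041 U+0042"))      # AB
print(from_utf8_hex("41 42"))                 # AB
print(from_html_entities("&#x41;&#66;&amp;")) # AB&
```

`from_utf8_hex` raises `UnicodeDecodeError` when the bytes are not valid UTF-8, which is itself a useful signal when investigating garbled text.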

Why do some characters show different results for Shift_JIS encoding?

Shift_JIS is a legacy Japanese encoding with limited character support. Characters outside its character set have no Shift_JIS code, so the tool shows "N/A" for them.

How can I batch process multiple text inputs at once?

You can copy large blocks of text, paste them all at once, and the tool will analyze each character individually, making it easy to check encodings for entire documents.
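
Per-character analysis of a pasted block amounts to iterating over code points. The sketch below counts characters and how many of them need a UTF-16 surrogate pair; whether the tool counts in exactly this way is an assumption, and the function name is hypothetical.

```python
def count_characters(text: str) -> tuple[int, int]:
    """Return (number of code points, number needing a surrogate pair)."""
    code_points = len(text)  # Python strings are sequences of code points
    surrogate_pairs = sum(1 for ch in text if ord(ch) > 0xFFFF)
    return code_points, surrogate_pairs


# "A", U+3042, and U+1F600: three characters, one outside the BMP
chars, pairs = count_characters("A\u3042\U0001F600")
print(chars, pairs)   # 3 1
print(chars + pairs)  # 4 (UTF-16 code units)
```

In environments whose strings are UTF-16 based, such as JavaScript, the reported length is the code-unit count, so a character outside the BMP counts as 2 there.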

What's the maximum text length I can process?

The tool is designed for practical batch conversion, but for very large documents (thousands of characters), performance may vary. You can test with your typical input size to verify speed.