Convert text to UTF-8 bytes or decode UTF-8 byte sequences back to readable text. Supports hex, decimal, binary, and percent-encoded formats.
Enter text or UTF-8 bytes
Type or paste text to encode to UTF-8 bytes, or enter UTF-8 byte values (hex, decimal, binary, or percent-encoded) to decode.
Select format and click Encode or Decode
Choose your byte format (hex, decimal, binary, percent), then click Encode to convert text to bytes or Decode to convert bytes to text.
View analysis and copy result
Enable character analysis to see byte counts per character. Click Copy to copy the output to your clipboard.
The Chinese character 中 (U+4E2D) encodes to 3 bytes in UTF-8: E4 B8 AD (hex) or 228 184 173 (decimal). In binary: 11100100 10111000 10101101. This follows UTF-8's 3-byte pattern, which covers code points U+0800 to U+FFFF and therefore most CJK characters. Similarly, 日 is E6 97 A5, 本 is E6 9C AC, and 語 is E8 AA 9E.
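The 3-byte pattern can be reproduced by hand with a few bit operations. The following is a minimal Python sketch, not this tool's implementation; the function name encode_three_byte is hypothetical, and the result is checked against Python's built-in encoder.

```python
def encode_three_byte(code_point: int) -> bytes:
    """Encode a code point in U+0800..U+FFFF as 1110xxxx 10xxxxxx 10xxxxxx."""
    if not 0x0800 <= code_point <= 0xFFFF:
        raise ValueError("code point is outside the 3-byte range")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points cannot be encoded in UTF-8")
    first = 0b11100000 | (code_point >> 12)                  # top 4 bits
    second = 0b10000000 | ((code_point >> 6) & 0b00111111)   # middle 6 bits
    third = 0b10000000 | (code_point & 0b00111111)           # low 6 bits
    return bytes([first, second, third])


encoded = encode_three_byte(0x4E2D)        # 中
hex_form = encoded.hex(" ").upper()        # "E4 B8 AD"
decimal_form = " ".join(str(b) for b in encoded)  # "228 184 173"

# The hand-rolled result matches the standard library encoder.
assert encoded == "中".encode("utf-8")
```

The same function yields E6 97 A5 for 日 (U+65E5), since every code point in this range is split into 4 + 6 + 6 bits.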
UTF-8 (Unicode Transformation Format - 8-bit) is the dominant character encoding for the web, used by over 98% of websites. It encodes every Unicode character in 1 to 4 bytes: ASCII characters use 1 byte, most European and Middle Eastern scripts use 2 bytes, most Asian characters use 3 bytes, and most emoji use 4 bytes. UTF-8 is backwards-compatible with ASCII: any valid ASCII text is also valid UTF-8, which makes it a practical default for modern software development.
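The byte count depends only on the code point's range. As a rough illustration in Python (the helper name utf8_length is hypothetical), the ranges can be written out explicitly and compared with the length of the encoded bytes:

```python
def utf8_length(char: str) -> int:
    """Return how many bytes UTF-8 needs for a single character."""
    code_point = ord(char)
    if code_point < 0x80:       # U+0000..U+007F: ASCII
        return 1
    if code_point < 0x800:      # U+0080..U+07FF: e.g. Latin accents, Greek, Cyrillic, Hebrew, Arabic
        return 2
    if code_point < 0x10000:    # U+0800..U+FFFF: rest of the Basic Multilingual Plane
        return 3
    return 4                    # U+10000..U+10FFFF: supplementary planes, including most emoji


samples = ["A", "é", "中", "😀"]
lengths = [utf8_length(c) for c in samples]   # [1, 2, 3, 4]

# Cross-check against the real encoder.
assert lengths == [len(c.encode("utf-8")) for c in samples]
```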
Yes, this UTF-8 tool is completely free with no limitations. All encoding and decoding happens locally in your browser - your text is never sent to any server. This makes it safe to process text in any language, including content containing sensitive information.
Enter your text in the input field and click 'Encode to UTF-8'. The tool will convert each character to its UTF-8 byte sequence. You can view the result in any of four formats. For example, 'Hello' appears as 48 65 6C 6C 6F in hexadecimal, 72 101 108 108 111 in decimal, 01001000 01100101 01101100 01101100 01101111 in binary, or %48%65%6C%6C%6F in percent-encoded format. Enable 'Show character analysis' to see byte details for each character.
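For readers who want to reproduce these output formats in a script, here is a minimal Python sketch. The function name format_utf8_bytes and the format labels are assumptions chosen for illustration; they are not taken from the tool itself.

```python
def format_utf8_bytes(text: str, fmt: str) -> str:
    """Encode text as UTF-8 and render the bytes in the requested format."""
    data = text.encode("utf-8")
    if fmt == "hex":
        return " ".join(f"{b:02X}" for b in data)
    if fmt == "decimal":
        return " ".join(str(b) for b in data)
    if fmt == "binary":
        return " ".join(f"{b:08b}" for b in data)
    if fmt == "percent":
        # Escapes every byte, unlike URL encoders that leave safe ASCII as-is.
        return "".join(f"%{b:02X}" for b in data)
    raise ValueError(f"unknown format: {fmt}")


hello_hex = format_utf8_bytes("Hello", "hex")          # "48 65 6C 6C 6F"
hello_percent = format_utf8_bytes("Hello", "percent")  # "%48%65%6C%6C%6F"
```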
Enter your UTF-8 bytes in the input field using the format you have (hex, decimal, binary, or percent-encoded), select the matching format from the dropdown, and click 'Decode from UTF-8'. The tool validates the byte sequence and converts it back to readable text. If the sequence is not valid UTF-8, the tool displays a specific error message instead.
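Decoding with validation can be sketched in Python as follows. This is an illustration under stated assumptions, not the tool's code: the names parse_bytes and decode_utf8 are hypothetical, values are assumed to be whitespace-separated, and percent-encoded input is assumed to escape every byte.

```python
def parse_bytes(raw: str, fmt: str) -> bytes:
    """Turn a string of byte values in the given format into a bytes object."""
    if fmt == "percent":
        tokens = [t for t in raw.strip().split("%") if t]
        base = 16
    else:
        tokens = raw.split()
        base = {"hex": 16, "decimal": 10, "binary": 2}[fmt]
    # int() rejects malformed digits; bytes() rejects values above 255.
    return bytes(int(token, base) for token in tokens)


def decode_utf8(raw: str, fmt: str) -> str:
    """Decode strictly, so malformed sequences raise UnicodeDecodeError."""
    return parse_bytes(raw, fmt).decode("utf-8")


text = decode_utf8("E4 B8 AD", "hex")   # "中"

try:
    decode_utf8("E4 B8", "hex")         # truncated 3-byte sequence
except UnicodeDecodeError as error:
    reason = error.reason               # short description of what went wrong
```

Strict decoding is the design choice that matters here: the default error handler raises on truncated or malformed sequences rather than silently substituting replacement characters.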
UTF-8 encoding shows the actual byte representation of characters (e.g., '世' = E4 B8 96 in hex), while Unicode escape sequences use code point notation (e.g., '世' = \u4E16). UTF-8 is how text is actually stored and transmitted, while escape sequences are a way to represent Unicode in source code. Use our Unicode Escape tool for \uXXXX format conversions.
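The two representations can be compared side by side in Python. This short sketch assumes a character in the Basic Multilingual Plane, where a single \uXXXX escape is enough; characters above U+FFFF need a different escape form and are not covered here.

```python
char = "世"

# UTF-8: the bytes actually stored or transmitted.
utf8_hex = char.encode("utf-8").hex(" ").upper()   # "E4 B8 96"

# Unicode escape: the code point written in source-code notation.
escape = f"\\u{ord(char):04X}"                      # the six characters \u4E16

# Interpreting the escape text recovers the original character.
round_trip = escape.encode("ascii").decode("unicode_escape")
assert round_trip == char
```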