UTF-8 sandbox (decode)
Convert a sequence of UTF-8 bytes into a Unicode code point.
Hex bytes (with or without the 0x prefix), separated by spaces or commas, or with no separator at all.
U+00E9 é
2 bytes
Decimal
233
Bytes
0xC3 0xA9
Binary
11000011
10101001
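The accepted input formats and the binary view above can be sketched in Python. This is only an illustration, not the sandbox's own code: `parse_hex_bytes` and `to_binary` are hypothetical names, and the sketch assumes every byte is written as exactly two hex digits.

```python
import re


def parse_hex_bytes(text: str) -> bytes:
    """Parse input such as '0xC3 0xA9', 'C3,A9' or 'c3a9' into bytes."""
    # Remove every 0x / 0X prefix, then every space or comma separator.
    cleaned = re.sub(r"0[xX]", "", text)
    cleaned = re.sub(r"[\s,]+", "", cleaned)
    # bytes.fromhex raises ValueError on non-hex characters or an odd digit count.
    return bytes.fromhex(cleaned)


def to_binary(data: bytes) -> list[str]:
    """Render each byte as an 8-digit binary string."""
    return [format(b, "08b") for b in data]


data = parse_hex_bytes("0xC3 0xA9")
print(to_binary(data))  # ['11000011', '10101001']
```

All three spellings of the example input (`0xC3 0xA9`, `C3,A9`, `c3a9`) parse to the same two bytes.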
Step-by-step breakdown
- 01
Identify the byte count
First byte
110xxxxx (two 1s, then a 0): 2-byte form.
2 bytes · U+0080 → U+07FF
- 02
Extract the data bits
For each byte, strip the format marker (110 / 1110 / 11110 on the leader, 10 on continuations); what remains are the data bits.
00011 | 101001
- 03
Reassemble the binary
Concatenate the groups to rebuild the code point's binary value (11 data bits: 5 from the leader byte, 6 from the continuation byte).
00011101001
- 04
Convert to a code point
The binary equals 233 in decimal, i.e. U+00E9 in Unicode notation.
U+00E9
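The four steps can be followed in a short Python sketch. The function name `decode_utf8_sequence` is hypothetical, and the sketch makes simplifying assumptions: it decodes exactly one sequence and does not reject overlong encodings or surrogate code points, which a strict decoder would.

```python
def decode_utf8_sequence(data: bytes) -> int:
    """Decode a single UTF-8 sequence (1 to 4 bytes) into a code point."""
    first = data[0]

    # Step 01: the leading bits of the first byte give the byte count.
    # Step 02 (leader): mask off the format marker to keep the data bits.
    if first >> 7 == 0b0:          # 0xxxxxxx: 1 byte (ASCII)
        length, code_point = 1, first
    elif first >> 5 == 0b110:      # 110xxxxx: 2 bytes
        length, code_point = 2, first & 0x1F
    elif first >> 4 == 0b1110:     # 1110xxxx: 3 bytes
        length, code_point = 3, first & 0x0F
    elif first >> 3 == 0b11110:    # 11110xxx: 4 bytes
        length, code_point = 4, first & 0x07
    else:
        raise ValueError(f"invalid leading byte: {first:#04x}")

    if len(data) != length:
        raise ValueError(f"expected {length} bytes, got {len(data)}")

    # Steps 02-03: strip the 10 marker from each continuation byte and
    # append its 6 data bits to the bits collected so far.
    for byte in data[1:]:
        if byte >> 6 != 0b10:
            raise ValueError(f"invalid continuation byte: {byte:#04x}")
        code_point = (code_point << 6) | (byte & 0x3F)

    # Step 04: the accumulated integer is the code point.
    return code_point


cp = decode_utf8_sequence(bytes([0xC3, 0xA9]))
print(cp, f"U+{cp:04X}", chr(cp))  # 233 U+00E9 é
```

For the example bytes, the leader contributes `00011` and the continuation byte `101001`, matching the groups shown in step 02.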