charset.school

UTF-8 sandbox (encode)

Convert a Unicode code point to UTF-8.

Accepts U+XXXX, 0xXX, decimal, or a single character.
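
The page does not say how the sandbox parses its input, so the following is only a minimal Python sketch of one way to accept the four notations listed above. The function name `parse_code_point` is hypothetical. So is the rule that a lone digit such as `5` is read as a decimal number rather than as the character U+0035.

```python
def parse_code_point(text: str) -> int:
    """Turn 'U+00E9', '0xE9', '233', or a single character into a code point."""
    # A lone non-digit character stands for itself.
    # Assumption: a lone ASCII digit is treated as a decimal number instead.
    if len(text) == 1 and text not in "0123456789":
        return ord(text)

    s = text.strip()
    prefix = s[:2].upper()
    if prefix in ("U+", "0X"):
        value = int(s[2:], 16)   # hexadecimal notations
    else:
        value = int(s, 10)       # plain decimal

    # UTF-8 only encodes Unicode scalar values.
    if not 0 <= value <= 0x10FFFF:
        raise ValueError(f"out of Unicode range: {text!r}")
    if 0xD800 <= value <= 0xDFFF:
        raise ValueError(f"surrogate code point: {text!r}")
    return value


for sample in ("U+00E9", "0xE9", "233", "\u00e9"):
    print(sample, "->", parse_code_point(sample))
```

All four sample inputs resolve to the same code point, 233.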

U+00E9 é
2 bytes

Code point (decimal)

233

UTF-8 bytes (hexadecimal)

0xC3 0xA9

UTF-8 bytes (binary)

11000011 10101001
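
These values can be cross-checked against Python's built-in encoder. This short sketch is an independent check, not the sandbox's own code, and it uses only `ord`, `str.encode`, and format specifiers. The variable names are illustrative.

```python
ch = "\u00e9"                      # é, code point U+00E9
encoded = ch.encode("utf-8")       # bytes produced by the standard library

decimal = ord(ch)                                    # code point in decimal
hex_view = " ".join(f"0x{b:02X}" for b in encoded)   # each byte as two hex digits
bin_view = " ".join(f"{b:08b}" for b in encoded)     # each byte as eight bits

print(len(encoded), "bytes")   # 2 bytes
print(decimal)                 # 233
print(hex_view)                # 0xC3 0xA9
print(bin_view)                # 11000011 10101001
```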

Step-by-step breakdown

  1. Pick the UTF-8 form

    U+00E9 lies in the range U+0080 → U+07FF, just beyond ASCII. Code points in this range need up to 11 payload bits, so they use the 2-byte form.

    2 bytes · U+0080 → U+07FF
  2. Convert to binary

    U+00E9 is 11101001 in binary (8 significant bits). Left-pad it with zeros to 11 bits, the payload length of the 2-byte form.

    00011101001
  3. Split into chunks

    Split the 11 bits into the form's payload slots: 5 bits for the first byte, 6 bits for the second.

    00011 | 101001
  4. Insert the markers

    The first byte starts with the marker 110 (3 bits), followed by the 5-bit chunk. The second byte starts with the continuation marker 10 (2 bits), followed by the 6-bit chunk.

    byte 1
    11000011
    byte 2
    10101001
  5. Convert to hexadecimal

    Each byte is written as a two-digit hexadecimal value.

    0xC3 0xA9
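
The five steps above can be sketched in Python as follows. This is a minimal illustration limited to the 2-byte form shown here, not the sandbox's implementation. The function name `encode_two_byte` is hypothetical, and working on bit strings rather than shifts and masks is a choice made so the code mirrors the breakdown.

```python
def encode_two_byte(code_point: int) -> bytes:
    """Encode a code point in U+0080..U+07FF as 2 UTF-8 bytes, step by step."""
    # Step 1: pick the form. Only the 2-byte range is handled in this sketch.
    if not 0x80 <= code_point <= 0x7FF:
        raise ValueError("this sketch only covers U+0080 to U+07FF")

    # Step 2: convert to binary, left-padded to the 11-bit payload length.
    bits = f"{code_point:011b}"

    # Step 3: split into the 5-bit and 6-bit payload chunks.
    head, tail = bits[:5], bits[5:]

    # Step 4: prepend the markers (110 for the lead byte, 10 for continuation).
    byte1 = int("110" + head, 2)
    byte2 = int("10" + tail, 2)

    return bytes([byte1, byte2])


result = encode_two_byte(0xE9)
# Step 5: show each byte as two hex digits.
print(" ".join(f"0x{b:02X}" for b in result))  # 0xC3 0xA9
```

For U+00E9 the result matches `"\u00e9".encode("utf-8")`, which gives a quick way to check the manual steps against the standard library.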

Teaching tool. No tracking, no ads.

Developed by Florent Sorel