charset.school
Decode UTF-32

UTF-32 sandbox (decode)

Convert 4 UTF-32 bytes into a Unicode code point, given the endianness.

Enter the bytes in hex, with or without the 0x prefix. UTF-32 uses exactly 4 bytes per code point.
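As a rough illustration of the input rule above, the sketch below parses four hex bytes with or without the 0x prefix. The function name `parse_hex_bytes` and the whitespace-separated input format are assumptions made for this example, not a description of how the page itself parses its input.

```python
def parse_hex_bytes(text: str) -> bytes:
    """Parse four whitespace-separated hex bytes, each with or without 0x.

    Hypothetical helper: the name and the accepted format are assumptions.
    """
    tokens = text.split()
    if len(tokens) != 4:
        raise ValueError(f"UTF-32 needs exactly 4 bytes, got {len(tokens)}")

    values = []
    for token in tokens:
        # Drop an optional 0x / 0X prefix before converting.
        digits = token[2:] if token.lower().startswith("0x") else token
        value = int(digits, 16)  # raises ValueError on non-hex input
        if not 0 <= value <= 0xFF:
            raise ValueError(f"{token!r} does not fit in one byte")
        values.append(value)
    return bytes(values)


raw = parse_hex_bytes("0x89 0xF3 0x01 0x00")
print(raw.hex(" "))  # 89 f3 01 00
```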

Endianness

Little Endian (LE)

U+1F389 🎉
4 bytes

Decimal

127881

Bytes

0x89 0xF3 0x01 0x00

Binary

10001001
11110011
00000001
00000000
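The decimal, byte, and binary views above can be reproduced from the code point with a few lines of Python. This is only a sketch that goes in the encoding direction as a cross-check; the variable names are illustrative.

```python
code_point = 0x1F389  # the code point shown above

# Serialize as 4 bytes, least significant byte first (Little Endian).
le_bytes = code_point.to_bytes(4, "little")

decimal_view = str(code_point)
hex_view = " ".join(f"0x{b:02X}" for b in le_bytes)
binary_view = [f"{b:08b}" for b in le_bytes]

print(decimal_view)  # 127881
print(hex_view)      # 0x89 0xF3 0x01 0x00
for line in binary_view:
    print(line)
```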

Step-by-step breakdown

  1. Given endianness

    The 4 bytes are stored low-order byte first (Little Endian). Reverse the byte order before assembling the number: 89 F3 01 00 becomes 00 01 F3 89.

    Little Endian (LE)
  2. Reassemble the binary

    Concatenate the 4 bytes, now in Big Endian order, to rebuild the full 32-bit binary value.
    In UTF-32 this value is the code point itself: there are no marker bits to strip and no surrogates to recombine.

    00000000000000011111001110001001
  3. Convert to a code point

    The binary equals 127881 in decimal (0x1F389 in hexadecimal), written U+1F389 in Unicode notation.

    U+1F389
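
The three steps above can be sketched in Python as follows. The function name `decode_utf32` and its `endianness` parameter are assumptions for illustration. The final range check (rejecting values above U+10FFFF and surrogate code points) is an addition that the walkthrough does not cover.

```python
def decode_utf32(raw: bytes, endianness: str = "little") -> int:
    """Decode one UTF-32 code unit (4 bytes) into a code point.

    Hypothetical helper mirroring the three steps of the walkthrough.
    """
    if len(raw) != 4:
        raise ValueError("UTF-32 takes exactly 4 bytes per code point")
    if endianness not in ("little", "big"):
        raise ValueError("endianness must be 'little' or 'big'")

    # Step 1: put the bytes in Big Endian order (reverse if Little Endian).
    ordered = raw[::-1] if endianness == "little" else raw

    # Step 2: concatenate the 4 bytes into one 32-bit binary string.
    bits = "".join(f"{b:08b}" for b in ordered)

    # Step 3: the binary value is the code point.
    code_point = int(bits, 2)

    # Extra check, not part of the walkthrough: reject invalid values.
    if code_point > 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError(f"0x{code_point:X} is not a valid Unicode scalar value")
    return code_point


cp = decode_utf32(bytes([0x89, 0xF3, 0x01, 0x00]), "little")
print(cp, f"U+{cp:04X}", chr(cp))  # 127881 U+1F389 🎉
```

In practice the same result comes from `int.from_bytes(raw, "little")` or `raw.decode("utf-32-le")`; the explicit version is shown only because it follows the steps one by one.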
Teaching tool. No tracking, no ads.

Developed by Florent Sorel