UTF-16 sandbox (encode)
Convert a Unicode code point into UTF-16 bytes, with a chosen endianness.
Accepts U+XXXX, 0xXX, decimal, or a single character.
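The accepted input forms can be normalized to one integer code point before any encoding happens. The following is a minimal sketch, not the sandbox's actual parser: the function name `parse_code_point` is hypothetical, and the precedence between forms (an all-digit string is read as decimal rather than as a character) is an assumption.

```python
def parse_code_point(text: str) -> int:
    """Turn 'U+1F389', '0x1F389', '127881', or a single character into a code point.

    Assumption: an all-digit input is treated as decimal, so the character '7'
    cannot be entered directly and would have to be given as U+0037.
    """
    s = text.strip()
    prefix = s[:2].upper()
    if prefix in ("U+", "0X"):
        # Both prefixed notations carry a hexadecimal value.
        value = int(s[2:], 16)
    elif s.isascii() and s.isdigit():
        value = int(s, 10)
    elif len(text) == 1:
        # A single character: take its code point as-is.
        value = ord(text)
    else:
        raise ValueError(f"unrecognized code point notation: {text!r}")

    # Unicode code points stop at U+10FFFF.
    if not 0 <= value <= 0x10FFFF:
        raise ValueError(f"outside the Unicode range: {value:#x}")
    return value
```

With this sketch, the four notations for the example below (`U+1F389`, `0x1F389`, `127881`, and the character itself) all resolve to the same integer.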
Decimal
127881
Hexadecimal
0x3C 0xD8 0x89 0xDF
Binary
Step-by-step breakdown
- 01
Pick the endianness
The low-order byte of each code unit comes first (Little Endian).
Little Endian (LE)
- 02
Pick the UTF-16 form
Range U+10000 → U+10FFFF: beyond the Basic Multilingual Plane (BMP), in the supplementary planes (emoji, historic scripts, rare CJK...). Encoded as a surrogate pair: 2 code units = 4 bytes.
2 code units · surrogate pair
- 03
Convert to binary
Subtract 0x10000 from the code point (0x1F389 − 0x10000 = 0xF389) and keep the remaining 20 bits.
00001111001110001001
- 04
Split for the surrogates
Split the 20 bits into 2 packets of 10 bits.
The left 10 bits (high-order) form the high surrogate, the right 10 bits (low-order) form the low surrogate.
Each 10-bit packet represents an integer between 0 and 1,023 (10 bits = 2¹⁰ = 1,024 values). Add it to the surrogate base: 0xD800 for the high, 0xDC00 for the low. Here, 0xD800 + 0x03C = 0xD83C and 0xDC00 + 0x389 = 0xDF89.
0000111100 | 1110001001
- 05
Convert to hexadecimal
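The 2 code units (high + low surrogate) yield 4 bytes; the two bytes of each code unit are written in the chosen byte order. In little endian, 0xD83C becomes 0x3C 0xD8 and 0xDF89 becomes 0x89 0xDF.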
0x3C 0xD8 0x89 0xDF
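The five steps above can be sketched as one short function. This is an illustrative version under stated assumptions, not the sandbox's own code: the name `utf16_encode` and its `little_endian` parameter are hypothetical. The single-code-unit branch for code points below U+10000 is not part of the walkthrough above and is included only so the function covers the whole Unicode range.

```python
def utf16_encode(code_point: int, little_endian: bool = True) -> bytes:
    """Encode one Unicode code point as UTF-16 bytes (no byte order mark)."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError(f"outside the Unicode range: {code_point:#x}")
    if 0xD800 <= code_point <= 0xDFFF:
        # Surrogate values are reserved for pairs and cannot be encoded alone.
        raise ValueError(f"surrogate code point: {code_point:#x}")

    if code_point < 0x10000:
        # BMP: the code point is its own single 16-bit code unit.
        units = [code_point]
    else:
        # Supplementary planes: subtract 0x10000, leaving a 20-bit value.
        offset = code_point - 0x10000
        high = 0xD800 + (offset >> 10)    # upper 10 bits -> high surrogate
        low = 0xDC00 + (offset & 0x3FF)   # lower 10 bits -> low surrogate
        units = [high, low]

    # Each code unit becomes 2 bytes, ordered by the chosen endianness.
    byteorder = "little" if little_endian else "big"
    return b"".join(unit.to_bytes(2, byteorder) for unit in units)


# The worked example: U+1F389 (decimal 127881), little endian.
encoded = utf16_encode(127881)
print(" ".join(f"0x{b:02X}" for b in encoded))
```

One way to check the sketch is to compare its result with Python's built-in codecs, for example `chr(127881).encode("utf-16-le")` for little endian and `"utf-16-be"` for big endian.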