ISO 10646 defines several encodings for the Universal Character Set. A common one is UCS-2, which uses two bytes for each character. This permits every code point in the BMP to be represented by two bytes. UCS-2 itself cannot represent code points outside the BMP; its extension UTF-16 represents each of them by four bytes, i.e. a pair of two-byte units known as a surrogate pair.
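The two-byte and four-byte cases can be seen in a short sketch. Python's standard codecs do not include one named UCS-2, so this example assumes the `utf-16-be` codec as a stand-in: it produces the same bytes as big-endian UCS-2 for code points in the BMP, and a pair of two-byte units for code points outside it. The helper name `two_byte_units` is made up for illustration.

```python
def two_byte_units(ch):
    """Encode one character as big-endian UTF-16 and split it into 2-byte units (hex)."""
    data = ch.encode("utf-16-be")
    return [data[i:i + 2].hex() for i in range(0, len(data), 2)]

# U+20AC (EURO SIGN) lies inside the BMP: one two-byte unit.
bmp = two_byte_units("\u20ac")

# U+1F600 lies outside the BMP: a pair of two-byte units (four bytes in total).
outside_bmp = two_byte_units("\U0001F600")

print(bmp)
print(outside_bmp)
```

The first character occupies a single two-byte unit, while the second needs two units, which is why code that assumes "one character equals two bytes" breaks on text outside the BMP.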
Another encoding defined is UCS-4, which uses four bytes for each character. It can represent every code point in the character set, including those outside the BMP. Because every character is encoded in the same number of bytes, whether or not it lies in the BMP, text is simpler to manipulate than in a two-byte encoding that needs a pair of units for characters outside the BMP. The cost is storage: BMP text takes twice as many bytes in UCS-4 as in UCS-2.
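The fixed width and the storage cost can both be checked with a small sketch. It assumes Python's `utf-32-be` codec as a stand-in for big-endian UCS-4 (the two agree for valid Unicode code points) and `utf-16-be` for the two-byte encoding; the helper `code_point_at` is a hypothetical name used only here.

```python
text = "A\u20ac\U0001F600"  # U+0041, U+20AC, and U+1F600 (outside the BMP)

# Every character takes exactly four bytes, inside or outside the BMP.
four_byte = text.encode("utf-32-be")
sizes = [len(ch.encode("utf-32-be")) for ch in text]

def code_point_at(data, index):
    """Fixed width allows direct indexing: character i starts at byte 4 * i."""
    return int.from_bytes(data[4 * index:4 * index + 4], "big")

# Storage comparison for text that lies entirely in the BMP.
bmp_text = "hello"
two_byte_len = len(bmp_text.encode("utf-16-be"))
four_byte_len = len(bmp_text.encode("utf-32-be"))

print(sizes)
print(hex(code_point_at(four_byte, 2)))
print(two_byte_len, four_byte_len)
```

Finding the n-th character is a simple multiplication here, whereas a two-byte encoding with pairs would require scanning from the start of the text. The price is that the five-character BMP string doubles in size.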