2.2.3 [ISO10646] Section D.7, Incorrect sequences of octets: Interpretation by receiving devices
The specification states:
According to D.2 an octet in the range 00 to 7F or C0 to FB is the first octet of a UTF-8 sequence, and is followed by the appropriate number (from 0 to 5) of continuing octets in the range 80 to BF. Furthermore, octets whose value is FE or FF are not used; thus they are invalid in UTF-8.

If a CC-data-element includes either:
* a first octet that is not immediately followed by the correct number of continuing octets, or
* one or more continuing octets that are not required to complete a sequence of first and continuing octets, or
* an invalid octet,
then according to D.2 such a sequence of octets is not in conformance with the requirements of UTF-8. It is known as a malformed sequence.

If a receiving device that has adopted the UTF-8 form receives a malformed sequence, because of error conditions either:
* in an originating device, or
* in the interchange between an originating and a receiving device, or
* in the receiving device itself,
then it shall interpret that malformed sequence in the same way that it interprets a character that is outside the adopted subset that has been identified for the device (see sub-clause 2.3c).
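The three malformed-sequence conditions quoted above can be illustrated with a minimal Python sketch. The function names `expected_continuations` and `find_malformed` are hypothetical. The first-octet table is an assumption taken from the historical six-octet UTF-8 form (C0-DF, E0-EF, F0-F7, F8-FB, FC-FD), not from the quoted text, and the sketch checks structure only: it does not reject overlong encodings or surrogate code points, and it does not describe how any particular receiving device behaves.

```python
def expected_continuations(octet):
    """Return how many continuing octets should follow a first octet.

    Returns None when the octet cannot start a sequence. The ranges are an
    assumption based on the historical six-octet UTF-8 form.
    """
    if octet <= 0x7F:
        return 0
    if 0xC0 <= octet <= 0xDF:
        return 1
    if 0xE0 <= octet <= 0xEF:
        return 2
    if 0xF0 <= octet <= 0xF7:
        return 3
    if 0xF8 <= octet <= 0xFB:
        return 4
    if 0xFC <= octet <= 0xFD:
        return 5
    return None  # 80-BF is a continuing octet; FE and FF are not used


def find_malformed(data):
    """Return a list of (offset, reason) pairs for structurally malformed octets."""
    problems = []
    i = 0
    while i < len(data):
        octet = data[i]
        need = expected_continuations(octet)
        if need is None:
            # Either a continuing octet with no sequence to complete,
            # or one of the octets that UTF-8 never uses.
            reason = "invalid octet" if octet >= 0xFE else "unexpected continuing octet"
            problems.append((i, reason))
            i += 1
            continue
        # Consume at most `need` continuing octets in the range 80 to BF.
        j = i + 1
        while j < len(data) and j - i <= need and 0x80 <= data[j] <= 0xBF:
            j += 1
        if j - i - 1 < need:
            problems.append((i, "incomplete sequence"))
        i = j
    return problems
```

A call such as `find_malformed(b"A\x80")` reports the stray continuing octet at offset 1, while well-formed input such as `"€".encode("utf-8")` yields an empty list.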
All Document Modes (All Versions)
Incorrect octets are replaced with the character