2.1.3.3 Encoding Plain Text into RTF

The translation between plain text and RTF is not specified by this algorithm and is implementation dependent. Implementers MUST produce a valid RTF document, as specified by [MSFT-RTF]. Implementers MUST emit a FROMTEXT control word in the RTF header, after the \rtf1 control word, to indicate that RTF was produced from plain text. Implementers SHOULD specify a default code page for text runs in RTF by using the \ansicpgN control word, as specified in [MSFT-RTF].
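The header requirements above can be illustrated with a minimal sketch. It assumes that the FROMTEXT control word is written as \fromtext, that the body passed in is already valid RTF text, and that code page 1252 is an acceptable default. The function name build_rtf_document and its parameters are hypothetical and are not defined by this algorithm.

```python
def build_rtf_document(body: str, ansi_code_page: int = 1252) -> str:
    """Wrap already-escaped RTF body text in a minimal document.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # \rtf1 comes first, then \ansicpgN declares the default code page,
    # then \fromtext marks the document as produced from plain text.
    # The trailing space terminates the last control word.
    header = "{\\rtf1\\ansi\\ansicpg%d\\fromtext " % ansi_code_page
    # The closing brace ends the single top-level group.
    return header + body + "}"


if __name__ == "__main__":
    print(build_rtf_document("Hello"))
```

This sketch omits the font table, which the following paragraph describes as optional. An implementation that emits one would place it in the header after these control words.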

Implementers can emit a font table to define fonts used in RTF. Implementers SHOULD specify charset information for each font when necessary, as specified in [MSFT-RTF].

Implementers MUST NOT use HTMLTAG destination groups or the FROMHTML control word in RTF content marked with the FROMTEXT control word. All textual content MUST be represented directly in RTF. Implementers SHOULD produce text in a code page that corresponds to the current font for each text run, or in a default RTF code page if no current font is selected for a text run.

Any character that cannot be represented in the selected code page SHOULD be encoded by using the \uN control word. Any resulting character that is not allowed or has a special meaning in RTF syntax MUST be escaped, as specified in [MSFT-RTF]. Any line-ending character sequence (such as CRLF, CR, or LF) MUST be converted to a \par or \line control word. Implementers can add other formatting control words that have no textual representation (for example, to improve the presentation quality of the resulting RTF).
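One possible way to encode the body text under these rules is sketched below. It is not a definitive implementation. It assumes a single-byte default code page (the Python codec "cp1252" stands in for \ansicpg1252), always emits \par for line endings, and relies on the RTF default of one fallback character after each \uN control word. The function name encode_plain_text and its codec parameter are hypothetical.

```python
def encode_plain_text(text: str, codec: str = "cp1252") -> str:
    """Encode plain text as RTF body text (hypothetical helper)."""
    out = []
    # Normalize CRLF and lone CR to LF so each line ending yields one \par.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    for ch in text:
        if ch == "\n":
            out.append("\\par ")
        elif ch in "\\{}":
            # Backslash and braces have special meaning in RTF syntax.
            out.append("\\" + ch)
        elif ch == "\t":
            out.append("\\tab ")
        elif " " <= ch <= "~":
            # Printable ASCII is written directly.
            out.append(ch)
        else:
            try:
                data = ch.encode(codec)
            except UnicodeEncodeError:
                data = b""
            if len(data) == 1:
                # Representable in the code page: hexadecimal escape.
                out.append("\\'%02x" % data[0])
            else:
                # Not representable: emit one \uN per UTF-16 code unit.
                # N is a signed 16-bit value; "?" is the fallback character.
                units = ch.encode("utf-16-le", "surrogatepass")
                for i in range(0, len(units), 2):
                    n = int.from_bytes(units[i:i + 2], "little", signed=True)
                    out.append("\\u%d?" % n)
    return "".join(out)


if __name__ == "__main__":
    print(encode_plain_text("a{b}\r\nc\u20ac\u4e2d"))
```

Characters outside the Basic Multilingual Plane are written here as two \uN control words, one per UTF-16 surrogate. An implementation that tracks the current font would select the codec for each text run instead of using a single default.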