Conventions of an RTF Reader


The reader of an RTF stream is concerned with the following:

  • Separating control information from plain text.
  • Acting on control information.
  • Collecting and properly inserting text into the document, as directed by the current group state.

Acting on control information is designed to be a relatively simple process. Some control information simply contributes special characters to the plain text stream. Other information serves to change the program state, which includes properties of the document as a whole, or to change any of a collection of group states, which apply to parts of the document.

As previously mentioned, a group state can specify the following:

  • The destination, or part of the document that the plain text is constructing.
  • Character-formatting properties, such as bold or italic.
  • Paragraph-formatting properties, such as justified or centered.
  • Section-formatting properties, such as the number of columns.
  • Table-formatting properties, which define the number of cells and dimensions of a table row.
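
As an illustration only, the properties listed above could be gathered into a single record that is copied when a group is entered. The class name `GroupState` and every field name below are assumptions made for this sketch; none of them come from the RTF Specification.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class GroupState:
    """Hypothetical snapshot of the properties a group state can carry."""
    destination: str = "body"   # part of the document receiving plain text
    bold: bool = False          # character formatting
    italic: bool = False        # character formatting
    alignment: str = "left"     # paragraph formatting
    columns: int = 1            # section formatting
    cell_edges: tuple = ()      # table formatting: right edge of each cell


# Changing a property inside a group yields a new state object;
# the enclosing group's state is left untouched.
outer = GroupState()
inner = replace(outer, bold=True, alignment="center")
```

Keeping the record immutable is one possible design choice: a state saved on the stack can then never be altered by changes made inside a nested group.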

In practice, an RTF reader will evaluate each character it reads in sequence as follows:

  • If the character is an opening brace ({), the reader saves its current state on the stack. If the character is a closing brace (}), the reader restores the most recently saved state from the stack, which becomes the current state.
  • If the character is a backslash (\), the reader collects the control word or control symbol and its parameter, if any, and looks it up in a table that maps control words and control symbols to actions. It then carries out the action prescribed in the table. (The possible actions are discussed below.) The read pointer is left after the delimiter when the delimiter is a space, which is part of the control word, and before the delimiter when it is any other character.
  • If the character is anything other than an opening brace ({), closing brace (}), or backslash (\), the reader assumes that the character is plain text and writes the character to the current destination using the current formatting properties.
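
The per-character procedure above can be sketched as a short loop. This is a minimal sketch and not a complete reader: it tracks a single property (bold), knows only the control words \b and \par, and ignores everything else. The names `read_rtf`, `is_letter`, and `emit`, and the representation of the output as (text, bold) runs, are assumptions of the example.

```python
DIGITS = "0123456789"


def is_letter(ch):
    """RTF control words consist of ASCII letters only."""
    return ch.isascii() and ch.isalpha()


def read_rtf(source):
    """Return a list of (text, bold) runs found in `source`."""
    state = {"bold": False}   # current group state (one property only)
    stack = []                # states saved at each opening brace
    runs = []                 # collected [text, bold] runs

    def emit(text):
        # Extend the last run if the formatting has not changed.
        if runs and runs[-1][1] == state["bold"]:
            runs[-1][0] += text
        else:
            runs.append([text, state["bold"]])

    i, n = 0, len(source)
    while i < n:
        ch = source[i]
        if ch == "{":
            stack.append(dict(state))      # save a copy of the current state
            i += 1
        elif ch == "}":
            if stack:
                state = stack.pop()        # restore the state saved at "{"
            i += 1
        elif ch == "\\":
            i += 1
            if i < n and not is_letter(source[i]):
                # Control symbol: exactly one non-letter character.
                if source[i] in "{}\\":
                    emit(source[i])        # escaped literal character
                i += 1
                continue
            start = i
            while i < n and is_letter(source[i]):
                i += 1
            word = source[start:i]
            start = i
            if i + 1 < n and source[i] == "-" and source[i + 1] in DIGITS:
                i += 1                     # sign of a negative parameter
            while i < n and source[i] in DIGITS:
                i += 1
            param = source[start:i]        # "" when no parameter is given
            if i < n and source[i] == " ":
                i += 1                     # a space delimiter is consumed
            if word == "b":
                state["bold"] = param != "0"
            elif word == "par":
                emit("\n")
            # Any other control word is unknown here and is ignored.
        else:
            emit(ch)                       # plain text
            i += 1
    return [(text, bold) for text, bold in runs]
```

For example, `read_rtf(r"{\rtf1 Hello {\b bold} world}")` yields three runs, of which only the middle one is bold, because the closing brace of the inner group restores the state saved at its opening brace.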

If the RTF reader cannot find a control word or control symbol in the look-up table described above, it should ignore that control word or control symbol. If the unrecognized control word or control symbol immediately follows an opening brace ({), it is part of a group: the reader saves the current state on the stack as usual but makes no state change, and when it encounters the matching closing brace (}) it restores the saved state from the stack.

If the \* control symbol precedes a control word that the reader does not recognize, the control word defines a destination group, and the \* was itself preceded by an opening brace ({). In this case the RTF reader should discard all text up to and including the closing brace (}) that closes the group.

All RTF readers must recognize all destinations defined in the March 1987 RTF Specification. A reader may skip past such a group, but it is not allowed to simply discard the control word. Destinations defined since March 1987 are marked with the \* control symbol.

**Note** All RTF readers must implement the \* control symbol so that they can read RTF files written by newer RTF writers.
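
One way to honor the \* control symbol is to drop the whole group whenever the destination that follows it is not recognized. The sketch below works under simplifying assumptions: it operates on the raw text before any other parsing, it does not handle binary data (which could contain unbalanced braces), and the names `skip_group` and `strip_unknown_destinations` are invented for this example.

```python
def skip_group(source, start):
    """Return the index just past the "}" that closes the group whose
    opening brace sits immediately before `start`."""
    depth = 1
    i = start
    while i < len(source):
        ch = source[i]
        if ch == "\\":
            i += 2              # skip the backslash and the next character
            continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return i + 1
        i += 1
    raise ValueError("unbalanced group")


def strip_unknown_destinations(source, known=frozenset()):
    """Remove every {\\*\\word ...} group whose word is not in `known`."""
    out = []
    i = 0
    while i < len(source):
        if source[i] == "\\":
            out.append(source[i:i + 2])    # keep escapes such as \{ intact
            i += 2
        elif source.startswith("{\\*\\", i):
            j = k = i + 4                  # first letter of the control word
            while k < len(source) and source[k].isascii() and source[k].isalpha():
                k += 1
            if source[j:k] in known:
                out.append("{")            # recognized: keep the group
                i += 1
            else:
                i = skip_group(source, i + 1)   # unrecognized: discard it
        else:
            out.append(source[i])
            i += 1
    return "".join(out)
```

With no known destinations, `strip_unknown_destinations(r"{\rtf1 a{\*\newdest x{y}z}b}")` returns `{\rtf1 ab}`: the nested group inside the ignorable destination is discarded along with it.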

For control words or control symbols that the RTF reader can find in the look-up table, the possible actions are as follows.

| Action | Description |
| --- | --- |
| Change Destination | The RTF reader changes the destination to the destination described in the table entry. Destination changes are legal only immediately after an opening brace ({). (Other restrictions may also apply; for example, footnotes cannot be nested.) Many destination changes imply that the current property settings will be reset to their default settings. Examples of control words that change destination are \footnote, \header, \footer, \pict, \info, \fonttbl, \stylesheet, and \colortbl. This RTF Specification identifies all destination control words where they appear in control-word tables. |
| Change Formatting Property | The RTF reader changes the property as described in the table entry. The entry will specify whether a parameter is required. The "Appendix C: Index of RTF Control Words" section at the end of this RTF Specification also specifies which control words require parameters. If a parameter is needed and not specified, then a default value will be used. The default value used depends on the control word. If the control word does not specify a default, then all RTF readers should assume a default of 0. |
| Insert Special Character | The reader inserts into the document the character code or codes described in the table entry. |
| Insert Special Character and Perform Action | The reader inserts into the document the character code or codes described in the table entry and performs whatever other action the entry specifies. For example, when Microsoft Word interprets \par, a paragraph mark is inserted in the document and special code is run to record the paragraph properties belonging to that paragraph mark. |
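
The four kinds of action can be modeled as a small look-up table keyed by control word. The following is a hedged sketch rather than the layout of any real reader: `Action`, `Entry`, `TABLE`, and `apply` are hypothetical names, the table holds only a handful of entries, and modeling a bare \b as the value 1 is a simplification chosen for this example. The defaults shown for \fs (24) and \li (0) are believed to match the values given in the control-word tables.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Action(Enum):
    DESTINATION = auto()              # Change Destination
    PROPERTY = auto()                 # Change Formatting Property
    SPECIAL_CHAR = auto()             # Insert Special Character
    SPECIAL_CHAR_AND_ACTION = auto()  # Insert Special Character and Perform Action


@dataclass(frozen=True)
class Entry:
    action: Action
    target: str       # destination name, property name, or character to insert
    default: int = 0  # used when a parameter is needed but not specified


TABLE = {
    "footnote": Entry(Action.DESTINATION, "footnote"),
    "b": Entry(Action.PROPERTY, "bold", default=1),
    "fs": Entry(Action.PROPERTY, "font_size", default=24),
    "li": Entry(Action.PROPERTY, "left_indent"),   # no explicit default: 0
    "tab": Entry(Action.SPECIAL_CHAR, "\t"),
    "par": Entry(Action.SPECIAL_CHAR_AND_ACTION, "\n"),
}


def apply(word, param, state, text, paragraphs):
    """Carry out the action for `word`; `param` is an int or None."""
    entry = TABLE.get(word)
    if entry is None:
        return                        # unknown control words are ignored
    if entry.action is Action.DESTINATION:
        state["destination"] = entry.target
    elif entry.action is Action.PROPERTY:
        state[entry.target] = entry.default if param is None else param
    elif entry.action is Action.SPECIAL_CHAR:
        text.append(entry.target)
    else:
        text.append(entry.target)
        # Extra action: record the properties of the finished paragraph.
        paragraphs.append({"left_indent": state.get("left_indent", 0)})
```

A caller would pass the control word and its parsed parameter, for example `apply("fs", None, state, text, paragraphs)`, which stores the default font size because no parameter was supplied.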