2.1.3.1.4.2 CONTENT HTML Fragment

The CONTENT HTML fragment in an HTMLTAG destination group contains parts of original HTML markup or other text that are not duplicated or significantly transformed in RTF content, such as HTML tags, text that might include HTML character references, and HTML comments.<6>

Some text in the CONTENT HTML fragment might need to be escaped or converted to RTF control words to produce valid RTF. The following table specifies the valid RTF escape tokens and control words that can be used in the CONTENT HTML fragment. A de-encapsulating RTF reader MAY<7> fail to extract the original HTML when other RTF control words are included in the CONTENT HTML fragment.

RTF escape tokens and control words  Corresponding HTML text
-----------------------------------  ------------------------------------------------------
\par                                 %x0D.0A (OCTET sequence CRLF)
\tab                                 %x09 (OCTET form for the horizontal tab character)
\{                                   %x7B (OCTET form for {)
\}                                   %x7D (OCTET form for })
\\                                   %x5C (OCTET form for reverse solidus '\')
\lquote                              "&lsquo;" (Unicode value U+2018)
\rquote                              "&rsquo;" (Unicode value U+2019)
\ldblquote                           "&ldquo;" (Unicode value U+201C)
\rdblquote                           "&rdquo;" (Unicode value U+201D)
\bullet                              "&bull;" (Unicode value U+2022)
\endash                              "&ndash;" (Unicode value U+2013)
\emdash                              "&mdash;" (Unicode value U+2014)
\~                                   "&nbsp;" (non-breaking space)
\_                                   "&shy;" (&#173; soft hyphen)
\'HH                                 %xHH (OCTET with the hexadecimal value of HH)
\u[-]NNNNN                           "&#xHHHH;" where:
                                     • NNNNN is a positive integer expressed in decimal digits
                                     • -NNNNN is a negative integer expressed in decimal digits
                                     • HHHH is the hexadecimal equivalent of NNNNN or -NNNNN
\uc                                  No visual representation in HTML.
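
The mapping in the preceding table can be illustrated with a small decoder. The following Python sketch is not part of this specification and is not a complete de-encapsulating reader: the function name `decode_content`, the `codepage` parameter, and its `cp1252` default are hypothetical choices made for illustration. It assumes that the CONTENT HTML fragment has already been isolated as a string. It also applies general RTF lexing conventions that this section does not restate: a single space after a control word is treated as a delimiter, raw CR and LF characters in the RTF source are ignored, a negative \u value is mapped to a 16-bit code unit by adding 65536, and the \uc parameter gives the number of fallback characters to skip after each \u. Raising an error for tokens outside the table is one possible reading of the MAY behavior described above, not a requirement.

```python
import re

# Control words from the table, mapped to the corresponding HTML text.
WORDS = {
    "par": "\r\n", "tab": "\t",
    "lquote": "&lsquo;", "rquote": "&rsquo;",
    "ldblquote": "&ldquo;", "rdblquote": "&rdquo;",
    "bullet": "&bull;", "endash": "&ndash;", "emdash": "&mdash;",
}
# Escape tokens (control symbols) from the table.
SYMBOLS = {"{": "{", "}": "}", "\\": "\\", "~": "&nbsp;", "_": "&shy;"}

TOKEN = re.compile(
    r"\\'(?P<hex>[0-9A-Fa-f]{2})"                # \'HH
    r"|\\(?P<word>[A-Za-z]+)(?P<num>-?\d+)? ?"   # control word, parameter, delimiter space
    r"|\\(?P<sym>[^A-Za-z])"                     # control symbol such as \{ or \~
    r"|(?P<text>[^\\])"                          # one literal character
)


def decode_content(fragment: str, codepage: str = "cp1252") -> str:
    """Convert the RTF text of a CONTENT HTML fragment to HTML text (sketch)."""
    out = []
    pending = bytearray()  # octets from \'HH, decoded together for multi-byte code pages
    uc = 1                 # fallback characters after each \u (assumed RTF default)
    skip = 0               # fallback characters still to be dropped

    def flush():
        if pending:
            out.append(pending.decode(codepage, errors="replace"))
            pending.clear()

    pos = 0
    while pos < len(fragment):
        m = TOKEN.match(fragment, pos)
        if m is None:
            raise ValueError(f"malformed RTF at offset {pos}")
        pos = m.end()

        if m["hex"] is not None:
            if skip:
                skip -= 1
            else:
                pending.append(int(m["hex"], 16))
        elif m["text"] is not None:
            ch = m["text"]
            if ch in "\r\n":
                continue   # raw line breaks in the RTF source carry no content
            if skip:
                skip -= 1
            else:
                flush()
                out.append(ch)
        elif m["sym"] is not None:
            flush()
            if m["sym"] not in SYMBOLS:
                raise ValueError(f"escape token not in the table: \\{m['sym']}")
            if skip:
                skip -= 1
            else:
                out.append(SYMBOLS[m["sym"]])
        else:
            flush()
            word, num = m["word"], m["num"]
            if word == "u" and num is not None:
                code = int(num) % 0x10000   # negative values wrap to 16 bits
                out.append(f"&#x{code:04X};")
                skip = uc
            elif word == "uc" and num is not None:
                uc = int(num)               # no HTML output for \uc
            elif word in WORDS and num is None:
                out.append(WORDS[word])
            else:
                raise ValueError(f"control word not in the table: \\{word}")
    flush()
    return "".join(out)


sample = r"<p title=\{x\}>\u8364?\~5\par "
print(repr(decode_content(sample)))  # '<p title={x}>&#x20AC;&nbsp;5\r\n'
```

In the sample, the "?" after \u8364 is the fallback character and is dropped, so only the character reference "&#x20AC;" reaches the HTML output. A production reader would take the code page from the RTF header rather than from a parameter default.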