Working with a Shape's Text in DatadiagramML

Office 2010

Several groups of elements are involved in working with the text of a shape: the elements that describe characteristics of the text, the elements that mark the beginning and end of text runs, and an element that contains the text itself.

The following table gives a brief description of these elements and their relationships.

Element     Description

<Shape>     Opening tag for the Shape element.
<Text>      Opening tag for the shape's text.
<cp/>       Marks character runs.
<pp/>       Marks paragraph runs.
<tp/>       Marks tab runs.
<fld/>      Marks Field position.
</Text>     Closing tag for the shape's text.
<Char/>     Contains character properties.
<Para/>     Contains paragraph properties.
<Tabs/>     Contains tab properties.
<Field/>    Contains the shape's text field properties.
</Shape>    Closing tag for the Shape element.

For example, consider the following shape. It has text that consists of two character runs, one in bold and the other in italic.

<media mediatype="gif" filename="XML_04_ZA07645870.gif" assetid="ZA07645870" lcid=" " alttext="Text that consists of two character runs, one in bold and the other in italic"/>

The portion of the shape's sheet that describes character text runs is called the Character section. The Character rows for this shape as viewed in the ShapeSheet window are as follows:

<media assetid="ZA01108249" lcid=" " filename="XML_007_ZA01108249.gif" mediatype="gif" alttext="Character rows from the ShapeSheet." alttextsource="ua"></media>

Note that the column on the left indicates the length of each run, 4 characters and 7 characters, respectively. The only cells that vary between the two rows are the Style cell values. The Style cell determines if the run is bold, italic, or another style. The XML code for these rows follows:

<Char IX='0'>
      <Font>4</Font>
      <Color>0</Color>
      <Style>17</Style>
      <Case>0</Case>
      <Pos>0</Pos>
      <FontScale>1</FontScale>
      <Size Unit='PT'>0.1666666666666667</Size>
      <DblUnderline>0</DblUnderline>
      <Overline>0</Overline>
      <Strikethru>0</Strikethru>
      <Highlight>0</Highlight>
      <DoubleStrikethrough>0</DoubleStrikethrough>
      <RTLText>0</RTLText>
      <UseVertical>0</UseVertical>
      <Letterspace>0</Letterspace>
      <ColorTrans>0</ColorTrans>
      <AsianFont>0</AsianFont>
      <ComplexScriptFont>0</ComplexScriptFont>
      <LocalizeFont>0</LocalizeFont>
      <ComplexScriptSize>-1</ComplexScriptSize>
      <LangID>1033</LangID>
</Char>
<Char IX='1'>
      <Font>4</Font>
      <Color>0</Color>
      <Style>34</Style>
      <Case>0</Case>
      <Pos>0</Pos>
      <FontScale>1</FontScale>
      <Size Unit='PT'>0.1666666666666667</Size>
      <DblUnderline>0</DblUnderline>
      <Overline>0</Overline>
      <Strikethru>0</Strikethru>
      <Highlight>0</Highlight>
      <DoubleStrikethrough>0</DoubleStrikethrough>
      <RTLText>0</RTLText>
      <UseVertical>0</UseVertical>
      <Letterspace>0</Letterspace>
      <ColorTrans>0</ColorTrans>
      <AsianFont>0</AsianFont>
      <ComplexScriptFont>0</ComplexScriptFont>
      <LocalizeFont>0</LocalizeFont>
      <ComplexScriptSize>-1</ComplexScriptSize>
      <LangID>1033</LangID>
</Char>
Note

Although the text size is 12 points, Visio stores the value internally in inches. Because there are 72 points to an inch, a size of 12 points is written as 0.1666666666666667 inches.
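
The conversion in this note is plain arithmetic (72 points to the inch). A minimal Python sketch follows; the function names are illustrative and are not part of any Visio API.

```python
POINTS_PER_INCH = 72  # typographic points in one inch

def points_to_inches(points: float) -> float:
    """Convert a point size to the inch value stored in the Size element."""
    return points / POINTS_PER_INCH

def inches_to_points(inches: float) -> float:
    """Convert a stored inch value back to points."""
    return inches * POINTS_PER_INCH

# 12 pt is stored as roughly 0.1667 inches, matching the Size cells above.
print(points_to_inches(12))
print(round(inches_to_points(0.1666666666666667), 6))
```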

The two Char elements represent the two Character rows. Each has an IX attribute describing the relative order of the rows. Each Char element contains child elements that correspond to the cells viewed in the ShapeSheet window.

The Shape element also contains an element called Text, which contains the characters of the text and special elements (cp, pp, tp, and fld) that mark the end of one run and the beginning of the next. The Text element is somewhat unusual because it contains both data (the text characters) and child elements (cp, pp, and so on).

The Text element that describes the text on the preceding shape follows:

<Text><cp IX='0'/>Bold <cp IX='1'/>Italic</Text>

The <cp IX='0'/> tag indicates that the character properties from the first Char element, <Char IX='0'>, are to be applied to the text that follows. The <cp IX='1'/> tag indicates that the preceding character run, <Char IX='0'>, has ended and that the run attributed to <Char IX='1'> has begun.
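
Because the Text element mixes character data with marker elements, a reader has to walk the markers and collect the text that follows each one. The sketch below does this with Python's standard xml.etree.ElementTree, where the text after a child element is exposed as that child's tail. The function name character_runs is hypothetical, and the sketch makes two assumptions: the XML namespace is omitted (as in the snippets here), and any content inside an fld element is ignored.

```python
import xml.etree.ElementTree as ET

def character_runs(text_xml: str) -> list[tuple[int, str]]:
    """Return (Char row index, run text) pairs for a Text element."""
    text = ET.fromstring(text_xml)
    runs: list[tuple[int, str]] = []
    current_ix = 0  # assumption: text before any cp marker uses Char row 0

    def add(chunk: str) -> None:
        # Extend the last run if it uses the same Char row, else start a new one.
        if runs and runs[-1][0] == current_ix:
            runs[-1] = (current_ix, runs[-1][1] + chunk)
        else:
            runs.append((current_ix, chunk))

    if text.text:
        add(text.text)
    for child in text:
        if child.tag == "cp":
            current_ix = int(child.get("IX"))
        # pp, tp, and fld markers do not change the character row.
        if child.tail:
            add(child.tail)
    return runs

print(character_runs("<Text><cp IX='0'/>Bold <cp IX='1'/>Italic</Text>"))
```

For the shape above, this yields one run for Char row 0 ("Bold ") and one for Char row 1 ("Italic").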

Inheritance for text elements follows the standard inheritance rules. Text row elements (Char, Para, Tabs and Field) need to contain child elements only for elements whose values are different from their inherited value. For example, the statement

<Char IX='0'><Color>4</Color></Char>

is sufficient to specify that a Char element should have blue text; the remaining element values will automatically be inherited.
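
When generating DatadiagramML from code, this means a row only needs the cells that are overridden. A minimal sketch using Python's xml.etree.ElementTree follows; char_row is a hypothetical helper, and the namespace is omitted as in the snippets here.

```python
import xml.etree.ElementTree as ET

def char_row(ix: int, **overrides) -> ET.Element:
    """Build a Char element that contains only the overridden cells."""
    char = ET.Element("Char", IX=str(ix))
    for cell, value in overrides.items():
        ET.SubElement(char, cell).text = str(value)
    return char

# Only Color is written; every other cell is left to inheritance.
row = char_row(0, Color=4)
print(ET.tostring(row, encoding="unicode"))
# prints: <Char IX="0"><Color>4</Color></Char>
```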

Note

Microsoft® Visio® writes out all values, inherited or not, when it saves a DatadiagramML file.

The following special cases apply:

  • If the text row elements are in a master or a non-instance shape (a shape that is not an instance of a master), the values are inherited from the shape's text style, that is, the ID of the StyleSheet that is referenced in the shape's TextStyle attribute. If the shape does not contain a TextStyle attribute, the values are inherited from the default document style.

  • If the text row elements are in an instance shape (a shape that is an instance of a master), the values are inherited from the corresponding row in the master. If a corresponding row does not exist, values are inherited from the first row in the master. (An instance of a master that has a TextStyle attribute inherits text properties from the style instead of the master.)
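
The lookup order described by these special cases can be sketched as a small Python function. This is a simplified illustration, not Visio's implementation: rows and styles are modeled as plain dictionaries, each fallback source is assumed to be fully populated, "first row in the master" is taken to mean the row with the lowest IX, and all names are hypothetical.

```python
from typing import Optional

Row = dict[str, str]  # cell name -> value

def resolve_cell(cell: str, ix: int, local_rows: dict[int, Row], *,
                 master_rows: Optional[dict[int, Row]] = None,
                 text_style: Optional[Row] = None,
                 default_style: Row) -> str:
    """Return the effective value of one cell in text row `ix`."""
    # 1. A value written locally in the row always wins.
    local = local_rows.get(ix, {})
    if cell in local:
        return local[cell]
    # 2. A shape with a TextStyle attribute inherits from that style,
    #    even when the shape is an instance of a master.
    if text_style is not None:
        return text_style[cell]
    # 3. An instance without TextStyle inherits from the corresponding
    #    master row, or from the first master row if there is none.
    if master_rows:
        row = master_rows.get(ix, master_rows[min(master_rows)])
        return row[cell]
    # 4. Otherwise the default document style supplies the value.
    return default_style[cell]

default = {"Color": "0"}
print(resolve_cell("Color", 1, {}, master_rows={0: {"Color": "2"}},
                   default_style=default))
```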

When Visio loads text from an XML file that is created or edited outside of Visio (an untrusted file), the first thing it does is normalize the text data, that is, it fills in, adds, or reorders missing elements and removes unused elements.

Following are some of the ways in which Visio normalizes untrusted text data in a VDX file at load time.

  • When Visio emits a DatadiagramML file, all cp, pp, tp, and fld elements in the Text element have IX values that are sequential. In other words, a <cp IX='0'/> tag always precedes a <cp IX='1'/> tag, and so on.

  • If Visio encounters run markers that are out of order when it reads a DatadiagramML file, it inserts duplicates of the text run rows as needed. For example, say Visio reads the following file:

    <Char IX='0'> <Style> 0 </Style> </Char>
    <Char IX='1'> <Style> 1 </Style> </Char>
    
    <Para IX='0'> <HorzAlign> 0 </HorzAlign> </Para>
    <Para IX='1'> <HorzAlign> 1 </HorzAlign> </Para>
    
    <Text><cp IX='0'/><pp IX='0'/>The quick <cp IX='1'/>brown fox 
    <pp IX='1'/>jumped <cp IX='0'/>over the lazy dog.</Text>
    

    The code describes a text block that contains three character runs and two paragraph property runs. There is a single paragraph break after the word "fox." The block would be rendered like this:

    <media mediatype="gif" filename="XML_05_ZA07645871.gif" assetid="ZA07645871" lcid=" " alttext="The quick brown fox jumped over the lazy dog. This text block contains three character runs and two paragraph property runs."/>

    In the preceding example, Char row 0 is referenced twice by <cp IX='0'/>. Visio normalizes this by creating a third Char row, <Char IX='2'>, which is identical to Char row 0.

    When the file is resaved, it contains the third Char row and an additional cp element marking the third character run, which is demonstrated as follows. In addition, the XML file will contain all the child elements (Font, Size, FontScale, and so on) for all the Char and Para elements.

    <Char IX='0'>...<Style> 0 </Style>...</Char>
    <Char IX='1'>...<Style> 1 </Style>...</Char>
    <Char IX='2'>...<Style> 0 </Style>...</Char>
    
    <Para IX='0'>...<HorzAlign> 0 </HorzAlign>...</Para>
    <Para IX='1'>...<HorzAlign> 1 </HorzAlign>...</Para>
    
    <Text><cp IX='0'/><pp IX='0'/>The quick <cp IX='1'/>brown fox
    <pp IX='1'/>jumped <cp IX='2'/>over the lazy dog.</Text>
    
  • Although it is not an error to omit the marker for the first run, it is recommended that you include it. Visio always emits the initial marker element when the DatadiagramML data is round-tripped.

  • Paragraph run markers (<pp IX='1'/>) and tab run markers (<tp IX='1'/>) are valid only at the beginning of a paragraph. If such a marker is encountered in the middle of a paragraph, it is ignored. A new line is required before each pp and tp element.

  • Visio normalizes untrusted text data when it is loaded into the application. This means that Visio performs further processing on the normalized data, for example, local overrides or local deletions. When you edit or generate your own DatadiagramML files, you'll get better performance if the XML mimics what Visio generates.
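
The duplicate-row normalization described in the list above can be reproduced outside Visio, which is one way to emit XML that already matches what Visio would write. The following Python sketch is an illustration of the result shown in the example, not Visio's actual algorithm. It assumes that the namespace is omitted, that every character run has an explicit cp marker, and that each marker refers to an existing Char row. The name normalize_char_runs is hypothetical.

```python
import copy
import xml.etree.ElementTree as ET

def normalize_char_runs(shape: ET.Element) -> None:
    """Give every cp marker its own sequential Char row (edits shape in place)."""
    text = shape.find("Text")
    old_rows = shape.findall("Char")
    if text is None or not old_rows:
        return
    by_ix = {int(row.get("IX")): row for row in old_rows}
    position = list(shape).index(old_rows[0])  # where the Char rows start

    new_rows = []
    for marker in text.iter("cp"):  # markers in document order
        clone = copy.deepcopy(by_ix[int(marker.get("IX"))])
        new_ix = str(len(new_rows))
        clone.set("IX", new_ix)   # renumber the copied row
        marker.set("IX", new_ix)  # point the marker at the copy
        new_rows.append(clone)

    # Swap the old rows for the renumbered ones; unused rows are dropped.
    for row in old_rows:
        shape.remove(row)
    for offset, row in enumerate(new_rows):
        shape.insert(position + offset, row)

shape = ET.fromstring(
    "<Shape>"
    "<Char IX='0'><Style>0</Style></Char>"
    "<Char IX='1'><Style>1</Style></Char>"
    "<Text><cp IX='0'/>The quick <cp IX='1'/>brown fox\n"
    "jumped <cp IX='0'/>over the lazy dog.</Text>"
    "</Shape>"
)
normalize_char_runs(shape)
print([c.get("IX") for c in shape.findall("Char")])
```

After the call, the shape holds three Char rows (the third a copy of row 0), and the cp markers are numbered 0, 1, 2 in order.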

© 2014 Microsoft