ScriptItemizeOpenType function

Breaks a Unicode string into individually shapeable items and provides an array of feature tags for each shapeable item for OpenType processing.

Syntax


HRESULT ScriptItemizeOpenType(
  _In_     const WCHAR          *pwcInChars,
  _In_           int            cInChars,
  _In_           int            cMaxItems,
  _In_opt_ const SCRIPT_CONTROL *psControl,
  _In_opt_ const SCRIPT_STATE   *psState,
  _Out_          SCRIPT_ITEM    *pItems,
  _Out_          OPENTYPE_TAG   *pScriptTags,
  _Out_          int            *pcItems
);

Parameters

pwcInChars [in]

Pointer to a Unicode string to itemize.

cInChars [in]

Number of characters in pwcInChars to itemize.

cMaxItems [in]

Maximum number of SCRIPT_ITEM structures defining items to process.

psControl [in, optional]

Pointer to a SCRIPT_CONTROL structure indicating the type of itemization to perform.

Alternatively, the application can set this parameter to NULL if no SCRIPT_CONTROL properties are needed. For more information, see the Remarks section.

psState [in, optional]

Pointer to a SCRIPT_STATE structure indicating the initial bidirectional algorithm state.

Alternatively, the application can set this parameter to NULL if the script state is not needed. For more information, see the Remarks section.

pItems [out]

Pointer to a buffer in which the function retrieves SCRIPT_ITEM structures representing the items that have been processed. The buffer should be (cMaxItems + 1) * sizeof(SCRIPT_ITEM) bytes in length. It is invalid to call this function with a buffer that holds fewer than two SCRIPT_ITEM structures. The function always adds a terminal item to the item analysis array so that the length of the item with zero-based index "i" is always available as:

pItems[i+1].iCharPos - pItems[i].iCharPos;

pScriptTags [out]

Pointer to a buffer in which the function retrieves an array of OPENTYPE_TAG structures representing script tags. The buffer should be cMaxItems * sizeof(OPENTYPE_TAG) bytes in length.

Note: When all characters in an item are neutral, the corresponding entry in this array is SCRIPT_TAG_UNKNOWN (0x00000000). This can happen, for example, if an item consists entirely of punctuation.

pcItems [out]

Pointer to the number of SCRIPT_ITEM structures processed.
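
The terminal item described for pItems makes per-item lengths easy to compute. The following is a minimal sketch, not part of the API: item_length is a hypothetical helper name, and the one-member SCRIPT_ITEM in the #else branch is a stand-in used only so the sketch also compiles off Windows (the real structure in Usp10.h has additional members).

```c
#include <assert.h>
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>
#include <usp10.h>
#else
/* Stand-in for non-Windows builds only. The real SCRIPT_ITEM (Usp10.h)
   also carries a SCRIPT_ANALYSIS member, omitted here. */
typedef struct { int iCharPos; } SCRIPT_ITEM;
#endif

/* Length, in characters, of item i (hypothetical helper).
   cItems is the count written to *pcItems; the array holds cItems + 1
   entries because the function appends a terminal item, so reading
   items[i + 1] is valid for every i in [0, cItems). */
int item_length(const SCRIPT_ITEM *items, int cItems, int i)
{
    assert(items != NULL && i >= 0 && i < cItems);
    return items[i + 1].iCharPos - items[i].iCharPos;
}
```

An application would typically call this in a loop over i from 0 to *pcItems - 1 when splitting the input string into runs.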

Return value

Returns S_OK (0) if successful, or a nonzero HRESULT value if it does not succeed. In all error cases, no items are fully processed and no part of the output contains defined values. The application can test the return value with the SUCCEEDED and FAILED macros.

The function returns E_OUTOFMEMORY if the size indicated by cMaxItems is too small. The application can try calling the function again with a larger buffer.
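
One way to act on E_OUTOFMEMORY is to grow both output buffers and call again. The sketch below is an illustration under assumptions, not a prescribed pattern: next_capacity and itemize_with_retry are hypothetical names, the starting size of 16 and the doubling policy are arbitrary choices, and psControl and psState are passed as NULL for brevity. The Windows-only part is guarded so the growth helper can be compiled anywhere.

```c
#include <limits.h>
#include <stdlib.h>

/* Next cMaxItems to try: at least 2 (smaller values are rejected), then
   doubling. Returns the input unchanged when doubling would overflow. */
int next_capacity(int cMaxItems)
{
    if (cMaxItems < 2) return 2;
    if (cMaxItems > INT_MAX / 2) return cMaxItems;
    return cMaxItems * 2;
}

#ifdef _WIN32
#include <windows.h>
#include <usp10.h>   /* link with Usp10.lib */

/* On success the caller owns *ppItems and *ppTags and releases them
   with free(). On failure nothing is returned through the pointers. */
HRESULT itemize_with_retry(const WCHAR *text, int cch,
                           SCRIPT_ITEM **ppItems, OPENTYPE_TAG **ppTags,
                           int *pcItems)
{
    int cMaxItems = 16;   /* arbitrary starting guess */
    for (;;) {
        /* Sizes follow the parameter descriptions: one extra SCRIPT_ITEM
           for the terminal item, exactly cMaxItems script tags. */
        SCRIPT_ITEM *items = malloc(((size_t)cMaxItems + 1) * sizeof *items);
        OPENTYPE_TAG *tags = malloc((size_t)cMaxItems * sizeof *tags);
        HRESULT hr;
        int next;

        if (items == NULL || tags == NULL) {   /* genuine allocation failure */
            free(items);
            free(tags);
            return E_OUTOFMEMORY;
        }
        hr = ScriptItemizeOpenType(text, cch, cMaxItems, NULL, NULL,
                                   items, tags, pcItems);
        if (SUCCEEDED(hr)) {
            *ppItems = items;
            *ppTags = tags;
            return hr;
        }
        free(items);
        free(tags);
        if (hr != E_OUTOFMEMORY) return hr;    /* not a capacity problem */
        next = next_capacity(cMaxItems);
        if (next == cMaxItems) return hr;      /* cannot grow any further */
        cMaxItems = next;
    }
}
#endif
```

Because outputs are undefined after any failure, the sketch discards both buffers before retrying rather than trying to reuse partial results.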

The function returns E_INVALIDARG if one or more of the following conditions occur:

  • pwcInChars is set to NULL
  • cInChars is 0
  • pItems is set to NULL
  • pScriptTags is set to NULL
  • cMaxItems < 2
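
Because an empty string (cInChars of 0) is rejected, callers that may see empty text often check the arguments first. The following sketch simply mirrors the conditions listed above; itemize_args_invalid is a hypothetical helper, and wchar_t and const void * are used in place of WCHAR, SCRIPT_ITEM *, and OPENTYPE_TAG * so that it compiles without the Windows headers.

```c
#include <stddef.h>
#include <wchar.h>

/* Returns 1 if the arguments match one of the documented E_INVALIDARG
   conditions, 0 otherwise. This checks only the listed conditions; it
   does not predict other failures such as E_OUTOFMEMORY. */
int itemize_args_invalid(const wchar_t *pwcInChars, int cInChars,
                         int cMaxItems,
                         const void *pItems, const void *pScriptTags)
{
    return pwcInChars == NULL
        || cInChars == 0
        || pItems == NULL
        || pScriptTags == NULL
        || cMaxItems < 2;
}
```

An application might treat a positive result for empty text as "nothing to itemize" and skip the call instead of reporting an error.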

Remarks

ScriptItemizeOpenType is preferred over the older ScriptItemize function. One advantage of ScriptItemizeOpenType is the availability of feature tags for each shapeable item.

See Displaying Text with Uniscribe for a discussion of the context in which this function is normally called.

The function delimits items by either a change of shaping engine or a change of direction.

The application can create multiple ranges, or runs that fall entirely within a single item, from each SCRIPT_ITEM structure retrieved by ScriptItemizeOpenType. However, it should not combine multiple items into a single run. When measuring or rendering, the application can call ScriptShapeOpenType for each run and must pass the corresponding SCRIPT_ANALYSIS structure in the SCRIPT_ITEM structure retrieved by ScriptItemizeOpenType.

If the text handled by an application can include right-to-left content, the application should pass the psControl and psState parameters when calling ScriptItemizeOpenType. This is not mandatory: the application can handle bidirectional text itself instead of relying on Uniscribe to do so. The psControl and psState parameters are also useful in some strictly left-to-right scenarios because some members, for example, the fLinkStringBefore member of SCRIPT_CONTROL, are not specific to right-to-left scripts. To have ScriptItemizeOpenType break the Unicode string purely by character code, the application sets psControl and psState to NULL.

The application can set all parameters to non-NULL values to have the function perform a full Unicode bidirectional analysis. To permit a correct Unicode bidirectional analysis, the SCRIPT_STATE structure should be initialized according to the reading order at paragraph start, and ScriptItemizeOpenType should be passed the whole paragraph. In particular, the uBidiLevel member should be initialized to 0 for left-to-right and 1 for right-to-left.
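
The initialization described above can be sketched as follows. This is a minimal illustration, assuming the application already knows the paragraph reading order; initial_bidi_level and init_bidi_itemization are hypothetical names, and all other members of both structures are simply zeroed, which a real application may need to adjust.

```c
#include <string.h>

/* Initial embedding level for a paragraph:
   0 for left-to-right, 1 for right-to-left. */
unsigned initial_bidi_level(int paragraph_is_rtl)
{
    return paragraph_is_rtl ? 1u : 0u;
}

#ifdef _WIN32
#include <windows.h>
#include <usp10.h>

/* Zero both structures, then set only the paragraph-level reading order.
   The results are intended to be passed as psControl and psState together
   with the text of the whole paragraph. */
void init_bidi_itemization(int paragraph_is_rtl,
                           SCRIPT_CONTROL *control, SCRIPT_STATE *state)
{
    memset(control, 0, sizeof *control);
    memset(state, 0, sizeof *state);
    state->uBidiLevel = (WORD)initial_bidi_level(paragraph_is_rtl);
}
#endif
```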

The fRTL member of SCRIPT_ANALYSIS is referenced in SCRIPT_ITEM. The fNumeric member of SCRIPT_PROPERTIES is retrieved by ScriptGetProperties. These members together provide the same classification as the lpClass member of GCP_RESULTS, referenced by lpResults in GetCharacterPlacement.

European digits U+0030 through U+0039 can be rendered as national digits, as shown in the following table.

SCRIPT_STATE.fDigitSubstitute | SCRIPT_CONTROL.fContextDigits | Digit shapes displayed for Unicode U+0030 through U+0039
FALSE | Any | European digits
TRUE | FALSE | As specified in uDefaultLanguage member of SCRIPT_CONTROL.
TRUE | TRUE | As prior strong text, defaulting to uDefaultLanguage member of SCRIPT_CONTROL.

In context digit mode, one of the following actions occurs:

  • If the script specified by uDefaultLanguage is in the same direction as the output, all digits encountered before the first letters are rendered in the language indicated by uDefaultLanguage.
  • If the script specified by uDefaultLanguage is in the opposite direction from the output, all digits encountered before the first letters are rendered in European digits.

For example, if uDefaultLanguage indicates LANG_ARABIC, initial digits are rendered as Arabic-Indic digits in a right-to-left embedding. However, they are rendered as European digits in a left-to-right embedding.
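
The digit-shape rules above can be restated as a small decision function. This is only a model of the documented behavior for reasoning and testing, not Uniscribe's implementation; the enum, the function name, and the three direction/context parameters are hypothetical, with "output direction" taken to mean the direction of the current embedding as in the LANG_ARABIC example.

```c
typedef enum {
    DIGITS_EUROPEAN,          /* U+0030 through U+0039 shown unchanged */
    DIGITS_DEFAULT_LANGUAGE,  /* shapes of SCRIPT_CONTROL.uDefaultLanguage */
    DIGITS_PRIOR_STRONG_TEXT  /* shapes follow the preceding strong text */
} digit_shapes;

/* Restates the table and the context digit mode rules as code. */
digit_shapes choose_digit_shapes(int fDigitSubstitute, int fContextDigits,
                                 int has_prior_strong_text,
                                 int default_lang_is_rtl, int output_is_rtl)
{
    if (!fDigitSubstitute)
        return DIGITS_EUROPEAN;            /* row 1: fContextDigits ignored */
    if (!fContextDigits)
        return DIGITS_DEFAULT_LANGUAGE;    /* row 2 */
    if (has_prior_strong_text)
        return DIGITS_PRIOR_STRONG_TEXT;   /* row 3, usual case */
    /* Row 3, digits before the first letters: compare the direction of
       the default language's script with the output direction. */
    return (!default_lang_is_rtl == !output_is_rtl)
        ? DIGITS_DEFAULT_LANGUAGE
        : DIGITS_EUROPEAN;
}
```

With an Arabic default language (right-to-left), leading digits map to DIGITS_DEFAULT_LANGUAGE in a right-to-left embedding and to DIGITS_EUROPEAN in a left-to-right one, matching the example above.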

For more information, see Digit Shapes.

The Unicode control characters and definitions, and their effects on SCRIPT_STATE members, are provided in the following table. For more information on Unicode control characters, see The Unicode Standard.

Unicode control character | Meaning | Effect on SCRIPT_STATE
NADS | Override European digits (NODS) with national digit shapes. | Set fDigitSubstitute.
NODS | Use nominal digit shapes, otherwise known as European digits. See NADS. | Clear fDigitSubstitute.
ASS | Activate swapping of symmetric pairs, for example, parentheses. For these characters, left and right are interpreted as opening and closing. This is the default. See ISS. | Clear fInhibitSymSwap.
ISS | Inhibit swapping of symmetric pairs. See ASS. | Set fInhibitSymSwap.
AAFS | Activate Arabic form shaping for Arabic presentation forms. See IAFS. | Set fCharShape.
IAFS | Inhibit Arabic form shaping, that is, ligatures and cursive connections, for Arabic presentation forms. Nominal Arabic characters are not affected. This is the default. See AAFS. | Clear fCharShape.

The fArabicNumContext member of SCRIPT_STATE supports the context-sensitive display of numerals in Arabic script text. It indicates whether digits are rendered using native Arabic script digit shapes or European digits. At the beginning of a paragraph, this member should normally be initialized to TRUE for an Arabic locale, or FALSE for any other locale. The function updates the script state as it processes strong text.

The output parameter pScriptTags points to an array whose entries are parallel to the items in pItems. For each item, this function retrieves a script tag that should be used for shaping in all subsequent operations.

A script tag is usually determined by ScriptItemizeOpenType from input characters. If the function retrieves a specific script tag, the application should pass it to other functions without change. However, when characters are neutral (for example, digits) and the script cannot be determined, the application should choose an appropriate script tag, for example, based on font and language associated with text.
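
A simple way to apply this rule is to leave specific tags untouched and substitute an application-chosen tag only where the function reported SCRIPT_TAG_UNKNOWN. The sketch below assumes a single fallback tag for the whole array, which is a simplification: fill_unknown_tags is a hypothetical helper, how the fallback is derived from font and language is left to the application, and the typedef and macro in the #else branch are stand-ins for non-Windows builds only.

```c
#ifdef _WIN32
#include <windows.h>
#include <usp10.h>
#else
/* Stand-ins for non-Windows builds only: a tag is a 32-bit value, and
   the unknown tag is 0x00000000 as stated in the parameter description. */
#include <stdint.h>
typedef uint32_t OPENTYPE_TAG;
#define SCRIPT_TAG_UNKNOWN 0x00000000
#endif

/* Replace only the unknown entries. Tags that the function determined
   from the input characters are passed through unchanged. */
void fill_unknown_tags(OPENTYPE_TAG *tags, int cItems, OPENTYPE_TAG fallback)
{
    int i;
    for (i = 0; i < cItems; ++i) {
        if (tags[i] == SCRIPT_TAG_UNKNOWN)
            tags[i] = fallback;
    }
}
```

An application with mixed fonts or languages would more likely choose the fallback per item rather than once per call.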

Important  Starting with Windows 8: To maintain the ability to run on Windows 7, a module that uses Uniscribe must specify Usp10.lib before gdi32.lib in its library list.

Requirements

Minimum supported client

Windows Vista [desktop apps only]

Minimum supported server

Windows Server 2008 [desktop apps only]

Redistributable

Usp10.dll version 1.600 or greater on Windows XP

Header

Usp10.h

Library

Usp10.lib

DLL

Usp10.dll

See also

Uniscribe Functions
Displaying Text with Uniscribe
Digit Shapes
ScriptItemize
ScriptPlaceOpenType
ScriptShapeOpenType
ScriptSubstituteSingleGlyph
SCRIPT_ANALYSIS
SCRIPT_CONTROL
SCRIPT_ITEM
SCRIPT_STATE

© 2015 Microsoft