IWordBreaker::BreakText method

Parses text to identify words and phrases and provides the results to the IWordSink and IPhraseSink objects.

Syntax


HRESULT BreakText(
  [in] TEXT_SOURCE *pTextSource,
  [in] IWordSink   *pWordSink,
  [in] IPhraseSink *pPhraseSink
);

Parameters

pTextSource [in]

Type: TEXT_SOURCE*

Pointer to a TEXT_SOURCE structure that contains Unicode text.

pWordSink [in]

Type: IWordSink*

Pointer to the IWordSink object that receives and handles words generated by this method. NULL indicates that this method should identify phrases only.

pPhraseSink [in]

Type: IPhraseSink*

Pointer to the IPhraseSink object that receives and handles phrases generated by this method. NULL indicates that this method should identify individual words, not phrases.

Return value

Type: HRESULT

This method can return one of these values.

Return code     Description

S_OK            Operation was successful. No more text is available to refill the pTextSource buffer.

E_INVALIDARG    Invalid argument. The pTextSource parameter is NULL.

Remarks

Because word breakers parse for words more often than for phrases, you should optimize for the case where pPhraseSink is NULL. Either pWordSink or pPhraseSink can be NULL, but not both.
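
The argument rules above can be sketched as a small validation step at the top of a BreakText implementation. This is a minimal, portable sketch rather than real COM code: the stub types, the BreakMode enum, and CheckBreakTextArgs are hypothetical names introduced for illustration, and Hr stands in for HRESULT so the example compiles without the Windows SDK. Returning E_INVALIDARG when both sinks are NULL is an assumption; this page only documents that code for a NULL pTextSource.

```cpp
#include <cassert>
#include <cstdint>

// Portable stand-ins for HRESULT, S_OK and E_INVALIDARG.
using Hr = std::int32_t;
constexpr Hr kSOk = 0;
constexpr Hr kEInvalidArg = static_cast<Hr>(0x80070057u);

// Hypothetical placeholders for TEXT_SOURCE, IWordSink and IPhraseSink.
struct TextSourceStub {};
struct WordSinkStub {};
struct PhraseSinkStub {};

// Which kind of output the caller asked for.
enum class BreakMode { WordsOnly, PhrasesOnly, WordsAndPhrases };

// Validates the three BreakText arguments and reports the requested mode.
Hr CheckBreakTextArgs(const TextSourceStub* source,
                      const WordSinkStub* wordSink,
                      const PhraseSinkStub* phraseSink,
                      BreakMode* mode)
{
    assert(mode != nullptr);

    // Documented: a NULL text source is an invalid argument.
    if (source == nullptr) return kEInvalidArg;

    // Assumption: both sinks NULL is also reported as E_INVALIDARG.
    if (wordSink == nullptr && phraseSink == nullptr) return kEInvalidArg;

    // Test the common case first: words only, no phrase sink.
    if (phraseSink == nullptr)    *mode = BreakMode::WordsOnly;
    else if (wordSink == nullptr) *mode = BreakMode::PhrasesOnly;
    else                          *mode = BreakMode::WordsAndPhrases;
    return kSOk;
}
```

A real implementation could use the resulting mode to skip phrase detection entirely when no phrase sink is supplied, which is the case this section recommends optimizing for.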

The IWordSink object holds the words and their alternative forms for the word breaker. Alternative forms of a word, if they exist, are put in the IWordSink object first, by using the IWordSink::PutAltWord method, and the root word is added last, by using the IWordSink::PutWord method.
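
The call order described above (alternative forms first, root word last) might look like the following sketch. RecordingWordSink is a hypothetical stand-in that only records calls; it mirrors the parameter order of the IWordSink methods (character count, text, source length, source position) but returns void instead of HRESULT and is not a COM object. EmitWord and the sample words are likewise illustrative assumptions.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for IWordSink that logs each call in order.
struct RecordingWordSink {
    std::vector<std::wstring> calls;

    void PutAltWord(unsigned long cwc, const wchar_t* text,
                    unsigned long /*cwcSrcLen*/, unsigned long /*cwcSrcPos*/) {
        calls.push_back(L"alt:" + std::wstring(text, cwc));
    }
    void PutWord(unsigned long cwc, const wchar_t* text,
                 unsigned long /*cwcSrcLen*/, unsigned long /*cwcSrcPos*/) {
        calls.push_back(L"word:" + std::wstring(text, cwc));
    }
};

// Sends every alternative form first, then the root word last.
// srcLen and srcPos describe the span of source text all forms came from.
void EmitWord(RecordingWordSink& sink,
              const std::wstring& root,
              const std::vector<std::wstring>& alternatives,
              unsigned long srcLen, unsigned long srcPos)
{
    assert(!root.empty());
    for (const std::wstring& alt : alternatives) {
        sink.PutAltWord(static_cast<unsigned long>(alt.size()),
                        alt.c_str(), srcLen, srcPos);
    }
    sink.PutWord(static_cast<unsigned long>(root.size()),
                 root.c_str(), srcLen, srcPos);
}
```

For example, calling EmitWord(sink, L"colour", {L"color"}, 6, 0) records the PutAltWord call for "color" before the PutWord call for "colour".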

Use pfnFillTextBuffer, the function pointer element in the TEXT_SOURCE structure, to replenish the source text. The IWordBreaker::BreakText method must handle all pfnFillTextBuffer return values. If an error occurs, finish processing the text in the buffer before handling the error.
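
The refill rule above can be illustrated with a simplified loop. This is a sketch under several assumptions, not a real word breaker: TextSourceSketch mirrors the four TEXT_SOURCE members (pfnFillTextBuffer, awcBuffer, iEnd, iCur) with portable types, ChunkedSource and FillFromChunks are a hypothetical caller-side text supplier, kEndOfText is a stand-in for the SDK's WBREAK_E_END_OF_TEXT, and words are split on whitespace only. The point it shows is that when a refill fails, the text still in the buffer is processed before the status is acted on.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cwctype>
#include <string>
#include <utility>
#include <vector>

using Status = std::int32_t;                  // stand-in for SCODE/HRESULT
constexpr Status kOk = 0;
// Stand-in value; real code should use the SDK definition.
constexpr Status kEndOfText = static_cast<Status>(0x80041780u);

struct TextSourceSketch;
using FillFn = Status (*)(TextSourceSketch*);

struct TextSourceSketch {                     // mirrors TEXT_SOURCE
    FillFn pfnFillTextBuffer;
    const wchar_t* awcBuffer;
    unsigned long iEnd;                       // one past the last valid char
    unsigned long iCur;                       // first unprocessed char
};

Status FillFromChunks(TextSourceSketch* src);

// Hypothetical supplier that hands out text one chunk per refill.
struct ChunkedSource : TextSourceSketch {
    std::vector<std::wstring> chunks;
    std::size_t next = 0;
    std::wstring storage;                     // backs awcBuffer
    explicit ChunkedSource(std::vector<std::wstring> c)
        : TextSourceSketch{&FillFromChunks, L"", 0, 0}, chunks(std::move(c)) {}
};

// Keeps the unprocessed tail [iCur, iEnd) and appends the next chunk.
Status FillFromChunks(TextSourceSketch* src) {
    auto* self = static_cast<ChunkedSource*>(src);
    assert(self->iCur <= self->iEnd);
    if (self->next >= self->chunks.size()) return kEndOfText;  // buffer untouched
    self->storage = self->storage.substr(self->iCur, self->iEnd - self->iCur)
                  + self->chunks[self->next++];
    self->awcBuffer = self->storage.c_str();
    self->iCur = 0;
    self->iEnd = static_cast<unsigned long>(self->storage.size());
    return kOk;
}

// Splits on whitespace, refilling as needed. After a failed refill the
// remaining buffer is drained first, and only then is the status handled.
Status BreakWords(TextSourceSketch* src, std::vector<std::wstring>& words) {
    bool canRefill = true;
    Status status = kOk;
    for (;;) {
        unsigned long i = src->iCur;
        while (i < src->iEnd) {
            if (std::iswspace(src->awcBuffer[i])) { src->iCur = ++i; continue; }
            const unsigned long start = i;
            while (i < src->iEnd && !std::iswspace(src->awcBuffer[i])) ++i;
            // A word touching the buffer end may continue in the next refill.
            if (i == src->iEnd && canRefill) break;
            words.emplace_back(src->awcBuffer + start, i - start);
            src->iCur = i;
        }
        if (!canRefill) return status == kEndOfText ? kOk : status;
        status = src->pfnFillTextBuffer(src);
        if (status < 0) canRefill = false;    // drain the buffer, then report
    }
}
```

In this sketch, end of text is mapped to success, matching the S_OK description on this page, while any other failure status is returned to the caller after the buffered text has been emitted.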

Requirements

Minimum supported client

Windows 2000 Professional [desktop apps only]

Minimum supported server

Windows 2000 Server [desktop apps only]

Redistributable

Windows NT 4.0 Option Pack

Header

Indexsrv.h

See also

IWordBreaker
TEXT_SOURCE
