
STAT_CHUNK structure

Describes the characteristics of a chunk.

Syntax


typedef struct tagSTAT_CHUNK {
  ULONG           idChunk;
  CHUNK_BREAKTYPE breakType;
  CHUNKSTATE      flags;
  LCID            locale;
  FULLPROPSPEC    attribute;
  ULONG           idChunkSource;
  ULONG           cwcStartSource;
  ULONG           cwcLenSource;
} STAT_CHUNK;

Members

idChunk

Type: ULONG

The chunk identifier. Chunk identifiers must be unique for the current instance of the IFilter interface. Chunk identifiers must be in ascending order. The order in which chunks are numbered should correspond to the order in which they appear in the source document. Some search engines can take advantage of the proximity of chunks of various properties. If so, the order in which chunks with different properties are emitted will be important to the search engine.
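The ordering rule can be checked mechanically. The following is a minimal sketch, not part of Filter.h: chunkIdsAreValid is a hypothetical helper, and ULONG is a local stand-in so the code compiles without the Windows headers. It treats zero as invalid, in line with the description of idChunkSource below.

```cpp
#include <cstdint>
#include <vector>

// Stand-in for the Windows ULONG type so the sketch compiles without Filter.h.
using ULONG = std::uint32_t;

// Hypothetical helper: returns true when every identifier is nonzero and
// strictly greater than the one before it. Strictly ascending order also
// guarantees that the identifiers are unique.
bool chunkIdsAreValid(const std::vector<ULONG>& ids) {
    ULONG previous = 0;  // zero is used here as "no chunk seen yet"
    for (ULONG id : ids) {
        if (id <= previous) {
            return false;  // zero, duplicate, or out-of-order identifier
        }
        previous = id;
    }
    return true;
}
```

A caller could collect the idChunk value from each STAT_CHUNK returned by one IFilter instance and pass the list to this helper.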

breakType

Type: CHUNK_BREAKTYPE

The type of break that separates the previous chunk from the current chunk. Values are from the CHUNK_BREAKTYPE enumeration.

flags

Type: CHUNKSTATE

Flags that indicate whether this chunk contains a text-type or a value-type property. Flag values are taken from the CHUNKSTATE enumeration. If the CHUNK_TEXT flag is set, IFilter::GetText should be used to retrieve the contents of the chunk as a series of words. If the CHUNK_VALUE flag is set, IFilter::GetValue should be used to retrieve the value and it should be treated as a single property value. If the filter dictates that the same content be treated both as text and as a value, the chunk should be emitted twice in two different chunks, each with one flag set.
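The choice between GetText and GetValue can be sketched as a simple dispatch on the flags member. This is an illustration only: retrievalMethodFor is a hypothetical helper, the enumeration is a local stand-in, and its numeric values are assumed to match the CHUNKSTATE definition in Filter.h.

```cpp
#include <string>

// Local stand-in for the CHUNKSTATE enumeration. The numeric values are an
// assumption made for this sketch; include Filter.h in real code.
enum CHUNKSTATE {
    CHUNK_TEXT  = 0x1,
    CHUNK_VALUE = 0x2
};

// Hypothetical helper: names the IFilter method a caller would use to read
// a chunk with the given flags. The returned strings are labels only.
std::string retrievalMethodFor(CHUNKSTATE flags) {
    if (flags & CHUNK_TEXT) {
        return "IFilter::GetText";   // contents read as a series of words
    }
    if (flags & CHUNK_VALUE) {
        return "IFilter::GetValue";  // contents read as one property value
    }
    return "none";                   // neither flag is set
}
```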

locale

Type: LCID

The language and sublanguage associated with a chunk of text. Document indexers use the chunk locale to perform proper word breaking of text. If the chunk is neither text-type nor a value-type with data type VT_LPWSTR, VT_LPSTR, or VT_BSTR, this field is ignored.
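An LCID packs a primary language, a sublanguage, and a sort order into one value. The sketch below mirrors the arithmetic of the MAKELANGID and MAKELCID macros from the Windows headers using local stand-ins; the function and constant names (makeLangId, makeLcid, kEnglishUk, and so on) are hypothetical and exist only for this example.

```cpp
#include <cstdint>

// Stand-ins for the Windows WORD and LCID types.
using WORD = std::uint16_t;
using LCID = std::uint32_t;

// Same bit layout as MAKELANGID: sublanguage in the upper 6 bits,
// primary language in the lower 10 bits.
constexpr WORD makeLangId(WORD primary, WORD sub) {
    return static_cast<WORD>((sub << 10) | primary);
}

// Same bit layout as MAKELCID: sort order above the 16-bit language ID.
constexpr LCID makeLcid(WORD langId, WORD sortId) {
    return (static_cast<LCID>(sortId) << 16) | langId;
}

constexpr WORD kLangEnglish          = 0x09;  // LANG_ENGLISH
constexpr WORD kLangFrench           = 0x0c;  // LANG_FRENCH
constexpr WORD kSublangEnglishUk     = 0x02;  // SUBLANG_ENGLISH_UK
constexpr WORD kSublangFrenchBelgian = 0x02;  // SUBLANG_FRENCH_BELGIAN
constexpr WORD kSortDefault          = 0x0;   // SORT_DEFAULT

// Locale values for English (United Kingdom) and French (Belgium).
constexpr LCID kEnglishUk =
    makeLcid(makeLangId(kLangEnglish, kSublangEnglishUk), kSortDefault);
constexpr LCID kFrenchBelgian =
    makeLcid(makeLangId(kLangFrench, kSublangFrenchBelgian), kSortDefault);
```

With these inputs, kEnglishUk evaluates to 0x0809 and kFrenchBelgian to 0x080c.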

attribute

Type: FULLPROPSPEC

The property to be applied to the chunk. If a filter requires that the same text have more than one property, it must emit the text once for each property in separate chunks.
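The one-chunk-per-property rule can be sketched as follows. SketchChunk is a hypothetical, heavily simplified stand-in for STAT_CHUNK: the real attribute member is a FULLPROPSPEC, which is replaced here by a plain string so the example stays self-contained. emitOncePerProperty is likewise an illustrative name, not part of any API.

```cpp
#include <cstdint>
#include <string>
#include <vector>

using ULONG = std::uint32_t;  // stand-in for the Windows ULONG type

// Simplified stand-in for STAT_CHUNK. A string replaces FULLPROPSPEC.
struct SketchChunk {
    ULONG        idChunk;
    std::string  attribute;
    std::wstring text;
};

// Emits the same text once for each property, in separate chunks with
// ascending identifiers starting at firstId.
std::vector<SketchChunk> emitOncePerProperty(
        const std::wstring& text,
        const std::vector<std::string>& properties,
        ULONG firstId) {
    std::vector<SketchChunk> chunks;
    ULONG id = firstId;
    for (const std::string& property : properties) {
        chunks.push_back(SketchChunk{id, property, text});
        ++id;
    }
    return chunks;
}
```

Calling emitOncePerProperty(L"Confessions", {"CHAPTER_NAMES", "CONTENT"}, 4) yields two chunks with identifiers 4 and 5 that carry the same text under different properties.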

idChunkSource

Type: ULONG

The ID of the source of a chunk. The value of this member depends on the nature of the chunk:

  • If the chunk is a text-type property, the value of this member must be the same as the value of the idChunk member.
  • If the chunk is an internal value-type property derived from textual content, the value of this member is the chunk ID for the text-type chunk from which it is derived.
  • If the filter attributes specify that only internal value-type properties are returned, there is no content chunk from which to derive the current internal value-type property. In this case, the value of this member must be set to zero, which is an invalid chunk identifier.

cwcStartSource

Type: ULONG

The offset at which the source text for a derived chunk starts in the source chunk.

cwcLenSource

Type: ULONG

The length in characters of the source text from which the current chunk was derived. A zero value signifies character-by-character correspondence between the source text and the derived text. A nonzero value means that no such direct correspondence exists.

Remarks

The final three members (idChunkSource, cwcStartSource, and cwcLenSource) describe the source of a derived chunk; that is, a chunk that can be mapped back to a section of text. For example, the heading of a chapter can be both a text-type property and an internal value-type property (a heading). The value-type property "heading" would be a derived chunk. If the text of the current value-type chunk (from an internal value-type property) is derived from some text-type chunk, then that text must be emitted more than once.
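The rules for the three source members can be sketched with a few constructor-style helpers. SketchChunk and the three make... functions are hypothetical and cover only the members discussed here. Setting cwcStartSource and cwcLenSource to zero for chunks that are not derived is an assumption of this sketch, not something this page specifies.

```cpp
#include <cstdint>

using ULONG = std::uint32_t;  // stand-in for the Windows ULONG type

// Reduced stand-in for STAT_CHUNK with only the members discussed here.
struct SketchChunk {
    ULONG idChunk;
    bool  isText;          // true for text-type, false for value-type
    ULONG idChunkSource;
    ULONG cwcStartSource;
    ULONG cwcLenSource;
};

// Text-type chunk: idChunkSource must equal idChunk.
SketchChunk makeTextChunk(ULONG id) {
    return SketchChunk{id, true, id, 0, 0};  // zero offsets are an assumption
}

// Value-type chunk derived from a text-type chunk: idChunkSource is the
// identifier of that text-type chunk.
SketchChunk makeDerivedValueChunk(ULONG id, const SketchChunk& source,
                                  ULONG start, ULONG length) {
    return SketchChunk{id, false, source.idChunk, start, length};
}

// Value-type chunk with no content chunk to derive from: idChunkSource is
// zero, which is not a valid chunk identifier.
SketchChunk makeStandaloneValueChunk(ULONG id) {
    return SketchChunk{id, false, 0, 0, 0};
}
```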

The following segment is an example of how this might happen in a book.

The small detective exclaimed, "C'est fini!"

Confessions

The room was silent for several minutes. After thinking very hard about it, the young woman asked, "But how did you know?"

This segment might be broken into chunks in the following way.

ID | Text | BreakType | Flags | Locale | Attribute
1 | The small dete | N/A | CHUNK_TEXT | ENGLISH_UK | CONTENT
2 | ctive exclaimed, | CHUNK_NO_BREAK | N/A | N/A | N/A
3 | "C'est fini!" | CHUNK_EOW | CHUNK_TEXT | FRENCH_BELGIAN | CONTENT
4 | Confessions | CHUNK_EOC | CHUNK_TEXT | ENGLISH_UK | CHAPTER_NAMES
5 | Confessions | CHUNK_EOP | CHUNK_TEXT | ENGLISH_UK | CONTENT
6 | The room was silent for several minutes. | CHUNK_EOP | CHUNK_TEXT | ENGLISH_UK | CONTENT
7 | After thinking very hard about it, the young woman asked, "But how did you know?" | CHUNK_EOS | CHUNK_TEXT | ENGLISH_UK | CONTENT

Information provided by idChunkSource, cwcStartSource, and cwcLenSource is useful for a search engine that highlights hits. If the query is made against an internal value-type property, the search engine will highlight the original text from which the text of the internal value-type property was derived. For instance, with a C++ code filter, a browser searching for MyFunction in the internal value-type property "function definitions" will highlight the function header in the file.
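A highlighter could map a hit in a derived chunk back to the source text roughly as follows. This is a sketch under stated assumptions: sourceSpanToHighlight is a hypothetical helper, and it interprets a zero cwcLenSource (character-by-character correspondence) as meaning that the source span has the same length as the derived text.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

using ULONG = std::uint32_t;  // stand-in for the Windows ULONG type

// Hypothetical helper: returns the part of the source chunk's text that a
// derived chunk maps back to. An empty string is returned when the start
// offset lies outside the source text.
std::wstring sourceSpanToHighlight(const std::wstring& sourceText,
                                   ULONG cwcStartSource,
                                   ULONG cwcLenSource,
                                   std::size_t derivedTextLength) {
    if (cwcStartSource >= sourceText.size()) {
        return L"";
    }
    // Assumption: with a zero cwcLenSource the derived text corresponds
    // character by character, so its own length is used for the span.
    const std::size_t length =
        (cwcLenSource != 0) ? cwcLenSource : derivedTextLength;
    // substr clamps the length to the end of the source text.
    return sourceText.substr(cwcStartSource, length);
}
```

For a source chunk containing "void MyFunction(int x) { return; }" and a derived chunk whose text is MyFunction, a start offset of 5 and a zero cwcLenSource select MyFunction in the source.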

Requirements

Minimum supported client

Windows 2000 Professional [desktop apps only]

Minimum supported server

Windows 2000 Server [desktop apps only]

Redistributable

Windows NT 4.0 Option Pack

Header

Filter.h

See also

Reference
CHUNK_BREAKTYPE
CHUNKSTATE
IFilter
GetChunk
GetText
GetValue

 

 

© 2015 Microsoft