Property Filtering
Properties are extracted from documents by filters implemented for specific document types. Some value-type properties are obtained by other means, such as the property-storage interfaces. The implementer of a custom IFilter interface can interpret the contents of a document type in any number of ways; the descriptions here represent "best practices" for an implementation.
The IFilter interface contains several methods that Indexing Service uses when filtering a document. The methods include:
- IFilter::Init, which initializes a filtering session and returns a value from the IFILTER_FLAGS enumeration through its pdwFlags parameter. If the IFILTER_FLAGS_OLE_PROPERTIES flag is set in that value, Indexing Service uses the IPropertySetStorage and IPropertyStorage interfaces to enumerate and access external value-type properties.
- IFilter::GetChunk, which positions the filter on the next "chunk" of the document and returns a description of it: chunk type (text or value), property name, and locale. Each chunk contains one document property.
- IFilter::GetText, which gets a text-type property from a chunk.
- IFilter::GetValue, which gets a value-type property from a chunk.
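The call sequence implied by these methods can be sketched in portable C++. This is only an illustrative model, not the real COM interface: the names MockFilter, ChunkInfo, StoredChunk, and DrainFilter are hypothetical, values are simplified to strings, and std::optional stands in for the COM status codes that the actual methods return. The real interface can also return the text of a chunk in several pieces, which this sketch does not model.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; NOT the real COM declarations.
enum class ChunkKind { Text, Value };

struct ChunkInfo {
    ChunkKind kind;        // text-type or value-type chunk
    std::string property;  // name of the property the chunk carries
    unsigned locale;       // locale identifier, for example 0x0409
};

struct StoredChunk {
    ChunkInfo info;
    std::string payload;   // text, or a value simplified to a string
};

class MockFilter {
public:
    explicit MockFilter(std::vector<StoredChunk> chunks)
        : chunks_(std::move(chunks)) {}

    // Models GetChunk: advance to the next chunk and describe it.
    // Returns an empty optional when the document is exhausted.
    std::optional<ChunkInfo> GetChunk() {
        if (next_ >= chunks_.size()) {
            current_.reset();
            return std::nullopt;
        }
        current_ = next_++;
        return chunks_[*current_].info;
    }

    // Models GetText: succeeds only while positioned on a text chunk.
    std::optional<std::string> GetText() const { return Payload(ChunkKind::Text); }

    // Models GetValue: succeeds only while positioned on a value chunk.
    std::optional<std::string> GetValue() const { return Payload(ChunkKind::Value); }

private:
    std::optional<std::string> Payload(ChunkKind wanted) const {
        if (!current_ || chunks_[*current_].info.kind != wanted) return std::nullopt;
        return chunks_[*current_].payload;
    }

    std::vector<StoredChunk> chunks_;
    std::size_t next_ = 0;
    std::optional<std::size_t> current_;
};

// Walks every chunk and picks the accessor that matches the chunk type.
std::vector<std::string> DrainFilter(MockFilter& filter) {
    std::vector<std::string> lines;
    while (auto info = filter.GetChunk()) {
        const bool isText = (info->kind == ChunkKind::Text);
        std::optional<std::string> data = isText ? filter.GetText() : filter.GetValue();
        assert(data.has_value());  // the accessor matches the chunk type
        const std::string kind = isText ? "text" : "value";
        lines.push_back(kind + " " + info->property + " = " + *data);
    }
    return lines;
}
```

The point of the sketch is the dispatch in DrainFilter: the chunk type reported by GetChunk decides whether the caller asks for text or for a value, and asking for the wrong one yields nothing.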
The following figure shows an example document. The external value-type property DocTitle (obtained using methods of the IPropertySetStorage and IPropertyStorage interfaces) and the internal value-type property Book (obtained as a result of a custom IFilter implementation) describe the document as a whole. The text-type properties Contents and Chapter describe the content of the document. When processing this document, the IFilter implementation identifies and extracts these properties.
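As a rough illustration, the four properties of the example document can be written out as data, together with where each one comes from. The types Property, Kind, and Source, the constant kOlePropertiesFlag, and the function ExampleDocumentProperties are hypothetical names chosen for this sketch. The constant mirrors the value 1 of IFILTER_FLAGS_OLE_PROPERTIES, and the order of the entries is an assumption, since the text does not specify one.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical constant mirroring IFILTER_FLAGS_OLE_PROPERTIES (value 1).
constexpr unsigned long kOlePropertiesFlag = 1;

enum class Kind { Value, Text };
enum class Source { PropertyStorage, Filter };  // external vs. internal

struct Property {
    std::string name;
    Kind kind;
    Source source;
};

// Lists the properties of the example document. DocTitle is included only
// when the flags reported at initialization request the property-storage
// pass; the other three come from the filter itself.
std::vector<Property> ExampleDocumentProperties(unsigned long initFlags) {
    std::vector<Property> props;
    if (initFlags & kOlePropertiesFlag) {
        props.push_back({"DocTitle", Kind::Value, Source::PropertyStorage});
    }
    props.push_back({"Book", Kind::Value, Source::Filter});
    props.push_back({"Chapter", Kind::Text, Source::Filter});
    props.push_back({"Contents", Kind::Text, Source::Filter});
    assert(props.size() >= 3);  // the filter-produced properties are always present
    return props;
}
```

Calling ExampleDocumentProperties(kOlePropertiesFlag) yields all four properties, while passing 0 omits DocTitle, which reflects that external value-type properties are reached only through the property-storage interfaces.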
