Configure FAST Search Server for SharePoint to use a Third-Party IFilter
Published: July 2010
Applies to: Microsoft FAST Search Server 2010 for SharePoint
FAST Search Server 2010 for SharePoint can crawl and index a large number of file types out of the box or after you enable the Advanced Filter Pack. However, some file types require a third-party IFilter. If you use a third-party IFilter, you must register the IFilter with Windows Search, configure the file user_converter_rules.xml so that the item is processed correctly, and make sure that the file type is not excluded by the FAST Search Content Search Service Application. For more information, see Include a File Type in the Content Index (FAST Search Server 2010 for SharePoint).
This article describes how to configure FAST Search Server 2010 for SharePoint to use a third-party IFilter, as follows:
By default, the file name extension from the content repository is ignored and format detection is based on the raw contents of the item. Then, the item processing pipeline sets the file name extension correctly based on the actual content. The item processing pipeline uses the user_converter_rules.xml configuration file to pass through items with specific file name extensions to the appropriate IFilter. Because format detection is limited to file formats that are supported by default, you must update this configuration for a third-party IFilter to process the item correctly.
You can also use the user_converter_rules.xml configuration file to turn off automatic format detection for certain file name extensions. Doing this prevents content with a specific file type from being incorrectly classified based on the actual content of the file. A specific example is when you enable custom XML mapping in the pipeline. Some XML content may not have valid XML declarations and may contain element names that are frequently used in HTML. In this case, the crawled XML items might be mistaken for HTML items. See Custom XML Item Processing for an example of how to apply such a configuration.
To configure FAST Search Server to use a third-party IFilter
Install the custom IFilter on each server in the FAST Search Server 2010 for SharePoint farm. Depending on the installer, this step may automatically include all or parts of the next step.
Register the IFilter with Windows Search on each server in the FAST Search Server 2010 for SharePoint farm, following the steps in Registering Filter Handlers, to associate the file type with the third-party IFilter. Most third-party IFilter installers perform this step automatically. However, you should still verify that the registry entries are accurate.
Edit %FASTSEARCH%\etc\config_data\DocumentProcessor\formatdetector\user_converter_rules.xml on the FAST Search Server 2010 for SharePoint administration server. You must update the extension, MIME type, and format description that the third-party IFilter supports.
To modify a configuration file, verify that you meet the following requirement: you are a member of the FASTSearchAdministrators local group on the computer where FAST Search Server 2010 for SharePoint is installed.
Any changes that you make to this file will be overwritten and lost if you install a FAST Search Server 2010 for SharePoint update or service pack.
This configuration file is not backed up by the standard FAST Search Server 2010 for SharePoint backup procedure. To avoid losing your changes, ensure that you back up this file after you modify it.
Be sure to reapply your changes to the configuration file after you install a FAST Search Server 2010 for SharePoint update or service pack.
Back up the user_converter_rules.xml configuration file, as this file is not part of the configuration backup/restore process in FAST Search Server 2010 for SharePoint.
On the FAST Search Server 2010 for SharePoint administration server, run the command psctrl reset to reset all currently running item processors in the system.
The user_converter_rules.xml configuration file is read by the item processors on startup and after you run the command psctrl reset.
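The registry verification mentioned in the procedure above can be sketched in Python. This is a minimal sketch, not a supported tool. It assumes the IFilter is registered through the direct chain described in Registering Filter Handlers (the file name extension key points to a persistent handler, and the persistent handler points to the filter handler under the IFilter interface ID) and that the entries live under HKEY_LOCAL_MACHINE\SOFTWARE\Classes. The function names resolve_ifilter and read_hklm_default are hypothetical. Registrations that go through a document type or class ID instead of the extension key are not followed.

```python
from typing import Callable, Optional

# Interface ID of IFilter; it names the subkey under PersistentAddinsRegistered.
IID_IFILTER = "{89BCB740-6119-101A-BCB7-00DD010655AF}"
CLASSES = "SOFTWARE\\Classes"


def _key(*parts: str) -> str:
    """Join registry key parts with backslashes."""
    return "\\".join(parts)


def resolve_ifilter(
    ext: str, read_default: Callable[[str], Optional[str]]
) -> Optional[str]:
    """Return the CLSID of the filter handler registered for ext, or None.

    read_default takes a key path and returns that key's default value,
    or None if the key does not exist. Only the direct chain
    extension -> PersistentHandler -> PersistentAddinsRegistered is checked.
    """
    handler = read_default(_key(CLASSES, ext, "PersistentHandler"))
    if not handler:
        return None
    return read_default(
        _key(CLASSES, "CLSID", handler, "PersistentAddinsRegistered", IID_IFILTER)
    )


def read_hklm_default(path: str) -> Optional[str]:
    """Read the default value of a key under HKEY_LOCAL_MACHINE (Windows only)."""
    import winreg  # imported lazily so the module also loads on other platforms

    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, path) as key:
            value, _type = winreg.QueryValueEx(key, "")
            return value
    except OSError:
        return None
```

On a server in the farm, calling resolve_ifilter(".mp3", read_hklm_default) returns the filter handler CLSID if the chain is complete, or None if a link is missing. Passing the registry reader as a parameter keeps the lookup logic testable without a Windows registry.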
The following is the basic structure of the user_converter_rules.xml file.
<ConverterRules>
  <IFilter>
    <trust>
      <ext name='extensionName' mimetype='mimeType' />
    </trust>
  </IFilter>
  <MimeMapping>
    <mime type='mimeType' />
  </MimeMapping>
</ConverterRules>
For information about the XML syntax, see Item Conversion Rules Schema.
The following example passes .mp3 format files to the IFilter framework.
<ConverterRules>
  <IFilter>
    <trust>
      <ext name=".mp3" mimetype="audio/mpeg" />
    </trust>
  </IFilter>
  <MimeMapping>
    <mime type="audio/mpeg">MPEG Audio</mime>
  </MimeMapping>
</ConverterRules>
When this configuration is deployed, items with file name extension .mp3 are forwarded to the third-party IFilter that is registered with Windows Search for that extension. The MIME type is set to audio/mpeg, and the managed property named format will contain the string "MPEG Audio".
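Because the configuration file is not covered by the standard backup and must be re-edited after updates, it can help to script the change. The following Python sketch is one possible approach under stated assumptions, not part of the product. It copies the file to a timestamped backup and adds a trust rule and MIME mapping in the shape of the .mp3 example, only if they are not already present. The function names backup_file and add_ifilter_rule are hypothetical, and the element and attribute names are taken from the example above rather than from the full Item Conversion Rules Schema.

```python
import shutil
import time
import xml.etree.ElementTree as ET
from pathlib import Path


def backup_file(path: Path) -> Path:
    """Copy path next to itself with a timestamp suffix and return the copy."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = path.with_name(f"{path.name}.{stamp}.bak")
    shutil.copy2(path, target)
    return target


def add_ifilter_rule(xml_text: str, ext: str, mime: str, description: str) -> str:
    """Return xml_text with a trust rule and a MIME mapping added if missing."""
    root = ET.fromstring(xml_text)

    # Find or create <IFilter><trust>. Compare with None explicitly, because
    # an Element without children is falsy.
    ifilter = root.find("IFilter")
    if ifilter is None:
        ifilter = ET.SubElement(root, "IFilter")
    trust = ifilter.find("trust")
    if trust is None:
        trust = ET.SubElement(ifilter, "trust")
    if not any(e.get("name") == ext for e in trust.findall("ext")):
        ET.SubElement(trust, "ext", {"name": ext, "mimetype": mime})

    # Find or create <MimeMapping> and add the format description.
    mapping = root.find("MimeMapping")
    if mapping is None:
        mapping = ET.SubElement(root, "MimeMapping")
    if not any(m.get("type") == mime for m in mapping.findall("mime")):
        ET.SubElement(mapping, "mime", {"type": mime}).text = description

    return ET.tostring(root, encoding="unicode")
```

A typical use would be to call backup_file on user_converter_rules.xml, pass the file's text to add_ifilter_rule(text, ".mp3", "audio/mpeg", "MPEG Audio"), write the result back, and then run psctrl reset on the administration server. Note that this parser discards XML comments and the XML declaration, so review the output before you replace the original file.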