Document Converters in SharePoint Server 2010 (ECM)

Applies to: SharePoint Server 2010

In this article
Document Conversion Process Overview
Document Converter Scope
Document Conversion File Types
Converting IRM-Protected Files
Conversion Priority

A document converter is a custom executable file that takes a document of one file type and generates a copy of that file in another file type. For example, a document converter might take a Microsoft Excel 2010 file and use it to generate a file in a different format. Using document converters, you can transform your content into different versions to suit your business needs. Perhaps you want to convert draft documentation into a different, final format for long-term archiving, or perhaps you need to convert your internal documentation to a different format for placement on a customer-facing site.

Microsoft SharePoint Server 2010 includes an extensible framework for you to enable your own custom document converters for the document libraries in a Web application. The basic steps to develop a custom document converter are as follows:

  1. Ensure document conversion is enabled for your Web application.

  2. Create an executable file that can be called by using a specific command-line command.

  3. Package the executable file, along with a document converter definition file, as a Feature that can be deployed and activated at the Web-application level.

  4. Install and activate the document converter; you can further configure the document converter using the Central Administration user interface.

For more information about the command-line command to which document converters must respond, see Document Converter Run Command in SharePoint Server 2010 (ECM).

For more information about deploying document converters, see Document Converter Deployment in SharePoint Server 2010 (ECM).
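As a rough illustration of step 2, the following sketch shows how a converter executable might read its command line and write a converted copy. The switch names (-in, -out, -config, -log), the exit codes, and the SampleConverter class are assumptions made for this example; the actual contract is defined in Document Converter Run Command in SharePoint Server 2010 (ECM). The "conversion" is a placeholder that upper-cases text.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical converter skeleton. Switch names and exit codes are
// assumptions for illustration, not the documented run command.
public static class SampleConverter
{
    // Parses "-name value" pairs into a case-insensitive dictionary
    // keyed by the switch name without its leading dash.
    public static Dictionary<string, string> ParseArgs(string[] args)
    {
        var options = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        for (int i = 0; i + 1 < args.Length; i += 2)
        {
            if (!args[i].StartsWith("-", StringComparison.Ordinal))
                throw new ArgumentException("Expected a switch but found: " + args[i]);
            options[args[i].Substring(1)] = args[i + 1];
        }
        return options;
    }

    // Returns 0 on success, 1 for bad arguments, 2 if the input file is missing.
    public static int Run(string[] args)
    {
        Dictionary<string, string> options;
        try
        {
            options = ParseArgs(args);
        }
        catch (ArgumentException)
        {
            return 1;
        }

        string input;
        string output;
        if (!options.TryGetValue("in", out input) ||
            !options.TryGetValue("out", out output))
        {
            return 1;
        }

        if (!File.Exists(input))
            return 2;

        // Placeholder conversion: a real converter would transform
        // the file format here.
        string text = File.ReadAllText(input);
        File.WriteAllText(output, text.ToUpperInvariant());
        return 0;
    }
}
```

In an actual executable, Main would call SampleConverter.Run(args) and return the result as the process exit code.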

Document Conversion Process Overview

Because document conversions can be resource intensive, SharePoint Server 2010 relies on two services, DocConvLoadBalancer and DocConvLauncher, to manage the load balancing, prioritizing, and scheduling of the conversions. When a user initiates a document conversion, either through the user interface or the object model, SharePoint Server 2010 passes the document conversion request to these two services. The DocConvLauncher service is the one that actually calls the document converter. When called, the document converter takes the original file and generates a converted copy. SharePoint Server 2010 then takes the converted copy and performs certain post-processing actions on it. These actions include:

  • Adding the metadata from the original file to the converted copy if the default post-processing for document converters is used. In other cases, such as during smart client authoring, SharePoint Server 2010 uses non-default post-processing to add the metadata from the original file to the converted copy.

    Document converters transfer all fields that are on the old file to the new file as part of default post-processing.

    // Set required properties.
    foreach (SPField field in file.Item.Fields)
    {
        // Copy a field only if it is neither read-only nor hidden,
        // the child (converted copy) does not have a value yet,
        // and the parent (original file) does.
        if (!field.ReadOnlyField &&
            !field.Hidden &&
            newFile.Item[field.InternalName] == null &&
            file.Item[field.InternalName] != null)
        {
            newFile.Item[field.InternalName] = file.Item[field.InternalName];
            fUpdateNewFile = true;
        }
    }
    

    In smart client authoring, the default post-processing does not run, so it transfers no metadata to pages. This happens because smart client authoring runs its own post-processor, which returns false for runDefaultPostProcessing.

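    // Default post-processing runs unless a custom post-processor opts out.
    bool runDefaultPostProcessing = true;
    if (tp != null)
        tp.PostProcess(etr, cdti, out runDefaultPostProcessing);

    if (runDefaultPostProcessing)
    {
        // Default post-processing, including the field copy
        // shown earlier, runs only in this case.
    }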
    

    For more information about post-processing in SharePoint Server 2010, see the PostProcess() method.

  • Adding metadata that identifies the original file and document converter used to generate the converted copy.

  • Notifying the specified people that the conversion has been performed.

  • Placing the converted copy into the same document library as the original file.
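The field-copy rule used by default post-processing can be tried outside SharePoint with a small stand-in model. The sketch below is not the product code: SampleField and MetadataCopy are hypothetical types, and plain dictionaries take the place of the list items. It only mirrors the condition in the fragment above (writable, visible, set on the original, not yet set on the copy).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a list field; not the SharePoint SPField class.
public sealed class SampleField
{
    public readonly string InternalName;
    public readonly bool ReadOnly;
    public readonly bool Hidden;

    public SampleField(string internalName, bool readOnly, bool hidden)
    {
        InternalName = internalName;
        ReadOnly = readOnly;
        Hidden = hidden;
    }
}

public static class MetadataCopy
{
    // Copies a value when the field is writable and visible, the source
    // has a value, and the target does not. Returns the number of
    // fields copied.
    public static int CopyMissingFields(
        IEnumerable<SampleField> fields,
        IDictionary<string, object> source,
        IDictionary<string, object> target)
    {
        int copied = 0;
        foreach (SampleField field in fields)
        {
            if (field.ReadOnly || field.Hidden)
                continue;

            object sourceValue;
            object targetValue;
            bool sourceHasValue =
                source.TryGetValue(field.InternalName, out sourceValue) && sourceValue != null;
            bool targetHasValue =
                target.TryGetValue(field.InternalName, out targetValue) && targetValue != null;

            if (sourceHasValue && !targetHasValue)
            {
                target[field.InternalName] = sourceValue;
                copied++;
            }
        }
        return copied;
    }
}
```

Because values already present on the converted copy are never overwritten, a converter that sets its own metadata keeps it after post-processing in this model.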

For more information about the DocConvLoadBalancer and DocConvLauncher services, see Services Necessary for Document Conversion in SharePoint Server 2010 (ECM).

For more information about the command-line command to which document converters must respond, see Document Converter Run Command in SharePoint Server 2010 (ECM).

For more information about the post-processing actions that SharePoint Server 2010 performs on converted copies, see Converted Documents in SharePoint Server 2010 (ECM).

Document Converter Scope

Document converters are enabled at the Web-application level. After a document converter is activated for a Web application, the converter is available for every document library in every site in that Web application.

You cannot disable a document converter for a specific site or document library.

You can, however, prevent a document converter from being displayed in the user interface. In such a case, the document converter is accessible only through the SharePoint Server 2010 object model. For example, you might have a document converter that is used only by administrators as part of a batch process to archive items. You would not want other users to be able to employ this document converter through the user interface.

Document Conversion File Types

For a Web application, you can have multiple converters that take original documents of the same file type extension, and generate converted copies of the same file type extension. For example, you might have multiple converters that take an Excel file and convert it to a PowerPoint file. Each converter performs different conversion functions on the file, but in each case the final file type extension is the same. Because of this, you can have multiple converted copies of the same file type for the same original document.

SharePoint Server 2010 stores the GUID of the converter used to create each specific converted copy. It uses the GUID, rather than the file type extension, to determine whether a specific converter has been used to generate a converted copy.
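To see why the converter GUID is a better key than the file type extension, consider the following sketch. ConvertedCopy and ConversionHistory are hypothetical types invented for this example and do not reflect how SharePoint Server 2010 stores this information internally.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical record of one converted copy of an original document.
public sealed class ConvertedCopy
{
    public readonly Guid ConverterId;
    public readonly string FileName;

    public ConvertedCopy(Guid converterId, string fileName)
    {
        ConverterId = converterId;
        FileName = fileName;
    }
}

public static class ConversionHistory
{
    // Returns true if any existing copy was produced by the given converter.
    // The file extension is deliberately ignored.
    public static bool HasCopyFrom(IEnumerable<ConvertedCopy> copies, Guid converterId)
    {
        foreach (ConvertedCopy copy in copies)
        {
            if (copy.ConverterId == converterId)
                return true;
        }
        return false;
    }
}
```

Two copies that share an extension but come from different converters are distinguishable here, whereas a check based on the extension alone would treat them as the same conversion.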

Converting IRM-Protected Files

If a document is protected by Information Rights Management (IRM), any converted copy you create will also be IRM protected. If you have a document in an IRM-protected file format, and you select a converter that would generate a converted copy in a file format that is not IRM-protected, the conversion results in an error.

Conversion Priority

You can set the priority for each document conversion to one of three levels. The DocConversionLauncherService service considers document conversion priority when scheduling the order in which document conversion requests are fulfilled.

Following are the priority values.

1 (High)

Default priority for all document conversions initiated through the SharePoint Server 2010 user interface.

2 (Normal)

Default priority for all document conversions initiated using the SharePoint Server 2010 object model.

3 (Low)

Recommended priority level for large batches of document conversions. Document conversions can be resource intensive, especially when performed in batches.

You cannot explicitly set document conversion priority through the user interface.

You can explicitly set the priority of a document conversion request using the object model by setting the priority argument of the Convert method.
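The effect of the three priority levels on ordering can be sketched as follows. This is a simplified model, not the launcher service's actual scheduling algorithm, which this article does not describe: it assumes that requests with a lower priority number are handled first and that ties keep their arrival order. ConversionRequest and ConversionQueue are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical conversion request; Priority uses the values above
// (1 = High, 2 = Normal, 3 = Low).
public sealed class ConversionRequest
{
    public readonly string FileName;
    public readonly int Priority;

    public ConversionRequest(string fileName, int priority)
    {
        FileName = fileName;
        Priority = priority;
    }
}

public static class ConversionQueue
{
    // Orders requests by priority value. Enumerable.OrderBy is a stable
    // sort, so requests with equal priority keep their arrival order.
    public static List<string> OrderForProcessing(IEnumerable<ConversionRequest> requests)
    {
        return requests
            .OrderBy(r => r.Priority)
            .Select(r => r.FileName)
            .ToList();
    }
}
```

In this model, a large batch submitted at priority 3 does not delay a single conversion that a user starts from the user interface at priority 1.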

See Also

Concepts

SharePoint Server 2010 Document Converter Development Overview (ECM)

Custom Processing of Converted Documents in SharePoint Server 2010 (ECM)