Document IDs and the DocID Service in SharePoint Server 2010 (ECM)
Published: May 2010
The document ID feature creates identifiers that can be used to retrieve items independent of their current location. The document ID service that supports it generates and assigns document IDs. This topic describes document IDs: how they work, and how the service that supports their generation and assignment works.
Typically, users open items by taking an action that is initiated in the Web browser, such as navigating to a URL, which then instructs the Microsoft Office client application to open the file.
The static URL feature can be used in the Web browser to redirect users to the real URL of an item by way of an HTTP redirect or a server transfer call. Either way, it is not intended that the static URL works everywhere that the real URL for an item works. For example, selecting Open on the File menu in client applications does not handle cases where URLs are redirected when the Open method is called.
You can activate, administer, and deactivate the document ID service at the site-collection level. Static URLs work correctly at the site-collection level because the Web browser manages the redirect before it invokes the Office client application. This means that the client application sees only the real URL. When the document ID feature is activated, Microsoft SharePoint Server 2010 adds links to the Site Collection Settings page in the Central Administration user interface (UI) and enables the document ID service, which starts assigning document IDs in the site collection. The document ID service generates document IDs for all documents in the site collection, but it does not generate document IDs for other types of list items. Document IDs are generated every time an item is added, and existing IDs are overwritten by default unless the item that was created specifically instructs SharePoint Server 2010 not to overwrite its existing ID. During move operations, SharePoint Server 2010 keeps the document ID. During copy operations, SharePoint Server 2010 assigns a new document ID. You can control this by setting a Boolean value on the PersistID column.
When a document ID is assigned it is exposed as metadata, and the server exposes a static URL so that the item to which the document ID is assigned can be recognized by its document ID. The static URL attempts to retrieve the item either by searching for it or by looking it up.
Search administrators can configure the search service for looking up document IDs by adding the ID column as a managed search column and optionally creating a new search scope that is used to look up document IDs. SharePoint Server 2010 includes a Windows PowerShell 1.0 command that does this automatically.
Deactivating the feature removes the links from the Site Collection Settings page, makes the page that is used to look up document IDs unavailable, turns off the document ID service, and stops the assignment of document IDs. The server does not remove the columns that it adds at the site level when the feature is first activated, so the existing document IDs are preserved even after deactivation. After the feature is deactivated, users who try to use a static URL to look up an item by its document ID see an error message indicating that "This Site Collection is not configured to use document IDs."
When the document ID service is enabled, SharePoint Server 2010 adds new columns to the document content type and the document set content type. These columns store the document ID and expose the static URL. The server also adds the event receivers that assign document IDs. The service also includes a work-item job that assigns IDs to all existing items in the site collection. The server adds the following site columns to the site collection in a group named "document ID". Additionally, the site columns are added to the document content type and the document set content type at the site-collection level.
The DocID column stores the document ID that is assigned to the item. It has the following attributes:
Name: Document ID
Description: Used to locate this item independent of its current location.
Static URL Column
The Static URL column presents the URL for the item that is used to look up the document ID. It has the following attributes:
Name: Static URL
Description: Used to retrieve this item, independent of its current location.
The PersistID column is used by the document ID assignment logic to determine whether an existing document ID should be kept or reassigned. This column is hidden, does not render UI, and cannot be included in any view. It has the following attributes:
Description: Used to specify whether an item's current ID should be kept after it is copied to a new location.
Default value: False
In addition to adding the columns noted above to the document content type and the document set content type, SharePoint Server 2010 adds event receivers to the appropriate SharePoint Foundation 2010 events so that they run every time that a document or document set is uploaded to SharePoint Foundation 2010. The server uses synchronous event receivers such as ItemAdded(SPItemEventProperties) to ensure that document ID providers can use item metadata when assigning document IDs.
When items are added to a site collection, SharePoint Server 2010 assigns or reassigns document IDs to them.
When a new item is added, SharePoint Server 2010 first checks whether the item already has a document ID. If it does, the server checks the PersistID attribute. If the attribute is set to True, the server keeps the existing ID and resets the attribute to False. If the attribute is set to False, the server overwrites the existing ID with a new one from the specified provider. If the item does not already have a document ID, the server gets a document ID for the item from the specified provider, writes it to metadata, and sets the PersistID attribute to False.
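The assignment decision described above can be modeled in a few lines. The following is a minimal sketch, not SharePoint code: the Item class, the CounterProvider class, and the on_item_added function are hypothetical names introduced only for illustration, and the persist_id field stands in for the hidden PersistID column.

```python
from dataclasses import dataclass
from itertools import count
from typing import Optional


@dataclass
class Item:
    """Hypothetical stand-in for a document and its ID metadata."""
    name: str
    doc_id: Optional[str] = None
    persist_id: bool = False  # models the hidden PersistID column


class CounterProvider:
    """Hypothetical provider that hands out sequential IDs."""

    def __init__(self, prefix: str = "DOC") -> None:
        self._prefix = prefix
        self._next = count(1)

    def generate(self, item: Item) -> str:
        return f"{self._prefix}-{next(self._next)}"


def on_item_added(item: Item, provider: CounterProvider) -> Item:
    """Keep, assign, or reassign the document ID of a newly added item."""
    if item.doc_id is not None and item.persist_id:
        # The item asked to keep its ID: honor that once, then reset the flag.
        item.persist_id = False
        return item
    # No ID yet, or an existing ID that may be overwritten by default.
    item.doc_id = provider.generate(item)
    item.persist_id = False
    return item
```

In this model, an item that arrives with an ID and the flag set to False receives a fresh ID, which matches the default overwrite behavior for copies.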
The static URL value is not generated because SharePoint Server 2010 dynamically creates it when the field is rendered and viewed.
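Because the static URL is derived from the document ID, it can be computed on demand instead of being stored. The sketch below illustrates that idea only. The static_url function is hypothetical, and the redirect page path (_layouts/DocIdRedir.aspx) is an assumption that this topic does not state.

```python
from urllib.parse import quote


def static_url(site_collection_url: str, doc_id: str) -> str:
    """Build a location-independent URL from a document ID at render time."""
    base = site_collection_url.rstrip("/")
    # Assumption: the redirect page path is not given in this topic.
    return f"{base}/_layouts/DocIdRedir.aspx?ID={quote(doc_id, safe='')}"
```

Nothing has to be rewritten when the item moves, because the URL contains only the site collection address and the ID.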
The default behavior for assigning document IDs assumes that if an item exists and already has an ID, SharePoint Server 2010 should overwrite that ID with a new document ID. This happens when an existing item in SharePoint Foundation 2010 is copied: The copy keeps the same metadata as the original, including the document ID, but it still raises the ItemAdding(SPItemEventProperties) event.
SharePoint Server 2010 does not assume that the item is a copy when an item is added. Instead, it provides a way for custom solutions to be aware that they are implementing a "semantic move", which (from an object-model perspective) is a copy-and-delete operation that is used to override default copy logic and treat items and their associated document IDs as if the object model had completed a move operation on them.
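The difference between a plain copy and a "semantic move" can be sketched as follows. This is an illustrative model under stated assumptions, not the SharePoint object model: Item, Library, copy_item, and semantic_move are hypothetical names, and the persist_id field stands in for the PersistID column.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional


@dataclass
class Item:
    name: str
    doc_id: Optional[str] = None
    persist_id: bool = False  # models the hidden PersistID column


class Library:
    """Hypothetical library that assigns IDs whenever an item is added."""

    def __init__(self, prefix: str) -> None:
        self.prefix = prefix
        self.items: Dict[str, Item] = {}
        self._counter = 0

    def add(self, item: Item) -> Item:
        keep = item.doc_id is not None and item.persist_id
        if not keep:
            self._counter += 1
            item.doc_id = f"{self.prefix}-{self._counter}"
        item.persist_id = False  # the flag is consumed on every add
        self.items[item.name] = item
        return item


def copy_item(src: Library, dst: Library, name: str) -> Item:
    """Plain copy: metadata travels along, but the ID is overwritten."""
    return dst.add(replace(src.items[name], persist_id=False))


def semantic_move(src: Library, dst: Library, name: str) -> Item:
    """Copy with the persist flag set, then delete the original."""
    moved = dst.add(replace(src.items[name], persist_id=True))
    del src.items[name]
    return moved
```

The only difference between the two functions is the flag value on the copied item and the delete step, which is why the custom solution has to signal its intent explicitly.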
You can use custom code to prevent overwriting document IDs or IDs that were previously assigned to items that are being copied to SharePoint Server 2010 for the first time. For example, you can suspend all events by calling the DisableEventFiring() method in code before you copy. However, this approach is not advisable in cases where other event receivers need to run and the code exists only to preserve IDs.
SharePoint Server 2010 takes a two-part approach when the document ID service looks up document IDs to provide the best balance of document IDs that work immediately and those that work across broad scopes:
Search. Find an item across any location that belongs to the current search scope. Search generally performs better as a cross-list query. However, search is only as reliable as its last index. Therefore, if an item was added but has not yet been indexed by search, it does not appear in search results. Additionally, if an item was moved since the last time it was indexed by search, then the old (and now broken) URL appears in search results.
Lookup specific to the ID provider. When an item cannot be retrieved by using search (for example, if it has not been indexed yet), SharePoint Server 2010 calls back to the document ID provider and allows it to use its own lookup logic. This lets providers whose IDs must work before search has indexed the most recently added items look up those items themselves. The provider determines whether to perform lookups in this way and what the most effective logic is for doing so.
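The two-part lookup described above amounts to a search query with a provider fallback. The following minimal sketch assumes that the search index can be modeled as a dictionary from document ID to URL and that the provider exposes a single lookup callback. The resolve_document_id function and both parameters are hypothetical.

```python
from typing import Callable, Dict, Optional


def resolve_document_id(
    doc_id: str,
    search_index: Dict[str, str],
    provider_lookup: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Return the URL for a document ID, or None if it cannot be found."""
    # Step 1: search, which is broad but only as fresh as the last crawl.
    url = search_index.get(doc_id)
    if url is not None:
        return url
    # Step 2: let the provider apply its own lookup logic, for example
    # for items that were added after the last crawl.
    return provider_lookup(doc_id)
```

A caller would treat a None result as "not found". This sketch does not detect the stale-URL case, in which search returns the old location of an item that has since moved.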
You can use custom providers to assign document IDs to items. In some organizations, specific item metadata drives how IDs are assigned. This helps ensure that the ID conveys information about the item.
SharePoint Server 2010 supports using custom code plug-ins to provision document IDs. You can author your own custom providers by writing a class that implements the IIDProvider interface and then deploying and registering that provider in each site collection. After a custom provider is registered, the document ID service uses the custom provider instead of the default provider.
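To make the idea of a metadata-driven provider concrete, the following sketch models one in Python. It is not the SharePoint provider interface: ProviderSketch, DepartmentYearProvider, register_provider, and provider_for are hypothetical names, and the metadata keys (department, year, url) are assumptions chosen for illustration.

```python
from abc import ABC, abstractmethod
from typing import Dict, Mapping, Optional


class ProviderSketch(ABC):
    """Hypothetical contract: generate an ID and look an ID up again."""

    @abstractmethod
    def generate_id(self, metadata: Mapping[str, str]) -> str: ...

    @abstractmethod
    def lookup_url(self, doc_id: str) -> Optional[str]: ...


class DepartmentYearProvider(ProviderSketch):
    """Builds IDs such as HR-2010-0001 from item metadata."""

    def __init__(self) -> None:
        self._sequence: Dict[str, int] = {}
        self._urls: Dict[str, str] = {}

    def generate_id(self, metadata: Mapping[str, str]) -> str:
        key = f"{metadata['department'].upper()}-{metadata['year']}"
        number = self._sequence.get(key, 0) + 1
        self._sequence[key] = number
        doc_id = f"{key}-{number:04d}"
        self._urls[doc_id] = metadata["url"]  # remembered for lookups
        return doc_id

    def lookup_url(self, doc_id: str) -> Optional[str]:
        return self._urls.get(doc_id)


# Hypothetical per-site-collection registry: URL -> registered provider.
_providers: Dict[str, ProviderSketch] = {}


def register_provider(site_url: str, provider: ProviderSketch) -> None:
    _providers[site_url] = provider


def provider_for(site_url: str, default: ProviderSketch) -> ProviderSketch:
    """Return the registered provider, or the default if none is registered."""
    return _providers.get(site_url, default)
```

Because the ID is built from the department and year, a reader can tell something about the item from the ID alone, which is the motivation given above for custom providers.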