Enhancing the BDC model file for Search in SharePoint 2013

Published: July 16, 2012

Learn about the properties in the BDC metadata model that are applicable to BCS indexing connectors, which enable Search in SharePoint 2013 to crawl external data.

The connector framework in Search enables you to crawl external data and make it available in search results through BCS indexing connectors. The crawler uses the BCS indexing connector to communicate with the external data source. At crawl time, the crawler calls the BCS indexing connector to fetch the data from the external system and return it.

BCS indexing connectors are composed of the following:

  • BDC model file  The file that provides the connection information to the external system and the structure of the data.

  • Connector  The component containing the code that connects to the external system and parses the access URLs and BCS identifiers.

The BDC metadata model includes several properties that are applicable to Search, many of which are required to support BCS indexing connector crawling.

The following table describes the BDC model properties that are applicable to Search. For each property, the metadata object that it is set on appears in parentheses after the property name.

Table 1. Search properties for BDC model files

  • ShowInSearchUI (Model)  Specifies that a LobSystemInstance element in the model file should be displayed in the search user interface. This value is ignored for custom connectors.

  • InputUriProcessor (LobSystem)  Specifies the name of the class that processes the input URL before passing it to the connector. Applies to .NET and custom BCS indexing connectors. For more information, see Creating .NET and custom BCS indexing connector components.

  • OutputUriProcessor (LobSystem)  Specifies the name of the class that processes the output URL before passing it from the connector to the search system. Applies to .NET and custom BCS indexing connectors. For more information, see Creating .NET and custom BCS indexing connector components.

  • SystemUtilityTypeName (LobSystem)  Specifies the name of the class that implements the StructuredRepositorySystemUtility class. Applies to custom BCS indexing connectors. For more information, see Creating .NET and custom BCS indexing connector components.

  • Title (Entity)  Specifies the title of the external content type to display in search results.

  • DefaultLocale (Entity)  Specifies the locale string. You can override this value by using the LCIDField property or the CultureField property.

  • RootFinder (Method)  Specifies the Finder method to use to enumerate the items to crawl. For example, when connecting to a database, this could be the SELECT statement or the list of tables to crawl.

  • DirectoryLink (Method)  Specifies that BCS should navigate associations. Required for hierarchical crawling.

  • DeletedCountField (Method)  Specifies the deleted count value. This property is ignored unless it contains an integer greater than zero.

  • WindowsSecurityDescriptorField (Method)  Specifies the Windows security descriptor for the item. If not specified, the GetSecurityDescriptor method is called. If the GetSecurityDescriptor method is not defined, all external items are assigned the Everyone access control list (ACL).

  • AuthorField (Method)  Specifies the author name to display in search results.

  • DisplayUriField (Method)  Specifies the URL to display in search results. If specified, this property overrides the profile page URL provided by BCS. If not specified, the URL displayed in search results starts with bdc3://, and is not understood by the browser.

  • LastModifiedTimeStampField (Method)  Specifies the external item's timestamp to display in search results. This value is also used for incremental crawling.

  • DescriptionField (Method)  Specifies the description to display in search results.

  • LCIDField (Method)  Specifies the locale ID (LCID) for the DescriptionField. If this is not specified, the default word breaker is used.

  • CultureField (Method)  Specifies the culture for the DescriptionField.

  • Extension (Method)  Specifies the file name extension for the crawlable stream. If not specified, the default extension is .txt.

  • MimeType (Method)  Specifies the MIME type for the crawlable stream. If not specified, the default extension is .txt. If the Extension field and MimeType field are both specified, the value specified in the MimeType field is used.

  • UseClientCachingForSearch (Method)  Specifies whether the crawler caches the content during enumeration. If the content is cached, the crawler does not make another trip to the content source when it crawls individual items.

  • EnumerateIdsOnly (FilterDescriptor)  Specifies whether to return IDs only in the IDEnumerator.

  • CrawlStartTime (FilterDescriptor)  Contains the start time of the last crawl.

  • SynchronizationCookie (FilterDescriptor)  Specifies that the external content source returns a cookie after a crawl, which the indexing connector then resends during the next enumeration call. The external content source uses the cookie to determine what has changed since the last crawl. This property is used with ChangedIDEnumerator and DeletedIDEnumerator method instances.

  • Property (TypeDescriptor)  Specifies the struct array used by search for properties. Consists of PropertyName, PropertyValue, and PropertyCulture.

  • Text (TypeDescriptor)  Specifies the struct array used by search for attachments. Consists of TextExtension, TextContentType, and TextValue.
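To illustrate where these properties go, the following fragment is a minimal sketch of an entity for a database external system. The entity, method, and field names (Customer, ReadCustomers, LastModifiedDate, CustomerUrl, and so on) are hypothetical, required elements such as parameters and method instances are omitted, and the placement of each property follows the metadata object noted in the table above.

<!-- Minimal sketch only: names are hypothetical and most required elements are omitted. -->
<Entity Namespace="ContosoCRM" Name="Customer" Version="1.0.0.0" EstimatedInstanceCount="100000">
  <Properties>
    <!-- Title: the field whose value is displayed as the item title in search results -->
    <Property Name="Title" Type="System.String">CustomerName</Property>
  </Properties>
  <Identifiers>
    <Identifier Name="CustomerId" TypeName="System.Int32" />
  </Identifiers>
  <Methods>
    <Method Name="ReadCustomers">
      <Properties>
        <!-- RootFinder: marks this method as the enumeration entry point for the crawler -->
        <Property Name="RootFinder" Type="System.String"></Property>
        <!-- LastModifiedTimeStampField: timestamp shown in results and used for incremental crawls -->
        <Property Name="LastModifiedTimeStampField" Type="System.String">LastModifiedDate</Property>
        <!-- DisplayUriField: replaces the default bdc3:// URL in search results -->
        <Property Name="DisplayUriField" Type="System.String">CustomerUrl</Property>
      </Properties>
      <!-- Parameters and MethodInstances omitted for brevity -->
    </Method>
  </Methods>
</Entity>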

When you create a BDC model file for an external system that you want to enable for search, you can enhance the model file to optimize crawl performance. This section describes ways to modify the BDC model file to improve performance.

Use inline property I/O when retrieving large-scale data

In general, if some of the data returned for an item is large, retrieve it with one of the following specialized methods instead of returning it with the SpecificFinder method:

  • Use the BinarySecurityDescriptorAccessor method when passing a security access control list (ACL) instead of the WindowsSecurityDescriptor property.

  • Use the StreamAccessor method when passing streams.

Unless network latency is high, the performance gain usually outweighs the cost of the extra round trip to the external system.
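For example, a large document body might be exposed through a separate StreamAccessor method rather than as a field returned by the SpecificFinder. The following BDC model fragment is a rough sketch under that assumption; the method, parameter, and type descriptor names are hypothetical, and the exact type names depend on your connector type.

<!-- Minimal sketch only: a separate StreamAccessor method instance so that large content
     is fetched on demand instead of being returned by the SpecificFinder. -->
<Method Name="ReadCustomerDocument">
  <Parameters>
    <Parameter Direction="In" Name="customerId">
      <TypeDescriptor TypeName="System.Int32" IdentifierName="CustomerId" Name="CustomerId" />
    </Parameter>
    <Parameter Direction="Return" Name="documentStream">
      <TypeDescriptor TypeName="System.Byte[]" Name="DocumentStream" />
    </Parameter>
  </Parameters>
  <MethodInstances>
    <!-- The crawler calls this instance only when it needs the item's stream -->
    <MethodInstance Name="ReadCustomerDocumentInstance"
                    Type="StreamAccessor"
                    ReturnParameterName="documentStream"
                    ReturnTypeDescriptorPath="DocumentStream" />
  </MethodInstances>
</Method>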

Enumeration optimization when crawling external systems

Do not enumerate more than 100,000 items per call to the external system. Long-running enumerations can cause intermittent interruptions and prevent a crawl from completing. We recommend that your BDC model structure the data into logical folders that can be enumerated individually, as shown in the following example.

This example demonstrates enumerating against a database table that has one million rows but a fixed set of values in ColumnA. In this scenario, you can treat ColumnA as the external content type and write an enumerator for this set of values by using the following SQL statement.

SELECT DISTINCT(ISNULL(ColumnA, 'unknown')) AS ColumnA FROM table

Next, define the specific finder using the following SQL statement.

SELECT DISTINCT(ISNULL(ColumnA, 'unknown')) AS ColumnA FROM table WHERE ColumnA = @Value

Finally, you must define the association navigation operation, as follows.

SELECT * FROM table WHERE ColumnA = @Value

Any method should begin returning results within two minutes, or the crawler will cancel the call. For example, a complex SQL statement that uses a LIKE clause may take longer than two minutes to complete, and would cause the crawler to cancel the call.
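Assuming a SQL Server database LobSystem, the enumeration query shown above might be attached to the enumeration method roughly as follows. This is a sketch only: the method name is hypothetical, the RdbCommandText and RdbCommandType properties apply to database-based models, the type names are abbreviated, and the return parameter and method instance are omitted.

<!-- Minimal sketch only: the enumeration method that returns the distinct ColumnA values -->
<Method Name="EnumerateColumnAValues">
  <Properties>
    <!-- RootFinder: tells the crawler to use this method to enumerate the items to crawl -->
    <Property Name="RootFinder" Type="System.String"></Property>
    <Property Name="RdbCommandText" Type="System.String">SELECT DISTINCT(ISNULL(ColumnA, 'unknown')) AS ColumnA FROM table</Property>
    <Property Name="RdbCommandType" Type="System.Data.CommandType">Text</Property>
  </Properties>
  <!-- Return parameter and Finder method instance omitted for brevity -->
</Method>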

Improving crawl speed with the UseClientCachingForSearch property

The UseClientCachingForSearch property improves the speed of full crawls by caching the item during enumeration. Using this property is also recommended when implementing incremental crawls that are based on change logs, because it improves incremental crawl speed.

Important: If items are larger than 30 kilobytes on average, do not set this property, as it will lead to a significant number of cache misses and negate the performance gains.
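As a sketch, and following the Method placement noted in Table 1, the property could be declared on the enumeration method as shown below. The method name is hypothetical, and the property value is left empty in this sketch.

<!-- Minimal sketch only: enable client caching so the crawler does not re-fetch
     individual items that it already received during enumeration -->
<Method Name="EnumerateColumnAValues">
  <Properties>
    <Property Name="UseClientCachingForSearch" Type="System.String"></Property>
  </Properties>
</Method>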

If the repository uses NTLM authentication, we recommend that you specify PassThrough authentication for crawling.

Profile pages may require that you use the Secure Store Service because of the multi-hop delegation problem from the front-end web server. If you encounter this problem, you can optimize the crawl while still supporting profile pages by creating two similar LobSystemInstance instances. The first instance should use Secure Store Service authentication and should not contain the ShowInSearchUI property. The second instance should use PassThrough authentication and should contain the ShowInSearchUI property. Profile pages use the first LobSystemInstance instance, and the crawler uses the second instance.

Note: This requires that you set the ShowInSearchUI property at the LobSystemInstance level instead of at the LobSystem level.
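The following fragment is a rough sketch of that arrangement. The instance names are hypothetical, the connection and Secure Store properties are abbreviated to comments, and the AuthenticationMode values shown are an assumption about how the two instances would typically be configured.

<!-- Minimal sketch only: two instances over the same external system -->
<LobSystemInstances>
  <!-- First instance: used by profile pages; Secure Store credentials; no ShowInSearchUI -->
  <LobSystemInstance Name="ContosoForProfilePages">
    <Properties>
      <Property Name="AuthenticationMode" Type="System.String">WindowsCredentials</Property>
      <!-- Secure Store target application ID and connection properties go here -->
    </Properties>
  </LobSystemInstance>
  <!-- Second instance: used by the crawler; PassThrough authentication; ShowInSearchUI present -->
  <LobSystemInstance Name="ContosoForCrawl">
    <Properties>
      <Property Name="AuthenticationMode" Type="System.String">PassThrough</Property>
      <Property Name="ShowInSearchUI" Type="System.String"></Property>
      <!-- Connection properties go here -->
    </Properties>
  </LobSystemInstance>
</LobSystemInstances>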
