Search Customization and Development Options in SharePoint Server 2007

Summary: Examine the powerful platform of Microsoft SharePoint Server 2007 search for the end user and developer. Learn about the built-in Web Parts for search and writing search queries for your applications. (23 printed pages)

Robert L. Bogue, Thor Projects

January 2010

Applies to: Microsoft Office SharePoint Server 2007

Contents

  • Understanding SharePoint Search Architecture

  • Search Types

  • Search Concepts

  • Search Components

  • Customizing the Search User Interface

  • Developing for Search

  • Conclusion

  • Additional Resources

Understanding SharePoint Search Architecture

Microsoft SharePoint search refers to an entire set of technologies that work together to help users find information. Search consists of two distinct systems for searching, which together with the API provide a single experience. These two systems are built on a set of architectural layers and a set of services and infrastructure. Before you can customize SharePoint Server search, you need to understand how search works.

Search Types

Microsoft Office SharePoint Server 2007 performs full-text indexing, reaching into documents and extracting the document's information for querying later. The details of how SharePoint Server performs this indexing are addressed in the next section, Search Concepts.

In addition to the expected full-text indexing, a separate infrastructure operates at the same time on the metadata in the documents that are indexed. During the process of getting all of the text out of a document, certain pieces of the document are handled as special metadata, or data about the data. Metadata can be the date the document was created, the last time it was modified, who the author is, or what the title is. It might also be a custom piece of metadata such as a custom property in a Microsoft Office Word document.

Documents of all types have metadata. HTML documents have metadata in the form of <META> tags. In addition, Word documents and many other document file formats have custom properties.

During the crawling process, SharePoint Server search notes each of the custom properties it finds to create a list of what are known as crawled properties. Until they are mapped to a managed property, these properties are indexed only as part of the full-text index for the document. Each crawled property exists in a hierarchy that includes where the property was found in a document. So, for example, there is a crawled property for Web page titles and a separate crawled property for Word document titles. Although both properties contain titles, they exist separately in the crawled properties because the crawler finds them in different locations in the document.

Managed properties are an aggregation of one or more crawled properties and are created through the Web user interface on the Shared Services Provider (SSP). SharePoint Server includes several managed properties for items such as title and author. These are mapped to one or more crawled properties so that users can search for one managed property and have SharePoint Server try to find multiple kinds of documents. (To learn how to create managed properties, see the white paper Managing Enterprise Metadata with Content Types.)
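The relationship between crawled and managed properties can be pictured as a simple lookup. The following Python sketch is only an illustration of the idea, not SharePoint code: the crawled property names (such as html:title and office:title) and the helper managed_value are hypothetical, and the real mapping is configured through the SSP user interface.

```python
# Hypothetical crawled property names; the real ones are discovered by the crawler.
MANAGED_PROPERTY_MAP = {
    "Title": ["html:title", "office:title"],
    "Author": ["html:meta-author", "office:author"],
}


def managed_value(managed_name, crawled_values):
    """Return the first non-empty crawled value mapped to a managed property."""
    for crawled_name in MANAGED_PROPERTY_MAP.get(managed_name, []):
        value = crawled_values.get(crawled_name)
        if value:
            return value
    return None


# Two documents of different types expose their titles under different crawled properties.
web_page = {"html:title": "Search Overview"}
word_doc = {"office:title": "Search Design", "office:author": "Robert Bogue"}

print(managed_value("Title", web_page))
print(managed_value("Title", word_doc))
print(managed_value("Author", web_page))
```

Because both crawled properties map to the same managed property, a query against Title can match either kind of document.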

Managed properties work well, but you must consider carefully which properties are defined as managed properties. Unlike the full text of the document, which is indexed into a specific file on the file system of the index server and query server, managed properties are stored in the search database with the history of the crawl operations. This database is one of the busiest components in a SharePoint environment, and each managed property adds to the data recorded in this database.

As a result, adding managed properties creates a larger search database and even more I/O on it. If your search configuration is already struggling to keep up, you might find you need to delay the addition of managed properties until you can address the performance issues with search.

The good news, from the user perspective, is that SharePoint Server search transparently merges the results of a search from both the managed properties and the full text to generate the results of a query. End users do not need to know whether the search they have constructed is using the SharePoint Server search database on the computer running Microsoft SQL Server or the full-text index on the query server.

Conversely, if a user wants to query for specific managed properties, he or she can type them directly into any search box with the name of the managed property, a colon, and the value to look for in the property. This enables the user to fine-tune a full-text query by looking for a value in a managed property. For example, let's say that a user is looking for a document on search customization by Robert Bogue. The user could enter the query "search customization author:bogue" to search for documents that include "bogue" in the author managed property, and that contain both the words "search" and "customization" in the document.
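To make the structure of such a query concrete, the following Python sketch splits a query into its full-text terms and its property restrictions. It is a simplified illustration under stated assumptions: the function parse_keyword_query is hypothetical, it does not handle quoted phrases, and the actual parsing is performed inside SharePoint Server search.

```python
def parse_keyword_query(query):
    """Split a keyword query into free-text terms and property:value filters."""
    terms = []
    filters = {}
    for token in query.split():
        name, separator, value = token.partition(":")
        if separator and name and value:
            # A token such as author:bogue restricts a managed property.
            filters.setdefault(name.lower(), []).append(value)
        else:
            # Anything else is treated as a full-text term.
            terms.append(token)
    return terms, filters


terms, filters = parse_keyword_query("search customization author:bogue")
print(terms)
print(filters)
```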

Search Concepts

I have already discussed how crawled properties are aggregated into managed properties that the user (and developer) works with. This is, however, the output of a fairly extensive process that involves gathering the information first. Gathering the information is the job of protocol handlers and IFilters. After properties are gathered and defined as managed properties, the search administrator can use them to create search scopes. A search scope is a predefined list of attributes that defines a collection of items.

Search Scopes

You must explicitly define the managed properties that you want to use in scopes by using a check box in the user interface. Managed properties for use in scopes have additional optimizations so that users see the fastest response times possible. However, other than these additional optimizations, the managed properties that drive scopes are no different from other managed properties.

There are three kinds of scope:

  • Globally defined

  • Site collection defined

  • Dynamic

The first two kinds of scope are fairly easy to configure. They differ only in where they are defined. Globally defined search scopes are defined in the SSP, and are available to any site collection that uses the SSP. Site collection scopes are defined as a part of the site collection administration and are available only when searching from inside that site collection.

However, the third kind, the dynamic scope, differs because it is not controlled by the administrator. Dynamic scopes are added automatically by the system. The dynamically created scopes are the This Site and This List options that appear for searching. These use a special managed property named site, which is the URL of the item as it was crawled (or as transformed via server name mappings).
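Because the site property holds the URL of each item, a This Site scope can be thought of as a URL prefix test. The following Python sketch shows that idea only; the helper in_site_scope and the sample URLs are hypothetical, and the exact matching rules that SharePoint Server applies are an assumption here.

```python
def in_site_scope(item_url, site_url):
    """Return True when item_url is the site itself or lies beneath it."""
    base = site_url.rstrip("/").lower()
    url = item_url.lower()
    # Require a path separator after the base so that /sites/hr does not match /sites/hr2.
    return url == base or url.startswith(base + "/")


print(in_site_scope("http://portal/sites/hr/Docs/policy.docx", "http://portal/sites/hr"))
print(in_site_scope("http://portal/sites/hr2/Docs/plan.docx", "http://portal/sites/hr"))
```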

All of these options for specifying where to search are good, but each option still requires that the information must first be crawled. That process starts with the content source and rapidly gets handed over to a protocol handler.

Protocol Handlers

The process of crawling content begins with SharePoint Server search identifying the content sources. You can think of content sources as the list of starting addresses for the crawl. After SharePoint Server search has a starting address, it tries to determine how to get the content referred to in the URL. Most commonly the address starts with the http://, https://, or file:// protocol identifiers. The protocol handlers for these protocols are built into SharePoint Server, as is a protocol handler for Business Data Catalog (BDC) data (which is discussed in more detail later in this article).

The protocol handler's role in the crawling process is to return the data from the target of the URL. Whether the target is a file location, a Web location, or something else, the protocol handler is responsible for retrieving the content. You can write custom protocol handlers to get content from places that are not directly HTTP or file locations. For example, the Business Data Catalog protocol handler uses the Business Data Catalog application definitions to get information from databases or from Web services. You can write other custom protocol handlers to get data from specific back-end systems. This was common in previous versions of SharePoint, but has become less common because of the inclusion of the Business Data Catalog.

The protocol handler is also responsible for enumeration of additional links. For a file directory, the protocol handler must start by returning a list of the items in the directory. For a Web page, the protocol handler must start by returning a list of the links to which the page refers.
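The division of labor described above (start addresses, a handler chosen by protocol, and enumeration of further links) can be sketched as a small crawl loop. This Python sketch is an illustration only: FAKE_CONTENT, fetch_from_store, and crawl are hypothetical names, the content is simulated in memory, and real protocol handlers are components registered with the search service.

```python
from collections import deque
from urllib.parse import urlsplit

# Simulated content standing in for real Web servers and file shares.
# Each entry maps a URL to (content, links enumerated by the handler).
FAKE_CONTENT = {
    "http://intranet/": ("home page", ["http://intranet/a", "file://share/docs/"]),
    "http://intranet/a": ("page a", ["http://intranet/"]),
    "file://share/docs/": ("", ["file://share/docs/plan.doc"]),
    "file://share/docs/plan.doc": ("plan text", []),
}


def fetch_from_store(url):
    """Stand-in protocol handler: return the content and the links it enumerates."""
    return FAKE_CONTENT.get(url, ("", []))


# One handler per protocol identifier.
PROTOCOL_HANDLERS = {
    "http": fetch_from_store,
    "https": fetch_from_store,
    "file": fetch_from_store,
}


def crawl(start_addresses):
    """Visit every address reachable from the start addresses exactly once."""
    seen = set()
    visited = []
    queue = deque(start_addresses)
    while queue:
        url = queue.popleft()
        if url in seen:
            continue
        seen.add(url)
        handler = PROTOCOL_HANDLERS.get(urlsplit(url).scheme)
        if handler is None:
            continue  # No handler is registered for this protocol.
        content, links = handler(url)
        visited.append(url)
        queue.extend(links)
    return visited


print(crawl(["http://intranet/"]))
```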

IFilters

The protocol handler delivers the content, but it does not attempt to make sense of the file. If you have ever opened a Word 2003 file with a text editor, you know why; the raw data of a file can look radically different from the text in the document. The responsibility of parsing the raw data stream into a set of text and properties falls on an IFilter.

When SharePoint Server gets the file data back from the protocol handler, it tries to identify the type of the file. A set of mappings in the registry connects file types to the IFilters used to parse them.

IFilters are important because it is impossible to index content without them. By default, SharePoint Server provides support for most of the Microsoft Office file types, and the Microsoft IFilter Pack (2007 Office System Converter: Microsoft Filter Pack) adds the others. However, file formats such as Adobe Acrobat PDF files and others are not supported in either SharePoint Server or the Microsoft IFilter Pack. A PDF IFilter is available as a free download from Adobe (Adobe PDF IFilter v6.0), and for a small fee from Foxit (Foxit PDF IFilter).

You should also know that IFilters are responsible for emitting the properties of a document, and not all of them do. For example, the PDF file format enables custom properties to be embedded in the PDF document; however, the Adobe PDF IFilter does not yet emit custom properties. Therefore, when designing solutions, you must ensure that the file format you want to use supports custom properties and that the IFilter you are using for the file format can emit the custom properties.
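The following Python sketch illustrates the two points made in this section: a filter is chosen by file type, and only some filters emit properties. It is a toy model under stated assumptions. The .toy format, the filter functions, and the FILTERS table are hypothetical; real IFilters are COM components, and the real mappings live in the registry.

```python
import os


def plain_text_filter(raw):
    """Return the text of the file and no properties."""
    return raw.decode("utf-8", errors="replace"), {}


def toy_doc_filter(raw):
    """Parse a made-up format: an optional first line 'title=<value>', then body text."""
    text = raw.decode("utf-8")
    header, _, body = text.partition("\n")
    if header.startswith("title="):
        return body, {"title": header[len("title="):]}
    return text, {}


# Hypothetical stand-in for the registry mapping of file types to filters.
FILTERS = {".txt": plain_text_filter, ".toy": toy_doc_filter}


def extract(file_name, raw):
    """Return (text, properties), or None when no filter exists for the file type."""
    extension = os.path.splitext(file_name)[1].lower()
    file_filter = FILTERS.get(extension)
    if file_filter is None:
        return None  # Without a filter the content cannot be indexed.
    return file_filter(raw)


print(extract("plan.toy", b"title=Plan\nbody text"))
print(extract("notes.TXT", b"hi"))
print(extract("brochure.pdf", b"%PDF"))
```

In this model, the plain text filter plays the role of a filter that delivers text but never emits properties, which is why a property search can fail even though a full-text search succeeds.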

The Business Data Catalog (BDC) feature in Office SharePoint Server 2007 Enterprise enables SharePoint Server to connect to information in line-of-business (LOB) systems on other servers and platforms. In addition to the ability to display entities from the back-end system on profile pages, SharePoint Server also enables you to search those back-end systems.

The search crawler can be pointed at one or more BDC entities and can index the fields returned in the BDC entity definition. Then, these entities can be searched via the full-text search or, if mapped, by using property searches.

The key benefit of the ability to use the BDC for searching is that now you can integrate the structured information on your LOB systems with the unstructured information in your documents. This enables searches to find information anywhere in the organization.

Search Components

Now that you have an architectural understanding of how the major parts of search fit from a software perspective, you need to understand the indexing process from the perspective of how and when you can expect content to show up in the index. In addition, you need to know about the tools that SharePoint Server provides to return that content.

The Crawling Process

In the discussion of search concepts, I explained that the process starts with a content source that hands off URLs to the protocol handler to get the information, and then the content is passed to an IFilter for parsing. However, you also need to know which part of the infrastructure is responsible for this process, what can affect it, and what its limitations are.

There are two key server roles for search: the index server, which creates and maintains the index, and query servers, which respond to user queries. The index server crawls all of the content at the specified content sources and then gets that content into the SharePoint Server search database and the full-text index that is on every query server.

The important point here is that when there are multiple query servers, each server keeps a copy of the full content index. The index server propagates the index from itself to all of the query servers. (This process, plus the process of pulling all of the content locally to index it, can quickly overwhelm the network interfaces in the index server.)

The index server can start crawling in response to a schedule, a user request, or a program calling the appropriate API. The schedule can be set to run as often as every five minutes or as rarely as once every few months. If no schedule is set and no one manually initiates a crawl, the index server never crawls new content, and the information stored in the full-text index and search database becomes stale.

SharePoint Server can perform two kinds of crawl: full and incremental. The full crawl, as you might expect, crawls all of the content in the system. The incremental crawl crawls only the changed information. In the case of SharePoint sites, the incremental crawl takes advantage of the change log to identify what has changed in a site. In the case of a file share, directories are enumerated to determine if any files have changed since the last crawl. If you are incrementally crawling SharePoint Server sites, the process is very efficient. If you have to perform incremental crawls against file systems, the process is less efficient. Because of this, it is good practice to keep different types of content in different content sources so that you can individually control the frequency of crawling.
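The difference in efficiency comes from how much work is needed to find the changes. The following Python sketch contrasts the two approaches under simplifying assumptions: the change log is modeled as a list of (token, URL) pairs, the file share as a dictionary of modification times, and both function names are hypothetical.

```python
def incremental_from_change_log(change_log, last_token):
    """Read only the entries recorded after last_token; return them and the new token."""
    changed = [url for token, url in change_log if token > last_token]
    new_token = max([last_token] + [token for token, _ in change_log])
    return changed, new_token


def incremental_from_enumeration(listing, last_crawl_time):
    """Examine every file in the listing; return the changed files and the number examined."""
    examined = 0
    changed = []
    for url, modified in listing.items():
        examined += 1
        if modified > last_crawl_time:
            changed.append(url)
    return changed, examined


change_log = [(1, "a"), (2, "b"), (3, "a")]
print(incremental_from_change_log(change_log, 1))

listing = {"f1": 10, "f2": 20, "f3": 30}
print(incremental_from_enumeration(listing, 15))
```

With a change log, the work is proportional to the number of changes. With enumeration, every item must be examined even when nothing has changed.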

In earlier versions of SharePoint, you had to wait until the crawl finished for the index to be merged and propagated from the index server to the query servers. In Office SharePoint Server 2007 the indexes are continuously propagated during the indexing process.

Despite this dramatic improvement in the way that SharePoint Server search works, there is still the reality that content is not in the search index the moment that it is added to the system. You cannot expect your solutions to find the item via search immediately after it is added. It can take time to appear in the search index, depending on when the changed item was crawled. Therefore, when designing solutions for SharePoint Server search, it is critical to remember that there is a delay between when the content is added and when it is available to be searched.

The good news for developing, testing, and diagnosing problems with a system is that you can initiate an incremental crawl manually at any time and monitor its progress through the search crawl logs.

Search Web Parts

Although the search results page might appear to be the output of a single monolithic Web Part, in reality it is a well-coordinated ballet of Web Parts that communicate with one another to display the results of a search. Figure 1 shows the default search results page in edit mode. On this page, there are 11 instances of 9 different Web Parts. In this section, we review the role of each of these Web Parts and how they fit together.

Figure 1. Default search results page in edit mode

Search Core Results

The primary Web Part of the page, Search Core Results extracts the query string parameters and executes the query. It also provides all of the query details to the other Web Parts on the page. With the exception of federated results, none of the Web Parts will function without a Search Core Results Web Part on the page.

Search Box

This Web Part is unique for two reasons. First, it can exist on a page that is not a search results page, such as a search query page. Second, it is not the only control used to capture user queries. A delegate control is also used for the search box, which by default is displayed in the top-right corner of almost every page.

Search Summary

This Web Part remains mostly hidden except when search assumes that one or more of the search terms are misspelled. At that point, the Search Summary Web Part uses the Did You Mean feature and displays a "Did you mean" message to suggest a different word to the user.

Note

The values for the Did You Mean feature are based entirely on the crawled content in the search database and cannot be changed.

Search Statistics

This Web Part reports the number of results displayed on the page, the estimated number in the overall search, and the amount of time that the query took. The total number of results is only an estimate because it is prohibitively expensive from a resource perspective to do security trimming on large result sets. As a result, SharePoint Server performs security trimming only on the range of results that are being displayed.
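The following Python sketch shows why trimming only the displayed range leaves the total as an estimate. It is a model of the idea, not of the actual implementation: the function page_with_estimate and the can_read callback are hypothetical, and using the untrimmed hit count as the estimate is an assumption made for illustration.

```python
def page_with_estimate(ranked_hits, can_read, start, page_size):
    """Security-trim only as many hits as are needed to fill one page."""
    visible = []
    position = 0  # Index among the hits that the user is allowed to read.
    checked = 0   # Number of security checks actually performed.
    for hit in ranked_hits:
        checked += 1
        if not can_read(hit):
            continue
        if position >= start:
            visible.append(hit)
        position += 1
        if len(visible) == page_size:
            break
    # Assumption: the total is reported without trimming the remaining hits.
    estimated_total = len(ranked_hits)
    return visible, estimated_total, checked


hits = list(range(10))
print(page_with_estimate(hits, lambda hit: hit % 2 == 0, 0, 3))
```

In this example only five of the ten hits are checked, so the reported total of ten overstates the five results that the user could actually read.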

Search Action Links

This Web Part provides the ability to sort the results in a different order, as well as the ability to establish an alert or to get an RSS feed of search results.

Search High Confidence Results

This Web Part displays the search results that are included because there is an exact match in the predefined HighConfidenceMatching managed property.

Search Best Bets

This Web Part displays the search results that the site administrator has defined as highly relevant to the user's query. By default, SharePoint Server adds a star to the left of the keyword that is related to a set of best bet results.

Federated Results

Typically, this Web Part is used to display results from a source other than the local SharePoint farm, although it can also show results from the local farm. The Federated Results Web Part makes queries by using the OpenSearch 1.1 standard, which SharePoint Server search and many other search engines support. The default configuration on a search results page is to search Bing, but it can be configured to search other farms in the organization, other search platforms in the organization, or other public search engines. The inclusion of this Web Part means that it is not necessary to crawl content locally on the SharePoint farm to display the items in the results, as long as a server somewhere is crawling the content.
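An OpenSearch 1.1 location is described by a URL template in which placeholders such as {searchTerms} are replaced at query time, and a trailing question mark marks a placeholder as optional. The following Python sketch fills such a template. The function fill_template and the search.example.com address are hypothetical, and the sketch does not cover every part of the specification.

```python
import re
from urllib.parse import quote


def fill_template(template, **params):
    """Replace {name} and {name?} placeholders in an OpenSearch-style URL template."""
    def substitute(match):
        name, optional = match.group(1), match.group(2)
        if name in params:
            return quote(str(params[name]), safe="")
        if optional:
            return ""  # Optional parameters may be left empty.
        raise KeyError(name)  # A required parameter was not supplied.

    return re.sub(r"\{(\w+)(\?)?\}", substitute, template)


# Hypothetical federated location; only the placeholder syntax follows OpenSearch 1.1.
template = "http://search.example.com/results?q={searchTerms}&start={startIndex?}&format=rss"
print(fill_template(template, searchTerms="search customization"))
```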

Top Federated Results

This Web Part displays the results from the first federated location to return search results. You can configure multiple locations for the Web Part in priority order. By default, there are no locations configured for this Web Part.

Search Paging

The Search Paging Web Part provides the page number display and navigation between different pages in the search results. This Web Part emits the set of links to each of the next few pages and a next page link.
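The window of page links can be computed from the current page and the limits on links before and after it. The following Python sketch is one plausible way to do this; the function page_links and its return shape are hypothetical, and the parameters max_before and max_after are modeled on the Maximum page links before current and Maximum page links after current properties of the Web Part.

```python
def page_links(current, total_pages, max_before, max_after):
    """Return the page numbers to link to and whether previous and next links apply."""
    first = max(1, current - max_before)
    last = min(total_pages, current + max_after)
    return {
        "previous": current > 1,
        "pages": list(range(first, last + 1)),
        "next": current < total_pages,
    }


print(page_links(5, 20, 2, 3))
print(page_links(1, 3, 4, 4))
```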

Customizing the Search User Interface

The search features of SharePoint Server make it easy to customize the search user interface. This section describes the options that do not require programming for customizing the search experience. The easiest way to change the search interface is to change the arrangement and properties of the built-in search Web Parts. Another simple way to customize the search interface is to use the query string processing that the Search Core Results Web Part performs by default. Finally, you can customize the appearance of results by customizing the XSLT transformation that the search Web Parts are using. I address each of these techniques in the following sections.
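As an illustration of the query string approach, the following Python sketch builds a link to a results page. It rests on assumptions that you should verify in your own environment: the k parameter for keywords reflects the URLs commonly seen on Office SharePoint Server 2007 results pages, the s parameter is the scope parameter named in the Search Box settings, and the page path and the function build_results_url are hypothetical.

```python
from urllib.parse import quote, urlencode


def build_results_url(results_page, keywords, scope=None):
    """Build a results page URL with the query (k) and an optional scope (s)."""
    params = {"k": keywords}
    if scope:
        params["s"] = scope
    # quote_via=quote encodes spaces as %20 instead of plus signs.
    return results_page + "?" + urlencode(params, quote_via=quote)


# Hypothetical results page path and scope name.
print(build_results_url(
    "/SearchCenter/Pages/results.aspx",
    "search customization author:bogue",
    "All Sites",
))
```

A link built this way can be placed on any page to send users to a predefined search without any custom server code.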

Customizing Search Properties

Each Web Part has a set of unique properties that enable it to be configured to fit a variety of scenarios. Table 1 describes the properties that are exposed through the Editor tool pane and how those properties affect the output of the Web Part.

Table 1. Search Web Part properties

Web Part Name

Property Category

Property Name

Property Type

Comments

Search Box

Scopes Dropdown

Dropdown mode

Choice

Options: Do not show scopes drop-down list; Show scopes drop-down list; Show, and default to 's' URL parameter; Show and default to contextual scope; Show, do not include contextual scopes; Show, do not include contextual scopes, and default to 's' URL parameter.

Dropdown label

String

The text to display before the scopes drop-down list.

Fixed dropdown width (in pixels)

String (Integer)

The fixed width of the scopes drop-down list.

Query Text Box

Query text box label

String

The text to display before the query text box.

Query text box width (in pixels)

String (Integer)

The fixed width of the query text box.

Additional query terms

String

A set of additional keywords that the search box automatically adds to the user's entry.

Additional query description label

String

Query box prompt string

String

The initial value in the box when the page is displayed.

Append additional terms to query

Boolean

Indicates whether the additional terms are visible to the user.

Miscellaneous

Search button image URL

String (URL)

A URL to the image to use instead of text for a button.

Use site level defaults

Boolean

Indicates that the defaults should come from the site collection settings.

Display advanced search link

Boolean

Indicates that the advanced search link should be displayed.

Advanced search page URL

String (URL)

The target URL for the advanced search link.

Target search results page URL

String (URL)

The page to go to for results.

Display submitted search

Boolean

Shows the current search if set. Otherwise, the prompt string is displayed.

Scope display group

String

The name of the predefined ordered scope group to use in the scopes drop-down list, as defined at the site collection level.

Search summary

Query Summary

Display Mode

Choice

Options:

  • Compact

  • Extended

Miscellaneous

Show Messages

Boolean

Causes the Web Part to display error messages when an error occurs.

Cross-Web Part query ID

Choice

Options:

  • User query

  • Query 2

  • Query 3

  • Query 4

  • Query 5

Search Statistics

Result Statistics

Display Mode

Choice

Options:

  • One line

  • Two lines

Display number of results on page

Boolean

Indicates whether to display the results estimate.

Display total number of results

Boolean

As mentioned earlier, this is an estimate and for some situations might need to be suppressed.

Display search response time

Boolean

Indicates whether the statistics related to the time to execute the query are returned.

Miscellaneous

Cross-Web Part query ID

Choice

Options:

  • User query

  • Query 2

  • Query 3

  • Query 4

  • Query 5

Search Action Links

Search Results Action Links

Cross-Web Part query ID

Choice

Options:

  • User query

  • Query 2

  • Query 3

  • Query 4

  • Query 5

Display "Relevance" View Option

Boolean

Indicates whether the sort by relevance option is displayed.

Display "Modified Date" View Option

Boolean

Indicates whether the sort by modified date is displayed.

Display "Alert Me" Link

Boolean

This should be turned off if search alerts are disabled.

Display "RSS" Link

Boolean

Indicates that the RSS link should be displayed

Data View Properties

XSL Editor

Multi-Line

The XSLT used for the transformation is included here.

Search High Confidence Results

Results Display

Cross-Web Part query ID

Choice

Options:

  • User query

  • Query 2

  • Query 3

  • Query 4

  • Query 5

Keywords

Display keyword

Boolean

Displays the keyword that was matched.

Display definition

Boolean

Displays the definition of the keyword that was matched. Each keyword has an optional definition (or description).

Best Bets

Display title

Boolean

Displays the title of the best bet.

Display description

Boolean

Displays the description of the best bet.

Display URL

Boolean

Displays the URL of the best bet.

Best Bets limit

String (Integer)

The default value is three items. If a keyword can have more than three best bets, this value must be adjusted.

High Confidence Matches

Display title

Boolean

Displays the title of the high-confidence match.

Display image

Boolean

Displays the image associated with the high-confidence match.

Display description

Boolean

Displays the description associated with the high-confidence match.

Display properties

Boolean

Displays the properties of the high-confidence match.

Maximum matches per High Confidence Type

String (Integer)

The maximum number of results for a high-confidence match.

Miscellaneous

Show Messages

Boolean

Indicates whether to display error messages.

Sample Data

String (XML)

This value has no effect

XSL Link

String (URL)

The URL of the XSLT to use for transforming the results.

Enable Data View Caching

Boolean

Enables caching for the Web Part

Data View Caching Time-out (seconds)

String (Integer)

The default, 86400, is 24 hours. This might need to be shortened if you frequently change best bets.

Send first row to connected Web Parts when the page loads

Boolean

Some connected Web Parts need the first row of data (the schema) when they are first connected. By clearing this option, the Web Part can perform slightly faster.

Data View Properties

XSL Editor

Multi-Line

The XSLT to transform the results, if provided via the XSLT link in the Miscellaneous settings group.

Top Federated Results

Location Properties

Location

Complex

This setting is a set of locations for the Web Part to query. The default is two locations but this can be added to or removed from.

Display Properties

Results Per Page

String (Integer)

By default this is set to one item. This may not be enough for every situation.

Limit Characters In Summary

Boolean

Indicates whether to limit the summary length.

Characters in Summary

String (Integer)

The maximum number of characters to display (if enabled).

Limit Characters in URL

Boolean

Indicates whether to limit the URL length (as displayed).

Characters in URL

String (Integer)

The maximum number of characters to display in the URL.

Use Location Visualization

Boolean

Selects whether the link to the XSLT or the embedded XSLT is used.

Fetched Properties

String (XML)

A list of the properties to return.

XSL Editor

Multi-Line

The XSLT to transform the results.

Parameters Editor

Complex

Result Query Options

Remove Duplicate Results

Boolean

Indicates that duplicate results should be merged by the query engine.

Enable Search Term Stemming

Boolean

Turns on word stemming support for the search terms. This is different from wildcarding.

Ignore Noise Words

Boolean

Specifies whether noise words are removed from the query.

Fixed Keyword Query

String

Specifies to fix the search to specific criteria instead of by using the query string.

Append Text to Query

String

Words that can be added to the user's query. This can be used to provide some additional keywords based on context that the user may not think to enter.

More Result Link Options

Show More Results Link

Boolean

Shows the more results link.

More Results Text Link Text Label

String

The label of the more results link.

Miscellaneous

Show messages

Boolean

Indicates that the Web Part should display error messages if an error occurs.

Sample Data

String (XML)

This setting has no effect.

XSL Link

String (URL)

A link to the XSLT to use to transform the results.

Enable Data View Caching

Boolean

Enables caching for the Web Part.

Data View Caching Time-out (seconds)

String (Integer)

The default, 86400, is 24 hours. This might need to be shortened if you frequently change best bets.

Send first row to connected Web Parts when Page Loads

Boolean

Some connected Web Parts need the first row of data (the schema) when they are first connected. By clearing this option, the Web Part can perform slightly faster.

Search Core Results

Results Display/Views

Results Per Page

String (Integer)

The maximum number of results to display per page.

Sentences in Summary

String (Integer)

The maximum number of sentences to include in the summary.

Highest Result Page

String (Integer)

The maximum number of results pages allowed.

Default Results View

Choice

Options:

  • Relevance

  • Modified Date

Display Discovered Definition

Boolean

Indicates whether discovered definitions are displayed in the search results.

Results Query Options

Remove Duplicate Results

Boolean

Indicates that duplicate results should be merged by the query engine.

Enable Search Term Stemming

Boolean

Indicates that the query engine should perform stemming. This is different from wildcarding.

Permit Noise Word Queries

Boolean

Indicates that the query engine should allow words which are in the noise words file and thus should not be included in the index.

Selected Columns

String (XML)

An XML fragment containing all of the managed properties to return in the search results.

Cross-Web Part query ID

Choice

Options:

  • User query

  • Query 2

  • Query 3

  • Query 4

  • Query 5

Fixed Keyword Query

Fixed Keyword Query

String

A set of words to query on instead of using the query string.

More Results Link Text Label

String

The label for the more results link.

More Results Link Target Results Page URL

String (URL)

The URL for the page that will contain more results.

Miscellaneous

Scope

String

The hard-coded scope to use for the query

Show Messages

Boolean

Indicates that the Web Part should show error messages if an error occurs.

Show Search Results

Boolean

Determines whether results are displayed. Because this Web Part is necessary for others, this option allows the output to be disabled so other Web Parts can use the results.

Show Action Links

Boolean

Indicates whether to display the action links.

Display "Relevance" View Option

Boolean

Indicates whether the users should be allowed to sort by relevance.

Display "Modified Date" View Option

Boolean

Indicates whether the users should be allowed to sort by modified date.

Display "Alert Me" Link

Boolean

Indicates that the Alert Me link should be displayed so that users can choose a search alert on this query.

Display "RSS" Link

Boolean

Indicates that the RSS link should be displayed so that the users could get an RSS feed for this query.

Sample Data

String (XML)

This has no effect.

XSL Link

String (URL)

The URL to get the XSLT from to transform the results.

Enable Data View Caching

Boolean

Enables caching for the Web Part.

Data View Caching Time-out (seconds)

String (Integer)

The default, 86400, is 24 hours. This might need to be shortened if you frequently change best bets.

Send first row to connected Web Parts when page loads

Boolean

Some connected Web Parts need the first row of data (the schema) when they are first connected. By clearing this option, the Web Part can perform slightly faster.

Data View Properties

XSL Editor

Multi-Line

The XSLT to use to transform the results if included.

Search Paging

Results Paging

Maximum page links before current

String (Integer)

The maximum number of links to display before the current page. The first page link is always displayed.

Maximum page links after current

String (Integer)

The maximum number of links to display after the current page. The last page link is always displayed.

Previous link text label

String

The text of the previous link.

Previous link image URL

String (URL)

The URL for the previous link. If specified, this is used instead of text.

Next link text label

String

The text of the next link.

Next link image URL

String (URL)

The URL for the next link. If specified, this is used instead of the text.

Miscellaneous

Cross-Web Part query ID

Choice

Options:

  • User query

  • Query 2

  • Query 3

  • Query 4

  • Query 5

Federated Results

Location Properties

Location

Choice

Options:

  • None

  • Internet Search Results

  • Internet Search Suggestions

  • Local Search Results

Description

String

A read-only description of the option selected in Location.

Display Properties

Results per Page

String (Integer)

The maximum number of results per page.

Limit Characters in Summary

Boolean

Indicates whether to limit the summary length.

Characters in Summary

String (Integer)

The maximum number of characters to display (if enabled).

Limit Characters in URL

Boolean

Indicates whether the URL length (as displayed) should be limited.

Characters in URL

String (Integer)

The maximum number of characters in the URL to display.

Use Location Visualization

Boolean

Selects whether the link to the XSLT or the embedded XSLT is used.

Fetched Properties

String (XML)

A list of the properties to return.

XSL Editor

Multi-Line

The XSLT to transform the results.

Parameters Editor

Popup

Retrieve Results Asynchronously

Boolean

Indicates that the Web Part should return immediately so that the page can be displayed, and then show the results when they are available.

Show Loading Image

Boolean

Indicates that a loading image should be displayed while results are being returned.

Loading Image URL

String (URL)

The URL for the loading image to display while results are being returned.

Results Query Options

Remove Duplicate Results

Boolean

Indicates that duplicate results should be merged by the query engine.

Enable Search Term Stemming

Boolean

Indicates that the query engine should perform stemming. Stemming is different from wildcard matching.

Ignore Noise Words

Boolean

Indicates that the query engine should allow queries that contain words from the noise words file. These words are not included in the index.

Fixed Keyword Query

String

The keywords to use for the search instead of the values specified in the query string.

Append Text to Query

String

Additional query terms to append to the terms provided in the query string.

More Results Link Options

Show More Results Link

Boolean

Indicates that the Show more results link should be displayed.

More results Link Text Label

String

The text of the more results label.

Miscellaneous

Show Messages

Boolean

Indicates that the Web Part should display error messages if an error occurs.

Sample Data

String (XML)

This setting has no effect.

XSL Link

String (URL)

A link to the XSLT to use to transform the results.

Enable Data View Caching

Boolean

Enables caching for the Web Part.

Data View Caching Time-out (seconds)

String (Integer)

The default, 86400, is 24 hours. This might need to be shortened if you frequently change best bets.

Send first row to connected Web Parts when page loads

Boolean

Some connected Web Parts need the first row of data (the schema) when they are first connected. By clearing this option, the Web Part can perform slightly faster.

This list is, of course, extensive. You can accomplish many user interface goals by manipulating these properties. However, sometimes the challenge is not in displaying the results; it is in creating user-friendly search query pages where users can express their search criteria in ways that make sense to them. The easiest way to customize the query experience is to use the Search Core Results Web Part to process any query that is encoded in the query string.

Using Search Core Results and the Query String

Although you can create your Web Parts to use the SharePoint API to return a set of values, this process is not always necessary. Sometimes it is possible to encode the values for the search on the query string and use the Search Core Results Web Part to perform the actual searching. This is a quick and easy way to get a better search experience for the user without writing much code.

Before getting to the actual parameters, it is important to review the fact that when the user enters a query, he or she can specify a managed property or metadata search by including the name of the property, a colon, and the value to look for. This process works because the Search Core Results Web Part processes the properties as a part of the keywords coming across in the query string. In fact, one of the best ways to see how the Search Core Results Web Part responds to query parameters is to do tests from the user interface with various settings.

Table 2 shows the basic parameters that the Search Core Results Web Part will process.

Table 2. Search Core Results Query string parameters

Parameter    Description
K            Search keywords. This should be the URL-encoded value that the user typed in, including the full-text query and any managed properties.
S            The scope to query. This is the scope from the scopes drop-down list. To specify multiple scopes to query, separate them with a comma (which becomes %2c when URL-encoded).
U            The beginning of the site property. This is used primarily with the contextual scopes, but can be useful if you want to constrain the results to a particular site or area, and do not want to define a scope.
V            The default ordering (or view). The valid options are relevance and date.
Start        The page number to show. This option is useful if you want to display the initial set of results and turn the display over to Search Core Results and a search results page after the first set of results.

A common request is for a user interface for searching properties that is better than the advanced search page. (To learn how to update the advanced search page, see Managing Enterprise Metadata with Content Types.) You can accomplish this by creating a Web Part that displays the managed properties, including any drop-down lists or other advanced controls, and then redirects the user to a search results page with the correct query string parameters.
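
The redirect step can be sketched as follows. This is a minimal sketch under stated assumptions, not code from the built-in Web Parts: the SearchUrlBuilder class, the results page path used in the usage note, and the owningdepartment property are hypothetical. It also assumes that the parameter names are written in lowercase (k, s, v), as they typically appear in results page URLs, that the results page URL does not already contain a query string, and that a property value containing spaces can be wrapped in double quotation marks.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper: builds a search results URL from form input.
public static class SearchUrlBuilder
{
    // Combines free text and "property:value" restrictions into one keyword string.
    public static string BuildKeywords(string freeText, IDictionary<string, string> properties)
    {
        List<string> parts = new List<string>();
        if (!string.IsNullOrEmpty(freeText))
        {
            parts.Add(freeText.Trim());
        }
        foreach (KeyValuePair<string, string> property in properties)
        {
            if (string.IsNullOrEmpty(property.Value))
            {
                continue; // Skip properties that the user left empty.
            }
            // Assumption: values with spaces are quoted so they stay one term.
            string value = property.Value.IndexOf(' ') >= 0
                ? "\"" + property.Value + "\""
                : property.Value;
            parts.Add(property.Key + ":" + value);
        }
        return string.Join(" ", parts.ToArray());
    }

    // Appends the URL-encoded k, s, and v parameters to the results page URL.
    public static string BuildResultsUrl(string resultsPage, string keywords, string scope, string view)
    {
        StringBuilder url = new StringBuilder(resultsPage);
        url.Append("?k=").Append(Uri.EscapeDataString(keywords));
        if (!string.IsNullOrEmpty(scope))
        {
            url.Append("&s=").Append(Uri.EscapeDataString(scope));
        }
        if (!string.IsNullOrEmpty(view))
        {
            url.Append("&v=").Append(Uri.EscapeDataString(view));
        }
        return url.ToString();
    }
}
```

In a Web Part button handler, you might build the keywords from your text box and drop-down list values, call BuildResultsUrl with a path such as /SearchCenter/Pages/results.aspx, and pass the result to Page.Response.Redirect.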

Customizing the Search Results XSLT

The Search Core Results Web Part uses a built-in XSLT template, which converts the raw data that is returned from the query into HTML. The basic format of the raw results is relatively unchanged by different configurations. It is influenced only by the Selected Columns property in the Results Query Options category. You can see the actual output of the query by going to the Data View Properties section and clicking XSL Editor. In the box that appears, replace all of the text with the following listing.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >
  <xsl:output method="xml" version="1.0" encoding="UTF-8" 
     indent="yes" />
  <xsl:template match="/">
  <xmp><xsl:copy-of select="*" /></xmp>  
  </xsl:template>
</xsl:stylesheet>

This XSLT outputs exactly what was provided in the source data, so that you can see what was returned from the search API. You can copy this output from the browser into your favorite XSL editing tool and use it as sample data to test your transformation. It is beyond the scope of this article to show you how to write XSLT transformations; however, I can demonstrate a simple way to modify the built-in XSLT to include additional properties.

The default template uses the template named "Result" to process each result. Within this template there are three basic sections: Title, Description, and Metadata. The final section of the results, the metadata section, is where I will focus your attention. This section is responsible for displaying the size, author, and last-modified date. It does this by calling two XSLT templates: DisplaySize for the size and DisplayString for the author and last-modified properties. The DisplayString template looks like the following in the default XSLT.

<xsl:template name="DisplayString">
  <xsl:param name="str" />
  <xsl:if test='string-length($str) &gt; 0'>   
   - 
   <xsl:value-of select="$str" /> 
  </xsl:if>
</xsl:template>

Next, you enhance this template so that it can add a label to the string. The label matters because properties that are null are not output, so as you add more properties to the metadata section, unlabeled values become hard to tell apart. A modified template appears as follows.

<xsl:template name="DisplayString">
  <xsl:param name="str" />
  <xsl:param name="label" />
  <xsl:if test='string-length($str) &gt; 0'>   
   - 
     <xsl:value-of select="$label" />
     <xsl:text disable-output-escaping="yes">&amp;nbsp;</xsl:text><xsl:value-of select="$str" /> 
  </xsl:if>
</xsl:template>

This displays the label followed by a nonbreaking space and then the value. The only remaining action is to add more metadata items to the metadata section of the "Result" template. This section begins with <p class="srch-metadata">. By adding calls to the template, you can add metadata to the search results. For example, the following code adds the content type name (without a label) and a custom managed property named owningdepartment (with a label) to the display.

     <xsl:call-template name="DisplayString">
        <xsl:with-param name="str" select="contenttype"  /> 
     </xsl:call-template>
     <xsl:call-template name="DisplayString">
      <xsl:with-param name="str" select="owningdepartment"/>
      <xsl:with-param name="label" select="'Owning Department'" />
     </xsl:call-template>

The final step is to add the columns to the result. In the Results Query Options category, edit the Selected Columns property and add the following two columns before the closing </Columns> tag.

<Column Name="contenttype" />

<Column Name="owningdepartment" />

Figure 2 shows the result of these changes to the XSLT.

Figure 2. Modified search results

Modified search results
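
One way to test a stylesheet against captured results outside the browser is a small harness such as the following sketch. It is not part of the article's sample: the XsltPreview class and its Transform method are hypothetical names, and the code relies only on the standard System.Xml.Xsl classes in the .NET Framework. Any XSLT parameters that the Web Part would normally supply are not set here, so templates that depend on them may render differently.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

// Hypothetical helper: applies a stylesheet to sample results XML.
public static class XsltPreview
{
    // Returns the output of running the stylesheet over the results XML.
    public static string Transform(string resultsXml, string stylesheet)
    {
        XslCompiledTransform xslt = new XslCompiledTransform();

        // Compile the stylesheet from its text.
        using (XmlReader xslReader = XmlReader.Create(new StringReader(stylesheet)))
        {
            xslt.Load(xslReader);
        }

        // Run the transformation with no XSLT arguments and capture the output.
        using (XmlReader input = XmlReader.Create(new StringReader(resultsXml)))
        using (StringWriter output = new StringWriter())
        {
            xslt.Transform(input, null, output);
            return output.ToString();
        }
    }
}
```

You could pass the XML captured with the pass-through stylesheet as resultsXml and your modified stylesheet as stylesheet, and then inspect the returned markup before you paste the stylesheet into the XSL Editor.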

Developing for Search

Despite the flexibility of the built-in search components, there are times when you need to develop your own search components. There are two approaches to searching SharePoint sites. The first approach is to use the Query Web service. The second approach, described here, is to use the Query object model API. In most cases, you will use the Query object model API because it supports the SQL full-text query syntax and eliminates the issues related to authenticating the Web service request.

Even within the object model, there are two basic types of search that you can do with the SharePoint API. The first is the keyword query and the second is the full-text SQL query. There are two main differences between them. First, the query text for the full-text SQL query is a complete SQL statement that specifies both the return values and the criteria, whereas the keyword query text specifies only the query criteria. Second, the keyword query requires that you specify any properties you want to bring back by using the SelectProperties property on the KeywordQuery object.

The following code shows a simple Web Part that will perform either a keyword or a full-text SQL query and display the results in an SPGridView control.

using System;
using System.Data;
using System.Runtime.InteropServices;
using System.Web.UI;
using System.Web.UI.WebControls;
using System.Web.UI.WebControls.WebParts;
using System.Xml.Serialization;

using Microsoft.SharePoint;
using Microsoft.SharePoint.WebControls;
using Microsoft.SharePoint.WebPartPages;
using Microsoft.Office.Server;
using Microsoft.Office.Server.Search.Query;

namespace InSSearch
{
    [Guid("fb178c07-14c3-444a-a184-0541a3ba71ec")]
    public class InSSearch : System.Web.UI.WebControls.WebParts.WebPart
    {
        public enum QueryType
        {
            Keyword = 0,
            SQL = 1
        }

        public InSSearch()
        {
        }

        protected TextBox txtSearchBox = new TextBox();
        protected Button btnSearch = new Button();
        protected SPGridView grdResults = new SPGridView();
        protected RadioButtonList rblSearchType = new RadioButtonList();

        protected override void OnInit(EventArgs e)
        {
            EnsureChildControls();
            base.OnInit(e);
        }

        protected override void CreateChildControls()
        {
            base.CreateChildControls();

            rblSearchType.Items.Add(new ListItem("Keyword Search", "Keyword"));
            rblSearchType.Items.Add(new ListItem("SQL Search", "SQL"));
            rblSearchType.SelectedIndex = 0;

            btnSearch.Text = "Search";
            btnSearch.Click += new EventHandler(btnSearch_Click);

            Controls.Add(new LiteralControl("Search for ")); 
            Controls.Add(txtSearchBox); 
            Controls.Add(btnSearch);
            Controls.Add(rblSearchType);
            Controls.Add(new LiteralControl("<BR/>"));
        }

        void btnSearch_Click(object sender, EventArgs e)
        {
            QueryType queryType = (QueryType)Enum.Parse(
                typeof(QueryType), rblSearchType.SelectedValue);

            string searchString = txtSearchBox.Text;
            using (Query query = GetQuery(queryType, searchString))
            {
                
                query.StartRow = 0;
                query.RowLimit = 200;
                query.TrimDuplicates = true;
                query.ResultTypes = ResultType.RelevantResults;

                ResultTableCollection results = null;
                ResultTable relevantResults = null;
                try
                {
                    results = query.Execute();
                    grdResults.AutoGenerateColumns = false;
                    BoundField bfTitle = new BoundField();
                    bfTitle.DataField = "Title";
                    bfTitle.HeaderText = "Title";
                    BoundField bfAuthor = new BoundField();
                    bfAuthor.DataField = "Author";
                    bfAuthor.HeaderText = "Author";
                    BoundField bfSiteName = new BoundField();
                    bfSiteName.DataField = "SiteName";
                    bfSiteName.HeaderText = "Site Name";
                    BoundField bfPath = new BoundField();
                    bfPath.DataField = "Path";
                    bfPath.HeaderText = "Url";

                    grdResults.Columns.Add(bfTitle); grdResults.Columns.Add(bfAuthor); 
                    grdResults.Columns.Add(bfSiteName); grdResults.Columns.Add(bfPath);
                    relevantResults = results[ResultType.RelevantResults];
                    DataTable tbl = new DataTable();
                    tbl.Load(relevantResults, LoadOption.OverwriteChanges);
                    grdResults.DataSource = tbl;
                    grdResults.DataBind();

                    Controls.Add(grdResults);

                }
                catch (Exception excpt)
                {
                    Controls.Add(new LiteralControl(excpt.ToString()));
                }
            }
        }

        public Query GetQuery(QueryType qt, string searchTerms)
        {
            switch (qt)
            {
                case QueryType.Keyword: return (GetKeywordQuery(searchTerms));
                case QueryType.SQL: return (GetSQLQuery(searchTerms));
                default:
                    return null;
            }
        }

        public Query GetKeywordQuery(string searchTerms)
        {
            KeywordQuery keywordQuery = null;
            try
            {
                keywordQuery = new KeywordQuery(ServerContext.Current);
                keywordQuery.QueryText = searchTerms;
                keywordQuery.SelectProperties.Add("title");
                keywordQuery.SelectProperties.Add("author");
                keywordQuery.SelectProperties.Add("sitename");
                keywordQuery.SelectProperties.Add("path");
            }
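            catch
            {
                // Dispose the partially configured query, then rethrow so the
                // caller sees the real error instead of receiving null.
                if (keywordQuery != null) { keywordQuery.Dispose(); }
                throw;
            }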

            return keywordQuery;
        }

        public Query GetSQLQuery(string searchTerms)
        {
            FullTextSqlQuery ftQuery = null;
            const string qryText = "SELECT title, author, sitename, path " +
                             "FROM Scope() " +
                             "WHERE CONTAINS('\"{0}\"')";
            try
            {
                ftQuery = new FullTextSqlQuery(ServerContext.Current);
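                // Double single quotes and drop double quotes so that user input
                // cannot break out of the quoted phrase in the CONTAINS predicate.
                string safeTerms = searchTerms.Replace("'", "''").Replace("\"", "");
                string fullQuery = string.Format(qryText, safeTerms);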
                ftQuery.QueryText = fullQuery;
            }
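            catch
            {
                // Dispose the partially configured query, then rethrow so the
                // caller sees the real error instead of receiving null.
                if (ftQuery != null) { ftQuery.Dispose(); }
                throw;
            }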

            return ftQuery;
        }
    }
}

SharePoint Server records the queries that are executed on the platform, but it records only queries that are sent to the Search Core Results Web Part. SharePoint Server does not track searches that are implemented in your code.

In the code, notice that there is a set of using and try/catch blocks to ensure that the Query object is disposed. Query, like SPSite and SPWeb, is a disposable object that you must dispose of correctly.

Conclusion

SharePoint Server search is a powerful platform that uses replaceable components to enable it to expand to new content sources and file types. You can use it easily by taking advantage of built-in components and by adapting the configuration. You can also extend it by creating your own Web Parts that drive the Search Core Results Web Part through query string parameters, or you can write search Web Parts that perform searching through the rich API.

Additional Resources

For more information, see the following resources: