Export (0) Print
Expand All
9 out of 11 rated this helpful - Rate this topic

Full-Text Search Overview

Full-text queries perform linguistic searches against text data in full-text indexes by operating on words and phrases based on rules of a particular language such as English or Japanese. Full-text queries can include simple words and phrases or multiple forms of a word or phrase.

Full-text search is applicable in a wide range of business scenarios such as e-businesses—searching for items on a web site; law firms—searching for case histories in a legal-data repository; or human resources departments—matching job descriptions with stored resumes. The basic administrative and development tasks of full-text search are equivalent regardless of business scenarios. However, in a given business scenario, full-text index and queries can be honed to meet business goals. For example, for an e-business maximizing performance might be more important than ranking of results, recall accuracy (how many of the existing matches are actually returned by a full-text query), or supporting multiple languages. For a law firm, returning every possible hit (total recall of information) might be the most important consideration.

For any scenario, a database administrator performs the following basic steps to configure table columns in a database for full-text search:

  1. Create a full-text catalog.

  2. On each table that you want to search, create a full-text index by:

    1. Identify each text columns that you want to include in the full-text index.

    2. If a given column contains documents stored as binary data (varbinary(max), or image data), you must specify a table column (the type column) that identifies the type of each document in the column being indexed.

    3. Specify the language that you want full-text search to use on the documents in the column.

    4. Choose the change-tracking mechanism that you want to use on the full-text index to track changes in the base table and its columns.

Full-text search supports multiple languages through the use of the following linguistic components: word breakers and stemmers, stoplists that contain stopwords (also known as noise words), and thesaurus files. Thesaurus files and, in some cases, stoplists require configuration by a database administrator. A given thesaurus file supports all full-text indexes that use the corresponding language, and a given stoplist can be associated with as many full-text indexes as you want.

For more information, see Administering Full-Text Search.

After the columns have been added to a full-text index, applications and users can run full-text queries on the text in the columns. These queries can search for any of the following:

  • One or more specific words or phrases (simple term)

  • A word or a phrase where the words begin with specified text (prefix term)

  • Inflectional forms of a specific word (generation term)

  • A word or phrase close to another word or phrase (proximity term)

  • Synonymous forms of a specific word (thesaurus)

  • Words or phrases using weighted values (weighted term)

Full-text queries all use a small set of Transact-SQL predicates (CONTAINS and FREETEXT) and functions (CONTAINSTABLE and FREETEXTTABLE). However, the search goals of a given business scenario influence the exact structure of the full-text queries; for example:

  • e-business—searching for a product on a website:

    SELECT product_id FROM products WHERE CONTAINS(product_description, ”Snap Happy 100EZ” OR FORMSOF(THESAURUS,’Snap Happy’) OR ‘100EZ’) AND product_cost<200 …
    
  • Recruitment scenario—searching for job candidates that have experience working with SQL Server:

    SELECT candidate_name,SSN FROM candidates WHERE CONTAINS(candidate_resume,”SQL Server”) AND candidate_division =DBA
    

For more information, see Querying SQL Server Using Full-Text Search.

In contrast to full-text search, the LIKE Transact-SQL predicate works on character patterns only. Also, you cannot use the LIKE predicate to query formatted binary data. Furthermore, a LIKE query against a large amount of unstructured text data is much slower than an equivalent full-text query against the same data. A LIKE query against millions of rows of text data can take minutes to return; whereas a full-text query can take only seconds or less against the same data, depending on the number of rows that are returned.

Did you find this helpful?
(1500 characters remaining)
Thank you for your feedback

Community Additions

ADD
Show:
© 2014 Microsoft. All rights reserved.