Full-Text Search Fundamentals

This topic briefly describes the components, processes, and terminology associated with Full-Text Search. Full-Text Search shares many terms with Microsoft SQL Server, but some terms, such as "crawl" and "token," are specific to Full-Text Search.

Full-Text Search Terminology

Here is a list of terms and components that you need to be familiar with when using Full-Text Search.

  • Full-text index
    Stores information about significant words and their location within a given column. This information is used to quickly compute full-text queries that search for rows with particular words or combinations of words. For more information, see Full-Text Indexes.
  • Full-text catalog
    A full-text catalog contains zero or more full-text indexes. Full-text catalogs must reside on a local hard drive associated with the instance of SQL Server. Each catalog can serve the indexing needs of one or more tables within a database. Full-text catalogs cannot be stored on removable drives, floppy disks, or network drives, except when you attach a read-only database that contains a full-text catalog.
  • Word breaker
    For a given language, a word breaker tokenizes text based on the lexical rules of the language. For more information, see Word Breakers and Stemmers.
  • Token
    Is a word or a character string identified by the word breaker.
  • Stemmer
    For a given language, a stemmer generates inflectional forms of a particular word based on the rules of that language. Stemmers are language specific. For more information, see Word Breakers and Stemmers.
  • Filter
    Given a specified file type, for example .doc, filters extract text from a file stored in a varbinary(max) or image column. For more information, see Full-Text Search Filters.
  • Population or Crawl
    Is the process of creating and maintaining a full-text index. For more information, see Full-Text Index Structure.
  • Noise words
    Are frequently occurring words that do not help the search. For example, for the English locale, words such as "a", "and", "is", and "the" are considered noise words. These words are ignored to prevent the full-text index from becoming bloated. For more information, see Noise Words.
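
To make the relationship between several of these terms concrete, the following Python sketch models a word breaker, tokens, noise words, population, and a full-text index in a few lines. It is a conceptual toy only and does not reflect how SQL Server implements Full-Text Search. The names word_breaker, populate, search, and NOISE_WORDS are hypothetical, and the regular-expression tokenizer and the four-word noise list are simplifying assumptions. Stemmers and filters are left out.

```python
import re
from collections import defaultdict

# Hypothetical noise-word list, taken from the English examples above.
NOISE_WORDS = {"a", "and", "is", "the"}


def word_breaker(text):
    """Toy word breaker: yield (position, token) pairs for one text value.

    Assumption: a token is any run of ASCII letters or digits. Real word
    breakers apply the lexical rules of a specific language.
    """
    for position, match in enumerate(re.finditer(r"[A-Za-z0-9]+", text)):
        yield position, match.group().lower()


def populate(rows):
    """Toy population (crawl): build token -> {row key: [positions]}."""
    index = defaultdict(dict)
    for row_key, text in rows.items():
        for position, token in word_breaker(text):
            if token in NOISE_WORDS:
                continue  # noise words are not stored in the index
            index[token].setdefault(row_key, []).append(position)
    return index


def search(index, word):
    """Return the sorted keys of rows whose text contains the word."""
    return sorted(index.get(word.lower(), {}))


# Two rows of a hypothetical text column, keyed by row key.
rows = {
    1: "The crawl is a population of the index",
    2: "A token is a word and a string",
}
index = populate(rows)

print(search(index, "token"))  # [2]
print(search(index, "the"))    # [] because "the" is a noise word
print(index["index"])          # {1: [7]}
```

The index maps each significant word to the rows and word positions where it occurs, which is why a query for a word becomes a single lookup rather than a scan of every row.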

Note

Full-text indexing is fully supported in a Microsoft Windows failover cluster environment.

See Also

Concepts

Introduction to Full-Text Search

Other Resources

CREATE FULLTEXT INDEX (Transact-SQL)
