CONTAINS (Transact-SQL)

Is a predicate used to search columns containing character-based data types for precise or fuzzy (less precise) matches to single words and phrases, the proximity of words within a certain distance of one another, or weighted matches.

In SQL Server 2005, you can use four-part names in CONTAINS or FREETEXT full-text predicates to execute queries against linked servers.

CONTAINS can search for:

  • A word or phrase.
  • The prefix of a word or phrase.
  • A word near another word.
  • A word inflectionally generated from another (for example, the word drive is the inflectional stem of drives, drove, driving, and driven).
  • A word that is a synonym of another word using thesaurus (for example, the word metal can have synonyms such as aluminum and steel).

Topic link iconTransact-SQL Syntax Conventions

Syntax

CONTAINS
      ( { column_name | (column_list) | * } 
          , '< contains_search_condition >'     
   [ , LANGUAGE language_term ]
      ) 
< contains_search_condition > ::= 
    { < simple_term > 
    | < prefix_term > 
    | < generation_term > 
    | < proximity_term > 
    | < weighted_term > 
    } 
    | { ( < contains_search_condition > ) 
    [ { < AND > | < AND NOT > | < OR > } ] 
    < contains_search_condition > [ ...n ] 
    } 
< simple_term > ::= 
          word | " phrase "
< prefix term > ::= 
     { "word * " | "phrase *" }
< generation_term > ::= 
     FORMSOF ( { INFLECTIONAL | THESAURUS } , < simple_term > [ ,...n ] ) 
< proximity_term > ::= 
     { < simple_term > | < prefix_term > } 
     { { NEAR | ~ }
     { < simple_term > | < prefix_term > } 
     } [ ...n ] 
< weighted_term > ::= 
     ISABOUT 
        ( { { 
  < simple_term > 
  | < prefix_term > 
  | < generation_term > 
  | < proximity_term > 
  } 
   [ WEIGHT ( weight_value ) ] 
   } [ ,...n ] 
        ) 
< AND > ::= 
     { AND | & }
< AND NOT > ::= 
     { AND NOT | & !}
< OR > ::= 
     { OR | | }

Arguments

  • column_name
    Is the name of the column or columns included in the full-text index. Columns of type char, varchar, nchar, nvarchar, text, ntext, image, xml, and varbinary(max) are valid columns for full-text searching.
  • column_list
    Indicates that several columns, separated by a comma, can be specified. column_list must be enclosed in parentheses. Unless language_term is specified, the language of all columns of column_list must be the same.
  • *
    Specifies that all columns in the table registered for full-text searching should be used to search for the given contains search condition. The columns in the CONTAINS clause must come from a single table. If more than one table is in the FROM clause, * must be qualified by the table name. Unless language_term is specified, the language of all columns of the table must be the same.
  • LANGUAGE language_term
    Is the language in which the user is issuing the query. If the column contains several documents stored as binary large objects (BLOBs), the language used to index the actual document content is determined by the locale identifier (LCID) of the document. Consequently, a single column can be storing multiple languages. The LANGUAGE parameter allows users to specify the language in which they are searching, thus increasing the probability of a good match.

    This parameter is optional and can be specified as a string, integer, or hexadecimal value corresponding to the LCID of a language. If language_term is specified, the language it represents will be applied to all elements of the search condition. If no value is specified, the column full-text language is used.

    When specified as a string, language_term corresponds to the alias column value in the syslanguages system table. The string must be enclosed in single quotation marks, as in 'language_term'. When specified as an integer, language_term is the actual LCID that identifies the language. When specified as a hexadecimal value, language_term is 0x followed by the hexadecimal value of the LCID. The hexadecimal value must not exceed eight digits, including leading zeros.

    If the value is in double-byte character set (DBCS) format, SQL Server will convert it to Unicode.

    If the language specified is not valid or there are no resources installed that correspond to that language, SQL Server returns an error. To use the neutral language resources, specify 0x0 as language_term.

  • <contains_search_condition>
    Specifies the text to search for in column_name and the conditions for a match.

    contains_search_condition is nvarchar. An implicit conversion occurs when another character data type is used as input. In the following example, the @SearchWord variable, which is defined as varchar(30), causes an implicit conversion in the CONTAINS predicate.

    USE AdventureWorks;
    GO
    DECLARE @SearchWord varchar(30)
    SET @SearchWord ='performance'
    SELECT Description 
    FROM Production.ProductDescription 
    WHERE CONTAINS(Description, @SearchWord);
    

    Because "parameter sniffing" does not work across conversion, use nvarchar for better performance. In the example, declare @SearchWord as nvarchar(30).

    USE AdventureWorks;
    GO
    DECLARE @SearchWord nvarchar(30)
    SET @SearchWord = N'performance'
    SELECT Description 
    FROM Production.ProductDescription 
    WHERE CONTAINS(Description, @SearchWord);
    

    You can also use the OPTIMIZE FOR query hint for cases in which a non optimal plan is generated.

  • word
    Is a string of characters without spaces or punctuation.
  • phrase
    Is one or more words with spaces between each word.

    Note

    Some languages, such as those written in some parts of Asia, can have phrases that consist of one or more words without spaces between them.

  • <simple_term>
    Specifies a match for an exact word or a phrase. Examples of valid simple terms are "blue berry", blueberry, and "Microsoft SQL Server". Phrases should be enclosed in double quotation marks (""). Words in a phrase must appear in the same order as specified in <contains_search_condition> as they appear in the database column. The search for characters in the word or phrase is not case sensitive. Noise words (such as a, and, or the) in full-text indexed columns are not stored in the full-text index. If a noise word is used in a single word search, SQL Server returns an error message indicating that the query contains only noise words. SQL Server includes a standard list of noise words in the directory \Mssql\Binn\FTERef of each instance of SQL Server.

    Punctuation is ignored. Therefore, CONTAINS(testing, "computer failure") matches a row with the value, "Where is my computer? Failure to find it would be expensive." For more information on word-breaker behavior, see Word Breakers and Stemmers.

  • <prefix_term>
    Specifies a match of words or phrases beginning with the specified text. Enclose a prefix term in double quotation marks ("") and add an asterisk (*) before the ending quotation mark, so that all text starting with the simple term specified before the asterisk is matched. The clause should be specified this way: CONTAINS (column, '"text*"'). The asterisk matches zero, one, or more characters (of the root word or words in the word or phrase). If the text and asterisk are not delimited by double quotation marks, so the predicate reads CONTAINS (column, 'text*'), full-text search considers the asterisk as a character and searches for exact matches to text*. The full-text engine will not find words with the asterisk (*) character because word breakers typically ignore such characters.

    When <prefix_term> is a phrase, each word contained in the phrase is considered to be a separate prefix. Therefore, a query specifying a prefix term of "local wine*" matches any rows with the text of "local winery", "locally wined and dined", and so on.

  • <generation_term>
    Specifies a match of words when the included simple terms include variants of the original word for which to search.
  • INFLECTIONAL
    Specifies that the language-dependent stemmer is to be used on the specified simple term. Stemmer behavior is defined based on stemming rules of each specific language. The neutral language does not have an associated stemmer. The column language of the column(s) being queried is used to refer to the desired stemmer. If language_term is specified, the stemmer corresponding to that language is used.

    A given <simple_term> within a <generation_term> will not match both nouns and verbs.

  • THESAURUS
    Specifies that the thesaurus corresponding to the column full-text language, or the language specified in the query is used. The longest pattern or patterns from the simple_term are matched against the thesaurus and additional terms are generated to expand or replace the original pattern. If a match is not found for all or part of the simple_term, the non-matching portion is treated as a simple_term. For more information on the Full-Text Search thesaurus, see Thesaurus.
  • <proximity_term>
    Specifies a match of words or phrases that must be close to one another. <proximity_term> operates similarly to the AND operator: both require that more than one word or phrase exist in the column being searched. As the words in <proximity_term> appear closer together, the better the match.

    • NEAR | ~
      Indicates that the word or phrase on the left side of the NEAR or ~ operator should be approximately close to the word or phrase on the right side of the NEAR or ~ operator. Multiple proximity terms can be chained, for example:

      a NEAR b NEAR c 
      

      This means that word or phrase a should be near word or phrase b, which should be near word or phrase c. When several proximity terms are chained, all proximity terms must be near each other. So, in the example a ~ b ~ c, word or phrase a should also be near word or phrase c.

  • <weighted_term>
    Specifies that the matching rows (returned by the query) match a list of words and phrases, each optionally given a weighting value.
  • ISABOUT
    Specifies the <weighted_term> keyword.

    • WEIGHT(weight_value)
      Specifies a weight value, which is a number from 0.0 through 1.0. Each component in <weighted_term> may include a weight_value. weight_value is a way to change how various portions of a query affect the rank value assigned to each row matching the query. WEIGHT does not affect the results of CONTAINS queries, but WEIGHT impacts rank in CONTAINSTABLE queries. For more information, see CONTAINSTABLE (Transact-SQL).

      Note

      The decimal separator is always a period regardless of the operating system locale.

  • { AND | & } | { AND NOT | &! } | { OR | | }
    Specifies a logical operation between two contains search conditions.

    • AND | &
      Indicates that the two contains search conditions must be met for a match. The ampersand symbol (&) may be used instead of the AND keyword to represent the AND operator.
    • AND NOT | &!
      Indicates that the second search condition must not be present for a match. The ampersand followed by the exclamation mark symbol (&!) may be used instead of the AND NOT keyword to represent the AND NOT operator.
    • OR | |
      Indicates that either of the two contains search conditions must be met for a match. The bar symbol (|) may be used instead of the OR keyword to represent the OR operator.

      When <contains_search_condition> contains parenthesized groups, these parenthesized groups are evaluated first. After evaluating parenthesized groups, these rules apply when using these logical operators with contains search conditions:

      • NOT is applied before AND.
      • NOT can only occur after AND, as in AND NOT. The OR NOT operator is not allowed. NOT cannot be specified before the first term. For example, CONTAINS (mycolumn, 'NOT "phrase_to_search_for" ' ) is not valid.
      • AND is applied before OR.
      • Boolean operators of the same type (AND, OR) are associative and can therefore be applied in any order.
      • n
        Is a placeholder indicating that multiple contains search conditions and terms within them can be specified.

Remarks

CONTAINS is not recognized as a keyword if the compatibility level is less than 70. For more information, see sp_dbcmptlevel (Transact-SQL).

Examples

A. Using CONTAINS with <simple_term>

The following example finds all products with a price of $80.99 that contain the word "Mountain".

USE AdventureWorks;
GO
SELECT Name, ListPrice
FROM Production.Product
WHERE ListPrice = 80.99
   AND CONTAINS(Name, 'Mountain');
GO

B. Using CONTAINS and phrase in <simple_term>

The following example returns all products that contain either the phrase "Mountain" or "Road".

USE AdventureWorks;
GO
SELECT Name
FROM Production.Product
WHERE CONTAINS(Name, ' "Mountain" OR "Road" ')
GO

C. Using CONTAINS with <prefix_term>

The following example returns all product names with at least one word starting with the prefix chain in the Name column.

USE AdventureWorks;
GO
SELECT Name
FROM Production.Product
WHERE CONTAINS(Name, ' "Chain*" ');
GO

D. Using CONTAINS and OR with <prefix_term>

The following example returns all category descriptions containing strings with prefixes of either "chain" or "full".

USE AdventureWorks;
GO
SELECT Name
FROM Production.Product
WHERE CONTAINS(Name, '"chain*" OR "full*"');
GO

E. Using CONTAINS with <proximity_term>

The following example returns all product names that have the word bike near the word performance.

USE AdventureWorks;
GO
SELECT Description
FROM Production.ProductDescription
WHERE CONTAINS(Description, 'bike NEAR performance');
GO

F. Using CONTAINS with <generation_term>

The following example searches for all products with words of the form ride: riding, ridden, and so on.

USE AdventureWorks;
GO
SELECT Description
FROM Production.ProductDescription
WHERE CONTAINS(Description, ' FORMSOF (INFLECTIONAL, ride) ');
GO

G. Using CONTAINS with <weighted_term>

The following example searches for all product names containing the words performance, comfortable, or smooth, and different weightings are given to each word.

USE AdventureWorks;
GO
SELECT Description
FROM Production.ProductDescription
WHERE CONTAINS(Description, 'ISABOUT (performance weight (.8), 
comfortable weight (.4), smooth weight (.2) )' );
GO

H. Using CONTAINS with variables

The following example uses a variable instead of a specific search term.

USE AdventureWorks;
GO
DECLARE @SearchWord nvarchar(30)
SET @SearchWord = N'Performance'
SELECT Description 
FROM Production.ProductDescription 
WHERE CONTAINS(Description, @SearchWord);
GO

See Also

Reference

CONTAINSTABLE (Transact-SQL)
FREETEXT (Transact-SQL)
FREETEXTTABLE (Transact-SQL)
WHERE (Transact-SQL)

Other Resources

Full-Text Search
Full-Text Search Queries

Help and Information

Getting SQL Server 2005 Assistance

Change History

Release History

14 April 2006

New content:
  • Added information about avoiding conversion with contains_search_condition.

15 September 2007

New content:
  • Added a note about the decimal separator to the ISABOUT definition.