Noise Words and the CONTAINS Predicate in Enterprise Search SQL Syntax

When you create search queries, keep in mind that very common words, and words that convey nothing about the content, are removed when the content is indexed. These noise words cannot be matched in full-text searches. For example, searching for the phrase "it is a test" is equivalent to searching for the word "test", because "it", "is", and "a" are all discarded when the documents are indexed.

Noise words that appear in a CONTAINS search phrase are discarded, but their positions are kept as placeholders. A matching phrase must contain the same number of words as the search phrase, and each noise-word position matches any single word. This can produce unexpected results when the user intends a noise word to act as a logical operator. For example, a user who wants to find all documents that contain both "computer" and "software" might type "computer AND software". If that string is inserted into the CONTAINS predicate unchanged, it is submitted as:

CONTAINS('"computer AND software"')

The Enterprise Search search engine recognizes "AND" as a noise word and discards it, leaving a placeholder in its position. It then matches all documents in which "computer" and "software" are separated by exactly one word. Enterprise Search would return documents containing "computer programming software", "computer drawing software", and even "computer running software". However, documents that contain only "computer software" would not be returned.
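The placeholder behavior described above can be illustrated with a small Python sketch. This is a toy model only, not the search engine's implementation: the NOISE_WORDS list and the phrase_matches function are hypothetical names introduced for illustration, and the real noise word list and tokenization rules may differ.

```python
# Toy model of noise-word placeholders in phrase matching.
# NOISE_WORDS is a hypothetical sample list, not the engine's actual list.
NOISE_WORDS = {"a", "and", "is", "it", "the"}


def phrase_matches(phrase: str, document: str) -> bool:
    """Return True if `document` contains `phrase`, treating each noise word
    in the phrase as a placeholder that matches any single word."""
    # None marks a placeholder position; other entries must match exactly.
    pattern = [None if w in NOISE_WORDS else w for w in phrase.lower().split()]
    words = document.lower().split()
    # Slide a window of the same length as the phrase across the document.
    for start in range(len(words) - len(pattern) + 1):
        window = words[start:start + len(pattern)]
        if all(p is None or p == w for p, w in zip(pattern, window)):
            return True
    return False


print(phrase_matches("computer AND software", "computer programming software"))  # True
print(phrase_matches("computer AND software", "computer software"))  # False
```

In this model, "computer software" fails to match because the document is one word shorter than the three-position pattern, which mirrors the behavior described above.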

The following CONTAINS predicate would return documents more closely matching the intent of the user:

CONTAINS('"computer" AND "software"')
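One way an application might avoid the problem is to quote each search term separately before building the predicate, leaving the operators outside the quotation marks. The following Python sketch shows the idea under several assumptions: build_contains and OPERATORS are hypothetical names, operators are assumed to be typed in uppercase, and single quotation marks are assumed to be escaped by doubling them, as is common in SQL string literals.

```python
# Hypothetical helper: turns raw user input into a CONTAINS predicate string.
OPERATORS = {"AND", "OR", "NOT"}  # assumption: operators are typed in uppercase


def build_contains(user_input: str) -> str:
    """Quote each run of search words as its own term and keep the
    logical operators outside the double quotation marks."""
    parts = []   # finished pieces of the search condition
    phrase = []  # words of the term currently being collected
    for token in user_input.split():
        if token in OPERATORS:
            if phrase:
                parts.append('"' + " ".join(phrase) + '"')
                phrase = []
            parts.append(token)
        else:
            # Drop embedded double quotes; double single quotes (assumed escaping).
            word = token.replace('"', "").replace("'", "''")
            if word:
                phrase.append(word)
    if phrase:
        parts.append('"' + " ".join(phrase) + '"')
    return "CONTAINS('" + " ".join(parts) + "')"


print(build_contains("computer AND software"))
# CONTAINS('"computer" AND "software"')
```

The sketch does not validate the input: a leading, trailing, or doubled operator is passed through unchanged, so a production version would need to check the resulting condition before submitting it.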

See Also

Reference

CONTAINS Predicate in Enterprise Search SQL Syntax