
SOUNDEX (Transact-SQL)

Returns a four-character (SOUNDEX) code to evaluate the similarity of two strings.

Transact-SQL Syntax Conventions


          

SOUNDEX ( character_expression )
        
character_expression

Is an alphanumeric expression of character data. character_expression can be a constant, variable, or column.

SOUNDEX converts an alphanumeric string to a four-character code to find similar-sounding words or names. The first character of the code is the first character of character_expression and the second through fourth characters of the code are numbers that represent the letters in the expression. Vowels in character_expression are ignored unless they are the first letter of the string. Zeroes are added at the end if necessary to produce a four-character code.

The following table defines the numbers that represent the various letters.

Number    Represents the Letters
1         B, F, P, V
2         C, G, J, K, Q, S, X, Z
3         D, T
4         L
5         M, N
6         R
Ignored   A, E, I, O, U, H, W, and Y

For example, the SOUNDEX code for the expression 'Washington' is W252: W for the first letter, 2 for the S, 5 for the N, and 2 for the G. The remaining letters are disregarded. For more information about the SOUNDEX code, see The Soundex Indexing System.
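As a quick check of the mapping above, the following query is a minimal sketch that returns the code for 'Washington' alongside a short name that needs zero padding. The name 'Lee' and the column aliases are assumptions added for illustration; they do not come from the original topic.

```sql
-- 'Washington' is the example from the text.
-- 'Lee' is an assumed extra input: only vowels follow the first letter,
-- so the remaining positions of its code are filled with zeroes.
SELECT SOUNDEX('Washington') AS WashingtonCode,  -- W252, as described above
       SOUNDEX('Lee')        AS LeeCode;
```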

String functions can be nested.

Under compatibility level 110, SOUNDEX applies the following rules. Under compatibility levels 90 and 100, only rules 1 and 3 are followed.

  1. If character_expression has any double letters, they are treated as one letter. For example, in the name Gutierrez, only the first r is considered. The second r is ignored.

  2. If character_expression has different letters side-by-side that have the same number in the soundex coding guide, they are treated as one letter. For example, the name Jackson is coded as J250 (J, 2 for the C, K ignored, S ignored, 5 for the N, 0 added).

  3. If a vowel (A, E, I, O, U) separates two consonants that have the same soundex code, the consonant to the right of the vowel is coded.

  4. If H or W separates two consonants that have the same soundex code, the consonant to the right of the H or W is not coded. For example, the name Ashcraft is coded as A261 (A, 2 for the S, C ignored, 6 for the R, 1 for the F).
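The rules above can be tried out with the names they mention. This is a sketch only: the column aliases are illustrative, and the codes noted in the comments are the ones stated in the rules for compatibility level 110. Under levels 90 and 100 some results may differ, so check the level of the current database first.

```sql
-- Check the compatibility level of the current database.
SELECT name, compatibility_level
FROM sys.databases
WHERE name = DB_NAME();

-- Names taken from the rules above; aliases are illustrative.
SELECT SOUNDEX('Gutierrez') AS DoubleLetters,     -- rule 1: the second r is ignored
       SOUNDEX('Jackson')   AS AdjacentSameCode,  -- rule 2: J250 under level 110
       SOUNDEX('Ashcraft')  AS SeparatedByH;      -- rule 4: A261 under level 110
```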

After upgrading to compatibility level 110, you may need to rebuild the indexes, heaps, or CHECK constraints that use the SOUNDEX function.

  • A heap that contains a persisted computed column defined with SOUNDEX cannot be queried until the heap is rebuilt by running the statement ALTER TABLE <table> REBUILD.

  • CHECK constraints defined with SOUNDEX are disabled upon upgrade. To enable the constraint, run the statement ALTER TABLE <table> WITH CHECK CHECK CONSTRAINT ALL.

  • Indexes (including indexed views) that contain a persisted computed column defined with SOUNDEX cannot be queried until the index is rebuilt by running the statement ALTER INDEX ALL ON <object> REBUILD.
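Put together, the post-upgrade steps listed above might look like the following sketch. The table name dbo.Customer is hypothetical and stands in for any object that has a persisted computed column or CHECK constraint defined with SOUNDEX; substitute your own object names and run only the statements that apply.

```sql
-- dbo.Customer is a hypothetical table used only for illustration.

-- Rebuild a heap that contains a persisted computed column defined with SOUNDEX.
ALTER TABLE dbo.Customer REBUILD;

-- Re-enable (and revalidate) CHECK constraints that were disabled on upgrade.
ALTER TABLE dbo.Customer WITH CHECK CHECK CONSTRAINT ALL;

-- Rebuild all indexes on the table; the same statement applies to an indexed view.
ALTER INDEX ALL ON dbo.Customer REBUILD;
```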

The following examples show the SOUNDEX function and the related DIFFERENCE function. In the first example, the standard SOUNDEX values are returned for all consonants. Smith and Smythe return the same SOUNDEX code because vowels, the letter y, doubled letters, and the letter h are not included.

-- Using SOUNDEX
SELECT SOUNDEX ('Smith'), SOUNDEX ('Smythe');

Here is the result set.


----- ----- 
S530  S530  

(1 row(s) affected)

The DIFFERENCE function compares the SOUNDEX codes of two strings and returns a value from 0 through 4, where 4 indicates the greatest similarity. The following example shows two strings that differ only in vowels. The value returned is 4, which indicates the least possible difference.

-- Using DIFFERENCE
SELECT DIFFERENCE('Smithers', 'Smythers');
GO

Here is the result set.

----------- 
4           

(1 row(s) affected)

In the following example, the strings differ in consonants; therefore, the value returned is 2, which indicates a greater difference.

SELECT DIFFERENCE('Anothers', 'Brothers');
GO

Here is the result set.

----------- 
2           

(1 row(s) affected)
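
A common use of these functions is to search a column for similar-sounding names. The following is a minimal sketch under stated assumptions: @Names is a hypothetical table variable, and its contents are sample values chosen for illustration, not part of the original topic.

```sql
-- @Names is a hypothetical table variable used only for illustration.
DECLARE @Names TABLE (LastName varchar(50));

INSERT INTO @Names (LastName)
VALUES ('Smith'), ('Smythe'), ('Jackson');

-- Return the rows whose SOUNDEX code matches that of the search term.
SELECT LastName, SOUNDEX(LastName) AS Code
FROM @Names
WHERE SOUNDEX(LastName) = SOUNDEX('Smith');
```

Because Smith and Smythe share the code S530, as shown in the first example, both rows match. A predicate such as DIFFERENCE(LastName, 'Smith') = 4 expresses the same comparison.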