Click to Rate and Give Feedback
MSDN
MSDN Library
SQL Server
SQL Server 2008
Database Engine
Technical Reference
 CHECKSUM (Transact-SQL)

  Switch on low bandwidth view
Community Content
In this section
Statistics Annotations (0)
Other versions are also available for the following:
SQL Server 2008 Books Online (June 2009)
CHECKSUM (Transact-SQL)

Returns the checksum value computed over a row of a table, or over a list of expressions. CHECKSUM is intended for use in building hash indexes.

Topic link icon Transact-SQL Syntax Conventions

CHECKSUM ( * | expression [ ,...n ] )
*

Specifies that computation is over all the columns of the table. CHECKSUM returns an error if any column is of noncomparable data type. Noncomparable data types are text, ntext, image, XML, and cursor, and also sql_variant with any one of the preceding types as its base type.

expression

Is an expression of any type except a noncomparable data type.

int

CHECKSUM computes a hash value, called the checksum, over its list of arguments. The hash value is intended for use in building hash indexes. If the arguments to CHECKSUM are columns, and an index is built over the computed CHECKSUM value, the result is a hash index. This can be used for equality searches over the columns.

CHECKSUM satisfies the properties of a hash function: CHECKSUM applied over any two lists of expressions returns the same value if the corresponding elements of the two lists have the same type and are equal when compared using the equals (=) operator. For this definition, null values of a specified type are considered to compare as equal. If one of the values in the expression list changes, the checksum of the list also generally changes. However, there is a small chance that the checksum will not change. For this reason, we do not recommend using CHECKSUM to detect whether values have changed, unless your application can tolerate occasionally missing a change. Consider using HashBytes instead. When an MD5 hash algorithm is specified, the probability of HashBytes returning the same result for two different inputs is much lower than that of CHECKSUM.

The order of expressions affects the resultant value of CHECKSUM. The order of columns used with CHECKSUM(*) is the order of columns specified in the table or view definition. This includes computed columns.

The following examples show using CHECKSUM to build hash indexes. The hash index is built by adding a computed checksum column to the table being indexed, and then building an index on the checksum column.

-- Create a checksum index.
SET ARITHABORT ON;
USE AdventureWorks; 
GO
ALTER TABLE Production.Product
ADD cs_Pname AS CHECKSUM(Name);
GO
CREATE INDEX Pname_index ON Production.Product (cs_Pname);
GO

The checksum index can be used as a hash index, particularly to improve indexing speed when the column to be indexed is a long character column. The checksum index can be used for equality searches.

/*Use the index in a SELECT query. Add a second search 
condition to catch stray cases where checksums match, 
but the values are not the same.*/
SELECT * 
FROM Production.Product
WHERE CHECKSUM(N'Bearing Ball') = cs_Pname
AND Name = N'Bearing Ball';
GO

Creating the index on the computed column materializes the checksum column, and any changes to the ProductName value will be propagated to the checksum column. Alternatively, an index could be built directly on the column indexed. However, if the key values are long, a regular index is not likely to perform as well as a checksum index.

Tags What's this?: Add a tag
Community Content   What is Community Content?
Add new content RSS  Annotations
"However, there is a small chance that the checksum will not change."      Mike C_1   |   Edit   |   Show History
CHECKSUM is *not* a collision-free hash function by any means. It generates 32-bit hash values, or checksums, which means collisions can be discovered by brute force with a complexity of 2^16 operations. However, the simple algorithm CHECKSUM uses appears to cycle after 16 bytes of data and collisions are frequent among data of the same length. For instance, "LE" and "AAAAAAAAAAAAAAAALE" both produce the same CHECKSUM value. Also "LE" and "MU" produce the same CHECKSUM value.

In fact, among the 676 possible two-letter uppercase combinations of A - Z ("AA", "AB", etc.) CHECKSUM produces 244 duplicate hash values. This does not represent a small chance for a collision. HashBytes with SHA-1 is the best built-in option available to SQL Server for trying to detect change with a collision-free hash function.
Tags What's this?: Add a tag
Flag as ContentBug
Processing
© 2009 Microsoft Corporation. All rights reserved. Terms of Use  |  Trademarks  |  Privacy Statement
Page view tracker