1 out of 6 rated this helpful Rate this topic

Object.GetHashCode Method

Serves as a hash function for a particular type.

Namespace:  System
Assembly:  mscorlib (in mscorlib.dll)
public virtual int GetHashCode()

Return Value

Type: System.Int32
A hash code for the current Object.

A hash code is a numeric value that is used to identify an object during equality testing. It can also serve as an index for an object in a collection.

The GetHashCode method is suitable for use in hashing algorithms and data structures such as a hash table.

The default implementation of the GetHashCode method does not guarantee unique return values for different objects. Furthermore, the .NET Framework does not guarantee the default implementation of the GetHashCode method, and the value it returns will be the same between different versions of the .NET Framework. Consequently, the default implementation of this method must not be used as a unique object identifier for hashing purposes.

The GetHashCode method can be overridden by a derived type. Value types must override this method to provide a hash function that is appropriate for that type and to provide a useful distribution in a hash table. For uniqueness, the hash code must be based on the value of an instance field or property instead of a static field or property.

Objects used as a key in a Hashtable object must also override the GetHashCode method because those objects must generate their own hash code. If an object used as a key does not provide a useful implementation of GetHashCode, you can specify a hash code provider when the Hashtable object is constructed. Prior to the .NET Framework version 2.0, the hash code provider was based on the System.Collections.IHashCodeProvider interface. Starting with version 2.0, the hash code provider is based on the System.Collections.IEqualityComparer interface.

Notes to Implementers

A hash function is used to quickly generate a number (hash code) that corresponds to the value of an object. Hash functions are usually specific to each Type and, for uniqueness, must use at least one of the instance fields as input.

A hash function must have the following properties:

  • If two objects compare as equal, the GetHashCode method for each object must return the same value. However, if two objects do not compare as equal, the GetHashCode methods for the two object do not have to return different values.

  • The GetHashCode method for an object must consistently return the same hash code as long as there is no modification to the object state that determines the return value of the object's Equals method. Note that this is true only for the current execution of an application, and that a different hash code can be returned if the application is run again.

  • For the best performance, a hash function must generate a random distribution for all input.

For example, the implementation of the GetHashCode method provided by the String class returns identical hash codes for identical string values. Therefore, two String objects return the same hash code if they represent the same string value. Also, the method uses all the characters in the string to generate reasonably randomly distributed output, even when the input is clustered in certain ranges (for example, many users might have strings that contain only the lower 128 ASCII characters, even though a string can contain any of the 65,535 Unicode characters).

For derived classes of Object, the GetHashCode method can delegate to the Object.GetHashCode implementation, if and only if that derived class defines value equality to be reference equality and the type is not a value type.

Providing a good hash function on a class can significantly affect the performance of adding those objects to a hash table. In a hash table with a good implementation of a hash function, searching for an element takes constant time (for example, an O(1) operation). In a hash table with a poor implementation of a hash function, the performance of a search depends on the number of items in the hash table (for example, an O(n) operation, where n is the number of items in the hash table). Hash functions must also be inexpensive to compute.

Implementations of the GetHashCode method must not result in circular references. For example, if ClassA.GetHashCode calls ClassB.GetHashCode, ClassB.GetHashCode must not call ClassA.GetHashCode either directly or indirectly.

Implementations of the GetHashCode method must not throw exceptions.

Derived classes that override GetHashCode must also override Equals to guarantee that two objects considered equal have the same hash code; otherwise, the Hashtable type might not work correctly.

In some cases, the GetHashCode method is implemented to simply return an integer value. The following code example illustrates an implementation of GetHashCode that returns an integer value.

There are other more complicated ways to combine hash codes that may give better performance for hash tables.


using System;

public struct Int32 {
   public int value;

   //other methods...

   public override int GetHashCode() {
      return value;
   }
}


Frequently, a type has multiple data fields that can participate in generating the hash code. One way to generate a hash code is to combine these fields using an XOR (eXclusive OR) operation, as shown in the following code example.


using System;

public struct Point {
   public int x;
   public int y; 

   //other methods

   public override int GetHashCode() {
      return x ^ y;
   }
}


The following code example illustrates another case where the type's fields are combined using XOR (eXclusive OR) to generate the hash code. Notice that in this code example, the fields represent user-defined types, each of which implements GetHashCode and Equals.


using System;

public class SomeType {
   public override int GetHashCode() {
     return 0;
   }
}

public class AnotherType {
   public override int GetHashCode() {
     return 1;
   }
}

public class LastType {
   public override int GetHashCode() {
     return 2;
   }
}

public class MyClass {
   SomeType a = new SomeType();
   AnotherType b = new AnotherType();
   LastType c = new LastType();

   public override int GetHashCode () {
     return a.GetHashCode() ^ b.GetHashCode() ^ c.GetHashCode();
   }
}


If the data member of the derived class is bigger than an Int32, you can combine the high order bits of the value with the low order bits using an XOR (eXclusive OR) operation, as shown in the following code example.


using System;

public struct Int64 {
   public long value;

   //other methods...

   public override int GetHashCode() {
      return ((int)value ^ (int)(value >> 32));
   }
}


.NET Framework

Supported in: 4, 3.5, 3.0, 2.0, 1.1, 1.0

.NET Framework Client Profile

Supported in: 4, 3.5 SP1

Portable Class Library

Supported in: Portable Class Library

Windows 7, Windows Vista SP1 or later, Windows XP SP3, Windows XP SP2 x64 Edition, Windows Server 2008 (Server Core not supported), Windows Server 2008 R2 (Server Core supported with SP1 or later), Windows Server 2003 SP2

The .NET Framework does not support all versions of every platform. For a list of the supported versions, see .NET Framework System Requirements.
Did you find this helpful?
(2000 characters remaining)
Community Content Add
Annotations FAQ
2 code example is bad.
In second code example with the Point struct the hash code is the same for all points where X is equal to Y due to how XOR works. Means that all diagonal points and a origin give the same value 0. Due to commutativity of XOR operator the point X1, Y1 gives the same HashCode as point X2,Y2 if, X1 = Y2 and Y1 = X2. Therefore all points symmetrical across the diagonal are also considered equal. Special care must be taken when using XOR operator. Because, for example "Michael David" will be equal to "David Michael", if you just XOR the hashcodes of First Name and  LastName to get the HashCode of the Customer.
The algorithm

The GetHashCode() returns a pseudo random value, at least in the current (4.0) implementation. Every thread has a state for its own random generator and it sets the seed value to a different one. When the program requests a new hash value, the actual thread will generate a random number. This random number then is stored in the object sync block index or sync block, so can be returned in the next GetHashCode() calls.
I reverse engeneered the algorithm and here is a test implementation in C#: 

public static class HashCodeOracle
{
    private const uint coreSeed = 123456789;
    private const uint maxSyncTrials = 1000;
    private const int syncFlagCount = 6;

    private static ThreadLocal<uint> state = new ThreadLocal<uint>(
                                                    () => GetInitStateForThread());

    private static uint GetInitStateForThread()
    {
        var initState = CalculateThreadSeed();

        var syncObject = new Object();
        var syncValue = syncObject.GetHashCode();

        var syncIterations = 0;
        while (++syncIterations < maxSyncTrials)
        {
            initState = IterateState(initState);

            if ((initState >> syncFlagCount) == syncValue)
            {
                break;
            } // if
        } // while

        if (syncIterations == maxSyncTrials)
        {
            throw new NotSupportedException("Could not sync on current thread");
        } // if

        return initState;
    } // GetInitStateForThread()

    private static uint CalculateThreadSeed()
    {
        var threadSeed = coreSeed;
        var managedThreadId = Thread.CurrentThread.ManagedThreadId;

        for (int i = 0; i < managedThreadId; i++)
        {
            threadSeed = threadSeed * 1566083941u + 1u;
        } // for i

        return threadSeed;
    } // CalculateThreadSeed()

    private static uint IterateState(uint currentState)
    {
        var managedThreadId = (uint)Thread.CurrentThread.ManagedThreadId;
        return currentState * (managedThreadId * 4u + 5u) + 1u;
    } // IterateState()

    public static int NextHashCode
    {
        get
        {
            var currentState = state.Value;
            var newState = IterateState(currentState);
            state.Value = newState;
            return (int)(newState >> syncFlagCount);
        } // get
    } // NextHashCode
} // class HashCodeOracle

class Program
{
    static void Main(string[] args)
    {
        int i;
        for (i = 0; i < 1000; i++)
        {
            Console.WriteLine("predicted: " + HashCodeOracle.NextHashCode);
            var o = new object();
            Console.WriteLine("fact:      " + o.GetHashCode());
        } // for i
 
    } // Main()
} // class Program

This is just for test and there are situations where the program will fail calculating the hash code, so do not use it in real applications.

Regards,
Viktor

 

What is the alternative?
If this method cannot be reliably used, what would be an alternative for ensuring integrity of files (for example) ? $0 $0 Any suggestions? $0 $0 Deep. $0$0 $0 $0Answer: Integrity checks can be done by checksumming a file with hashing algorithms. Commonly used algorithms for creating a checksum are MD5 and SHA1.$0


GetHashCode and LINQ
The documentation must really put emphasis on how it is important to have a correct implementation of the GetHashCode method in the modern versions of the .NET framework. And it has nothing to do with that ancient Hashtable. Many LINQ algorithms, such as Intersect, Distinct, Except, Union, rely on the GetHashCode value. If you don't override it you'll end up with wrong behavior and no warnings at all from the compiler. Also don't rely on the default implementations of the GetHashCode method for value types. In many cases those implementations violate the hash code contract (see Connect IDs 314630, 648567 for example).
Different return values for strings on x86 versus x64
According to the documentation above: "the implementation of the GetHashCode method provided by the String class returns identical hash codes for identical string values."  However, I encountered a problem using this for persistent hash codes across x86 versus x64 execution of an application.  For example the code  $0int test = "test".GetHashCode();$0 returns -354185609 for an application executing under x86versus -871206010 under x64. This experience reinforces the statement above: "the .NET Framework does not guarantee the default implementation of the GetHashCode method, and the value it returns will be the same between different versions of the .NET Framework. "  So it turns out it is not a good idea to use this implementation for any persistent hashes or hashes communicated between different instances of an app.
Can You Share The Algorithm?
Can you share the algorithm used for the GetHashCode method?  Thanks!