What is a hash table

The hash function

The implementation steps of hash function in hash table are roughly as follows:

  1. Hash value of key (must be integer)
  2. And the hash of the key is related to the size of the array.

Common as follows:

public int hash(Object key){
    return hash_code(key) % table.length;
}
Copy the code

Hash the key, and then modulo the length of the array. But in general, for efficiency, you can use ampersand instead of %. The premise is to design the length of the array to a power of two.

public int hash(Object key){
    return hash_code(key) & (table.length-1);
}
Copy the code

How do I generate a hash of key

Common types of keys may be strings, integers, floating point numbers, or custom objects. Different types of keys generate hash values differently, but their goals are the same.

  • Try to make the hash value of each key unique
  • Try to include all of the key’s information in the calculation.

In Java, the HashMap key must implement hashCode, equals, or null. A hash value in Java must be of type int.

Int integer

Integer values use the integer itself as the hash value.

Float float

Floating point numbers are also binary in memory. Then you just convert the binary representation to an integer. Float.floattointbits () can be used in Java to convert a floating point number to the corresponding integer. This is then converted to its binary form by calling Integer.tobinaryString ().

For floating-point 10.6, we can have:

int num = Float.floatToIntBits(10.6 f);
System.out.println("The integer for 10.6f is + num);
String binary = Integer.toBinaryString(num);
System.out.println("What is the binary converted from the integer 10.6f?" + binary);

Copy the code

Output result:

10.6F is an integer1093245338
10.6The binary of the integer f is1000001001010011001100110011010
Copy the code

So the hash of floating point 10.6f is the integer 1093245338.

A hash of type Long

For Long types, Java’s official algorithm is as follows: Long is 8-byte, 64-bit.

public static int hashCode(long value) {
    return (int)(value ^ (value >>> 32));
}
Copy the code

Move the value unsigned 32 bits to the right and then xor with the value itself. And that gives you the hash of Long. What are the functions of ^ and >>>? The idea is to have a mixture of high and low 32bit hash values, so that all the information can be used to compute the hash value.

^ Different or, same is 0, different is 1. As | | 1 = 0, 1 0 = 1, 0 | 0 = 0;

As you can see, the hash we end up with is the integer converted from the binary in orange.

A hash of type Double

The hash algorithm for Double is as follows: Double is 8-byte 64-bit like Long, so it is hashed in much the same way as Long. But Double takes an extra step, doubleToLongBits, to convert the decimal of Double to the corresponding Long. This step is similar to that of a float in that a decimal is converted to an integer or long integer.

public static int hashCode(double value) {
    long bits = doubleToLongBits(value);
    return (int)(bits ^ (bits >>> 32));
}
Copy the code

The hash value of a string

The hash value of the string is different from the previous ones.

We can do this by analogy. For example, the number 8536 can be expressed as: 8 × 10^3 + 5 × 10^2 + 3 × 10^1 + 6 × 10^0

So similarly, we can represent a string in a similar way. For example, Sitr can be expressed as S x n^3 + I x n^2 + T x n^1 + r x n^0

The n here represents the base of the character. Then we can extract the common factor of n: n ×(n ×(S × n + I) + t) + r namely: [(S × n + I) × N + T] × n + R

So how do we determine n here? In the HashCode for strings in the JDK, n is treated as 31. That’s because 31 is an odd prime, so it’s easier to get a unique value when you hash it. In the JDK, 31 × I can be optimized to (I << 5) -i.

Note 1: here I represents the number of ANSII corresponding to a character in the string. 31 × I = (2^ 5-1) × I = I × 2^5-i = (I <<5) -i

See HashCode for strings in the JDK:

public int hashCode(a) {
    int h = hash;
    if (h == 0 && value.length > 0) {
        char val[] = value;

        for (int i = 0; i < value.length; i++) {
            h = 31 * h + val[i];
        }
        hash = h;
    }
    return h;
}
Copy the code