1. Proposed several problems of HashMap

Read JDK8 source code small torch certainly in the first read after there will be a doubt, the main question in the following several problems:

2. Why hashCode()?

Hash (Object key); hash(Object key);

    static final int hash(Object key) {
        int h;
        return (key == null)?0 : (h = key.hashCode()) ^ (h >>> 16);
    }
Copy the code

2.1 Time of deposit

Once you get the hash, you can directly locate the index in the array with a simple bit operation. O(1) is the fastest way to store the index of an array (if there is a hash conflict, then call equals()).

if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
Copy the code

2.2 Fetching time

(if there is a hash conflict, this is also the fastest way to get an element from equals().)

(first = tab[(n - 1) & hash]) ! =null) {
Copy the code

2.3 hashCode ()

Using hashCode() is the fastest way to access both sides of the plane. No one, if you use the way of traversing the number group to find, the day lily is cold, O(n) brother.

3. Why is the initial capacity 16? Or why, given an initial capacity size in the constructor, does it still modify it to be greater than or equal to the lowest power of two to the NTH?

  • Where is the capacity used? This line of code is used when fetching:
tab[i = (n - 1) & hash]
Copy the code

As mentioned above, a simple bit operation is done after getting the hash. Let’s analyze this as an operation:

i = (n - 1) & hash
Copy the code

First give the conclusion:

  1. In order to keep the subscript from crossing the boundary
  2. In order to distribute the hash evenly

3.1 In order not to let the subscript out of bounds

When capacity = n to the power of 2, this is the case for n-1 binary. Let’s look at binary: n= 1000… // start with 1 followed by n zeros. The n – 1 = 0111… // Start with 0 and end with 1

The decimal system N binary N – 1 binary
2 10 01
4 100 011
8 1000 0111
16 10000 01111
32 100000 011111
64 1000000 0111111
128 10000000 01111111
256 100000000 011111111

(n-1) & hash :& means and. If both values are 1, the result of the ampersand operation can never be greater than the value of (n-1), i.e., never greater than or equal to capacity. The maximum index is Capacity-1.

3.2 To distribute hash evenly

& : The result is 1 only when the corresponding is 1.

I = (n-1) hash = I hash = I hash = I hash = I hash = I hash = I hash = I hash

Counter example: if the value of n is not the form of the n power of 2, n-1 will appear binary end is 0, that is: n-1 = ***0 (front n can be 0 or 1), this will cause a life phenomenon?

The index will never fall on an index ending in 1 below the binary. Half of the indexes are idle, doubling the probability of hash collisions on indexes ending in 0 (binary). There is a loss of efficiency in inserts and queries.

4. TableSizeFor () method parsing

Source:

    /** * Returns a power of two size for the given target capacity. */
    static final int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0)?1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }
Copy the code

Is greater than or equal to cap to the NTH power of 2:

cap The return value
1 2
5 8
10 16
22 32
50 64
  • First, why cap-1:
int n = cap - 1;
Copy the code

The purpose of reassigning cap-1 to n is to find a target value greater than or equal to the original value. For example, the value is 1000 in binary and 8 in decimal. If you don’t subtract 1 from it, you get the answer 10000, which is 16. Obviously not the result. If you subtract 1 from the binary to 111, you get the original value of 1000, or 8. To reanalyze the highlight, here are the lines:

n |= n >>> 1; Said: n = (n | n > > > 1) | or: corresponding for as long as there are several 1Copy the code
  • Analysis begins: hypothesis
  1. The binary value of n is 01XXXXXXXXxxxxxx…
  2. Move right one bit: 001XXXXXXXXXXXXXX…
  3. And n or the value of n: 011XXXXXXXXXXxxxx…
  4. By the same token, the n | = n > > > 2. After the row of n = 01111 XXXXXXXXXXXXXXXXXXXXXXXXXXXX
  5. By the same token, the n | = n > > > 4. After the row XXXXXXXXXXXXXXXXXXXXXXXX n = 011111111
  6. By the same token, the n | = n > > > 8; After the row of n = 01111111111111111 XXXXXXXXXXXXXXXX
  7. By the same token, the n | = n > > > 16; After the row of n = 011111111111111111111111111111111
  8. If you’re familiar with this, the return is incremented by 1, and you’re back to 100000… , followed by all zeros, that is, this method returns 2 to the NTH power.

4 times 31

In order to make hash more evenly distributed, it is prime.

public int hashCode(a) {
        int h = hash;
        if (h == 0 && value.length > 0) {
            char val[] = value;

            for (int i = 0; i < value.length; i++) {
                h = 31 * h + val[i];
            }
            hash = h;
        }
        return h;
    }
Copy the code

Why choose 31 in particular? Please refer to: [JVM] science: why String hashCode methods select number 31 as multiplier https://blog.csdn.net/happydecai/article/details/80493237

Share to this end, there are incorrect please correct.