preface


Github–Java-Notes, JJVM, HashMap source code analysis, Spring, offer solution (Java version), you can click a star. Check out my Github page, which is updated daily.

I invite you to complete the REPO with me


Keep in mind that

The first thing you should keep in mind is that regardless of whether you pass parameters or not, whatever length you pass in, when you use a HashMap, its length is 2 to the power of n, and the maximum length is 2 to the power of 30

The maximum length

In the HashMap source code, the maximum length constant value is defined like this

/** * The maximum capacity, used if a higher value is implicitly specified * by either of the constructors with arguments. * MUST be a power of two < = 1 < < 30. * /
    static final int MAXIMUM_CAPACITY = 1 << 30;
Copy the code

So where do we use this value?

  • The resize() function, which is used to expand
  • TableSizeFor (), that’s also used for capacity expansion
  • In the constructor
  • PutEntries (), when storing a group of HashMap elements, not a single one

Why does the table length have to be 2 to the n

Note that in the source code they use lazy initialization, which means that the table is initialized only when it is used. If you don’t put it, the length of the table will always be zero.

There are two main functions that guarantee its length to be 2 to the n

  • tableSizeFor()
  • resize()

For the calculation and loading process, please refer to my article on how long a table is

This article analyzes the process of creating a table from the source code, including the above mentioned function call, you must understand why the length of the table must be 2 ^ n

Of course, I wrote a part of the source code for hashMap Chinese annotation on Github also: hashMap source Chinese annotation

What’s the advantage of n of 2

  • Convenient calculation
  • Hash is more evenly distributed

Uniform distribution of

If it’s not 2 to the n, there are some places that will never be used

You can refer to this blog post, where he gives an example of why, why length is 2 to the n

Convenient calculation

  • When the capacity is 2^n, h & (length-1) == h % length

  • It is very convenient to calculate the new location after expansion, compared to JDK1.7

JDK 1.8 changes

In JDK1.8, HashMap has undergone some major changes, including

  • Element migration algorithm (old to new array)

  • Use red black trees

  • Linked lists are tail-inserted

I focus on the element migration algorithm, JDK1.8

Take a look at the Java code and my comments first, and if you want to see the whole thing, look at my Github repository

// Copy all elements from the original array into the new array
if(oldTab ! =null) {for (int j = 0; j < oldCap; j++) {
        Entry e;

        if((e = oldTab[j]) ! =null){
            oldTab[j] = null;

            // There is only one chain in the array
            if(e.next == null) {// recalculate the array index
                newTable[e.h & (newCap-1)] = e;

            }
            // Check whether it is a tree structure
            //else if (e instanceof TreeNode)


            // If it is not a tree, it is a linked list, i.e. it has not evolved into a tree with a length greater than 8
            else{
                // After the expansion, if the index of the element remains the same. That's the lo prefix
                Entry loHead=null, loTail =null;

                // If the index element changes after the expansion, use the prefix "hi"
                Entry hiHead = null, hiTail = null;
                Entry next;
                do {
                    next = e.next;
                    // This is important and difficult to understand,
                    // Determine if the last bit of his hash was 1.
                    // To determine whether he is in the same index or the table length plus the original index
                    if((e.h & oldCap) == 0) {// If loTail == null, it is the first time that this position is added and there is no hash conflict
                        if(loTail == null)
                            loHead = e;
                        else
                            loTail.next = e;
                        loTail = e;
                    }
                    else{
                        if(hiTail == null)
                            loHead = e;
                        elsehiTail.next = e; hiTail = e ; }}while((e = next) ! =null);


                if(loTail ! =null){
                    loTail.next = null;
                    newTable[j] = loHead;
                }

                // The new index equals the old index+oldCap
                else {

                    hiTail.next = null;
                    newTable[j+oldCap] = hiHead;
                }

            }
        }

    }
}
Copy the code

NewTable [j+oldCap] = hiHead; This means that even if our element moves from the old array to the new array, we don’t need to recalculate its hash to the length of the new array, just add the current index value to the length of the original array

The blue ones are the ones that don’t need to be moved, the green ones are the ones that need to be recalculated, and as we can see, it’s just adding 16.

Computing index requirements

We notice that in the source code above, we use this judgment to determine whether the element position needs to be changed after expansion

If ((e.h&oldCap) == 0),

If it is 0, then no change is needed, just use the old index. If it is 1, the new index needs to be used

Why is that?

  • If the index of the element is to change thenhash&(newTable.length-1)It must be andhash&(oldTable.length-1)+oldTable.lengthequal
  • Because the length of the table must be 2 to the n power, that is, oldCap must be 2 to the n power, that is, oldCap has only one 1, and this position is in the highest position;

Let’s take an example:

Let’s assume that the last 12 bits of the hash value of the element are 110111010111, and that the array length was 16, but now it’s 32

And you can try the next time you expand, when you expand to 64, the index doesn’t change. The answer, of course, is no change, because the element’s hash value is 0 at that position

Comparison 1.7 Capacity Expansion

Compare that to JDK1.7, which uses indexFor if it wants to expand and then calculates the index of an element

/** * Returns index for hash code h. */  
    static int indexFor(int h, int length) {  
        // assert Integer.bitCount(length) == 1 : "length must be a non-zero power of 2";  
        return h & (length-1);  
    }  
Copy the code

Rehash the element’s hash value with the new array length -1, which is cumbersome and inelegant, and not nearly as stunning as I saw in the 1.8 calculation.