HashMap source code parsing

JDK7 version of the source code analysis

The HashMap class is implemented as an array + linked list

Let’s meet some people

Class important attributes

Hash table array, must be 2 power N transient Entry<K,V>[] TABLE = (Entry<K,V>[]) EMPTY_TABLE; Number of included elements TRANSIENT int size; Threshold = Capacity * load factor int threshold. LoadFactor final float loadFactor;Copy the code

List objects in a hash table – important key-value objects

static class Entry<K,V> implements Map.Entry<K,V> { final K key; V value; Entry<K,V> next; // The next object in the list is int hash; Entry(int h, K k, V v, Entry<K,V> n) { value = v; next = n; key = k; hash = h; } public final int hashCode() {return objects.hashCode (getKey()) ^ Objects.hashCode(getValue()); } public final String toString() {// Return getKey() + "=" + getValue(); }... Omit part of the source}Copy the code

The most complex constructor

Look at the most complete constructor, other overloaded constructors do not write, just default method arguments, see

Public HashMap(int initialCapacity, float loadFactor) {// Check two values, If (initialCapacity < 0) throw new IllegalArgumentException("Illegal Initial Capacity: " + initialCapacity); if (initialCapacity > MAXIMUM_CAPACITY) initialCapacity = MAXIMUM_CAPACITY; if (loadFactor <= 0 || Float.isNaN(loadFactor)) throw new IllegalArgumentException("Illegal load factor: " + loadFactor); this.loadFactor = loadFactor; // Initialize load factor threshold = initialCapacity; // Initialize threshold init(); // Empty method, ignore}Copy the code

public V put(K key, V value)

The table constructor is not initialized. It’s actually initialized in the put method

Public V put(K key, V value) {if (table == EMPTY_TABLE) {// inflateTable(threshold); } if (key == null)// Return putForNullKey(value); int hash = hash(key); // Get hashCode int I = indexFor(hash, table.length); For (Entry<K,V> e = table[I]; e ! = null; E = e.ext) {// Iterate over the list Object k at the beginning of the specified array; If (e.h ash = = hash && ((k = e.k ey) = = key | | key. Equals (k))) {/ / e.h ash by addEntry source discovery is the key of hash value / / key hash value, If the keys are equal, the same key already exists in the linked list, and you need to update the oldValue to the new value V oldValue = e.value; e.value = value; e.recordAccess(this); return oldValue; } } modCount++; // If table[I]=null or no key is found in the linked list, add a key-value object to the list at table[I]. return null; }Copy the code

Initialize the table array

Private void inflateTable(int toSize) {// Find a power of 2 >= toSize // Int capacity = roundUpToPowerOf2(toSize); roundUpToPowerOf2(toSize); roundUpToPowerOf2(toSize); //threshold Calculate the true value by capacity * loadFactor threshold = (int) math. min(capacity * loadFactor, MAXIMUM_CAPACITY + 1); // Create an array of tables, default to null table = new Entry[capacity]; initHashSeedAsNeeded(capacity); }Copy the code

How do you compute 2 to the n?

private static int roundUpToPowerOf2(int number) { // assert number >= 0 : "number must be non-negative"; return number >= MAXIMUM_CAPACITY ? MAXIMUM_CAPACITY : (number > 1) ? Integer.highestOneBit((number - 1) << 1) : 1; HighestOneBit = Integer; highestOneBit = Integer; highestOneBit = Integer; Integer.highestonebit ((number - 1) << 1));Copy the code

Add the key=null element

When the key of put =null, the processing is a little different

Private V putForNullKey(V value) {private V putForNullKey(V value) {private V putForNullKey(V value) {private V putForNullKey(V value) { If the first position in the array is null, add a key-value pair with key= NULL and the value is value, and return null for (Entry<K,V> e = table[0]; e ! = null; e = e.next) { if (e.key == null) { V oldValue = e.value; e.value = value; e.recordAccess(this); return oldValue; } } modCount++; addEntry(0, null, value, 0); // Return null when table[0]==null; }Copy the code

Adds a key-value to the specified array position

void addEntry(int hash, K key, V value, int bucketIndex) { if ((size >= threshold) && (null ! = table[bucketIndex])) {resize(2 * table.length); hash = (null ! = key) ? hash(key) : 0; bucketIndex = indexFor(hash, table.length); } // Add a key-value object createEntry(hash, key, value, bucketIndex) at the specified array position; }Copy the code

Void createEntry(int hash, K key, V value, int bucketIndex) {void createEntry(int hash, K key, V value, int bucketIndex) {// Table [bucketIndex]; table[bucketIndex] = new Entry<>(hash, key, value, e); size++; }Copy the code

Calculates the array position based on the key’s hashCode and the array’s capacity

static int indexFor(int h, int length) { // assert Integer.bitCount(length) == 1 : "length must be a non-zero power of 2"; Return h & (length-1); // Return h%(length-1); }Copy the code

Array capacity

Now look again at the implementation of the addEntry method

void addEntry(int hash, K key, V value, int bucketIndex) { if ((size >= threshold) && (null ! = table[bucketIndex])) {resize(2 * table.length); hash = (null ! = key) ? hash(key) : 0; bucketIndex = indexFor(hash, table.length); } // Add an element to the list. See createEntry(Hash, key, value, bucketIndex); }Copy the code

Array table is doubled in size

void resize(int newCapacity) { Entry[] oldTable = table; int oldCapacity = oldTable.length; if (oldCapacity == MAXIMUM_CAPACITY) { threshold = Integer.MAX_VALUE; return; } // Create an array, newCapacity =2 * table.length, double the size of the original array Entry[] newTable = new Entry[newcapacity]; Transfer (newTable, initHashSeedAsNeeded(newCapacity)); //table refers to the expanded array, that is, the new array to replace the old array table = newTable; // Recalculate threshold = (int) math. min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1); }Copy the code

The real expansion, but also the way of head plug

void transfer(Entry[] newTable, boolean rehash) { int newCapacity = newTable.length; For (Entry<K,V> e: table) {// Iterate over each array position while(null! // The next object reference to which the current k-v object points is temporarily saved to next Entry< k, v > next = e.next; // Ignore if (rehash) {e.hash = null == e.key? 0 : hash(e.key); } // Calculate the position in the new array after the expansion int I = indexFor(e.hash, newCapacity); // The current k-v object points to the next k-v object. If newTable[I] is null, it points to null. Otherwise point to the k-v object e.ext = newTable[I]; NewTable [I] = e; newTable[I] = e; // Move the reference pointer e to the next node e = next; }}}Copy the code

public V get(Object key)

Next, parse the GET method, which is fairly simple

Public V get(Object key) {// Key =null return getForNullKey(); //key! Null Query Entry<K,V> Entry = getEntry(key); return null == entry ? null : entry.getValue(); }Copy the code

Query value where key=null

private V getForNullKey() { if (size == 0) { return null; } for (Entry<K,V> e = table[0]; e ! = null; e = e.next) { if (e.key == null) return e.value; } return null; }Copy the code

Query is the key! = null value

final Entry<K,V> getEntry(Object key) { if (size == 0) { return null; } int hash = (key == null) ? 0 : hash(key); for (Entry<K,V> e = table[indexFor(hash, table.length)]; e ! = null; E = e.next) {// Locate the array by the key hash and iterate over the list Object k represented by the array; if (e.hash == hash && ((k = e.key) == key || (key ! = null &&key. equals(k))))) } return null; }Copy the code

Thread safety

Review key steps

From the above analysis, it can be seen that there is an expansion operation transfer in the PUT method. Now take out a few steps:

for (Entry<K,V> e : table) {
    while(null != e) {
        Entry<K,V> next = e.next;
        int i = indexFor(e.hash, newCapacity);
        e.next = newTable[i];
        newTable[i] = e;
        e = next;
    }
}
Copy the code

Analyze a thread-safe situation

Thread-safe situations are not limited to this.

Suppose you have two threads, T1 and T2, simultaneously rehash the first linked list element of table[I] into the new array.

Analysis 1: T1 run newTable[I] = e; NewTable [I] =null; newTable[I] =null Thread T1 is stopped;

Analysis 2: T2 start from Entry<K,V> next = e.next; Next =null after Entry<K,V> next= e.next = newTable[I]; E.next points to itself, e, forming a loop; E = next; When e==null, the while exits and the next for loop begins

As can be seen from the above, circular linked lists will appear after expansion. Then, by analyzing the GET method before, it can be seen that the condition of the for loop in GET is always met, but the condition of if is always not met, there will be an infinite loop of query, and the GET method cannot be returned all the time, which may lead to 100% CPU

The capacity has to be two to the NTH, right?

Yeah, yeah

Two reasons

Make elements as evenly distributed as possible to reduce hash collisions
Take advantage of binary computing efficiency. H %(length-1) is equal to h & (Length-1) at 2 to the n, and bitwise sum is more efficient