One, what is a HashMap
In Java, Map is the interface for collections that store key-value pairs <key, value>; a hash table is the classic way to implement it, and HashMap is the most commonly used implementation class of Map.
A hash table offers efficient storage and fast lookup. However, HashMap is not thread-safe, and keys in a Map cannot be duplicated.
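A minimal sketch of these basics (class and variable names are my own): duplicate keys overwrite the old value rather than creating a second entry, and a missing key returns null.

```java
import java.util.HashMap;
import java.util.Map;

public class BasicsDemo {
    public static void main(String[] args) {
        Map<String, Integer> ages = new HashMap<>();
        ages.put("alice", 30);
        ages.put("bob", 25);
        // Keys cannot be duplicated: putting the same key again overwrites the old value
        ages.put("alice", 31);

        System.out.println(ages.get("alice")); // 31
        System.out.println(ages.size());       // 2, not 3
        // Looking up a key that is not present returns null
        System.out.println(ages.get("carol")); // null
    }
}
```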
Two, implementation principle
HashMap is backed by an array of buckets. Each bucket stores its key-value pairs as a linked list of Node<K, V> entries; under certain conditions (see the thresholds below), a bucket is converted to a red-black tree of TreeNode<K, V> entries.
Three, basic attributes
// The default initial capacity - MUST be a power of two (16)
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4;
// The maximum capacity, used if a higher value is implicitly specified by either of the constructors with arguments; MUST be a power of two <= 1 << 30
static final int MAXIMUM_CAPACITY = 1 << 30;
// The load factor used when none is specified in the constructor
static final float DEFAULT_LOAD_FACTOR = 0.75f;
// Treeify threshold: when a bucket already holds at least this many nodes, adding another converts its linked list to a red-black tree
// Must be greater than 2 and should be at least 8 to mesh with the assumptions in tree removal about converting back to a plain list on shrinkage
static final int TREEIFY_THRESHOLD = 8;
// Untreeify threshold: if a (split) bucket shrinks to this many nodes during a resize, its tree is converted back to a linked list
// Should be less than TREEIFY_THRESHOLD, and at most 6
static final int UNTREEIFY_THRESHOLD = 6;
// The smallest table capacity for which buckets may be treeified; if the table is smaller than 64, an overlong bucket triggers a resize instead of treeification
static final int MIN_TREEIFY_CAPACITY = 64;
// The hash table itself: an array of bucket head nodes, allocated lazily
transient Node<K, V>[] table;
// The number of key-value mappings in the map
transient int size;
// The number of structural modifications to the HashMap; used for fail-fast iteration
transient int modCount;
// The next size value at which to resize (capacity * load factor)
int threshold;
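A quick worked example of the formula above (class name is my own): with the default capacity of 16 and the default load factor of 0.75, the map resizes on the insertion that pushes size past 12.

```java
public class ThresholdDemo {
    public static void main(String[] args) {
        int capacity = 16;        // DEFAULT_INITIAL_CAPACITY
        float loadFactor = 0.75f; // DEFAULT_LOAD_FACTOR
        // threshold = capacity * load factor
        int threshold = (int) (capacity * loadFactor);
        // 16 * 0.75 = 12: the table doubles to 32 once size exceeds 12
        System.out.println(threshold); // 12
    }
}
```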
// The load factor for the hash table; controls how full the table may get before it is resized
final float loadFactor;
Four, constructors
public HashMap(int initialCapacity, float loadFactor) {
if (initialCapacity < 0)
throw new IllegalArgumentException("Illegal initial capacity: " + initialCapacity);
if (initialCapacity > MAXIMUM_CAPACITY)
initialCapacity = MAXIMUM_CAPACITY;
if (loadFactor <= 0 || Float.isNaN(loadFactor))
throw new IllegalArgumentException("Illegal load factor: " + loadFactor);
this.loadFactor = loadFactor;
// Not the real threshold yet: tableSizeFor() stores the rounded-up capacity here, and resize() recomputes the threshold as capacity * loadFactor when the table is first created
this.threshold = tableSizeFor(initialCapacity);
}
Five, common methods
/**
 * Computes the hash of the key, which is later used to pick the bucket index.
 * The high 16 bits of hashCode() are XORed into the low 16 bits ("perturbation"),
 * so the high bits also influence the bucket index; this spreads hashes better and
 * reduces collisions, because the low bits of many hashCodes are not very random.
 * XOR is used instead of & or |: & would bias the result bits toward 0 and | would
 * bias them toward 1, while XOR keeps the bit distribution uniform.
 * Example:
 *   String s = "Hello World!";
 *   s.hashCode()        = 1100 0110 0011 1100 1011 0110 0001 1101
 *   s.hashCode() >>> 16 = 0000 0000 0000 0000 1100 0110 0011 1100
 *   XOR result          = 1100 0110 0011 1100 0111 0000 0010 0001
 */
static final int hash(Object key) {
int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
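The worked example above can be checked directly. This sketch (class name is my own) re-implements the same perturbation outside of HashMap:

```java
public class HashDemo {
    // Same perturbation as HashMap.hash(Object)
    static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    public static void main(String[] args) {
        String s = "Hello World!";
        // hashCode is 0xC63CB61D; the high 16 bits get XORed into the low 16 bits
        System.out.println(Integer.toHexString(s.hashCode())); // c63cb61d
        System.out.println(Integer.toHexString(hash(s)));      // c63c7021
        // A null key always hashes to 0, which is why it lands in bucket 0
        System.out.println(hash(null)); // 0
    }
}
```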
/**
 * Returns the smallest power of two >= cap. The table size is always a power of two because:
 * 1. If length is a power of two, its binary form is 100...000 and (length - 1) is 011...111.
 * 2. When length is a power of two, hash % length == hash & (length - 1), and the bitwise
 *    AND is much cheaper than the modulo.
 * 3. If length were odd, (length - 1) would be even, so hash & (length - 1) would always
 *    have a zero low bit: every index would be even and half of the buckets would be wasted.
 */
static final int tableSizeFor(int cap) {
int n = cap - 1;
n |= n >>> 1;
n |= n >>> 2;
n |= n >>> 4;
n |= n >>> 8;
n |= n >>> 16;
    return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
}
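A sketch that copies the bit-smearing trick above (class name is my own) to show the rounding behavior: every bit below the highest set bit is filled with 1s, so n + 1 is the next power of two.

```java
public class TableSizeDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Same logic as HashMap.tableSizeFor
    static int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        System.out.println(tableSizeFor(10)); // 16: rounds up to the next power of two
        System.out.println(tableSizeFor(16)); // 16: exact powers of two are kept as-is
        System.out.println(tableSizeFor(17)); // 32
    }
}
```

The initial `cap - 1` is what keeps exact powers of two from being doubled unnecessarily.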
Getting a node
final Node<K, V> getNode(int hash, Object key) {
    Node<K, V>[] tab;
    Node<K, V> first, e;
    int n;
    K k;
    if ((tab = table) != null && (n = tab.length) > 0 && (first = tab[(n - 1) & hash]) != null) {
        // Always check whether the first node is the one we want
        if (first.hash == hash && ((k = first.key) == key || (key != null && key.equals(k))))
            return first;
        if ((e = first.next) != null) {
            // If the first node is a tree node, search the red-black tree
            if (first instanceof TreeNode)
                return ((TreeNode<K, V>) first).getTreeNode(hash, key);
            do {
                // Otherwise walk the linked list looking for the key
                if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k))))
                    return e;
                // Move to the next node in the list
            } while ((e = e.next) != null);
        }
    }
    return null;
}
// Find a node in the tree
final TreeNode<K, V> getTreeNode(int h, Object k) {
    return ((parent != null) ? root() : this).find(h, k, null);
}

// Get the root node of the tree
final TreeNode<K, V> root() {
    for (TreeNode<K, V> r = this, p;;) {
        // Only the root node has a null parent
        if ((p = r.parent) == null)
            return r;
        // Move up to the parent node
        r = p;
    }
}

// Search the tree
final TreeNode<K, V> find(int h, Object k, Class<?> kc) {
    TreeNode<K, V> p = this; // Start from the current node
    do {
        int ph, dir; // Hash of the current node; comparison direction
        K pk;        // Key of the current node
        TreeNode<K, V> pl = p.left, pr = p.right, q; // Children of the current node; q holds a recursive result
        // The current node's hash is greater than the target hash: move into the left subtree
        if ((ph = p.hash) > h)
            p = pl;
        // The current node's hash is less than the target hash: move into the right subtree
        else if (ph < h)
            p = pr;
        // Same hash and equal key: found it
        else if ((pk = p.key) == k || (k != null && k.equals(pk)))
            return p;
        // Equal hash but unequal key: if the left subtree is empty, move right
        else if (pl == null)
            p = pr;
        // If the right subtree is empty, move left
        else if (pr == null)
            p = pl;
        // If the key is Comparable, compare it with the current node's key to pick a direction
        else if ((kc != null || (kc = comparableClassFor(k)) != null) && (dir = compareComparables(kc, k, pk)) != 0)
            p = (dir < 0) ? pl : pr; // dir < 0: go left; dir > 0: go right
        // We cannot decide a direction with Comparable, so recursively search the right subtree
        else if ((q = pr.find(h, k, kc)) != null)
            return q;
        else
            // Not found on the right: continue down the left subtree
            p = pl;
    } while (p != null);
    return null;
}
static int compareComparables(Class<?> kc, Object k, Object x) {
    return (x == null || x.getClass() != kc ? 0 : ((Comparable) k).compareTo(x));
}
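The list and tree walks above always fall back to equals() once hashes match. A sketch with a deliberately colliding key type (the BadKey class is mine, for illustration only): both keys land in the same bucket, yet lookup still distinguishes them.

```java
import java.util.HashMap;
import java.util.Map;

public class CollisionDemo {
    // Every BadKey has the same hash, so all entries share one bucket
    static final class BadKey {
        final String name;
        BadKey(String name) { this.name = name; }
        @Override public int hashCode() { return 42; }
        @Override public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).name.equals(name);
        }
    }

    public static void main(String[] args) {
        Map<BadKey, Integer> m = new HashMap<>();
        m.put(new BadKey("a"), 1);
        m.put(new BadKey("b"), 2);
        // Both keys collide, but equals() still tells them apart inside the bucket
        System.out.println(m.get(new BadKey("a"))); // 1
        System.out.println(m.get(new BadKey("b"))); // 2
        System.out.println(m.size());               // 2
    }
}
```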
Inserting a node
final V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) {
    Node<K, V>[] tab;
    Node<K, V> p;
    int n, i;
    /*
     * 1. If the table is empty, create it via resize(); the table is therefore
     *    initialized lazily, on the first call to put.
     */
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    /*
     * 2. If the target bucket is empty, create a new node there. Otherwise there is a
     *    hash conflict: a node already occupies that bucket, and we must check whether
     *    the bucket holds a linked list or a red-black tree.
     */
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
    else {
        Node<K, V> e; // The existing node with the same key, if any
        K k;
        // a. The first node in the bucket has the same key: remember it so its value can be replaced
        if (p.hash == hash && ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        // b. The bucket is a red-black tree: insert or update within the tree
        else if (p instanceof TreeNode)
            e = ((TreeNode<K, V>) p).putTreeVal(this, tab, hash, key, value);
        // c. The bucket is a linked list: walk it looking for the key; if the key is absent,
        //    append a new node at the tail, then treeify the bucket if it grew past the threshold
        else {
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) {
                    // Append to the tail of the list
                    p.next = newNode(hash, key, value, null);
                    // Treeify if the list now holds TREEIFY_THRESHOLD nodes
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash);
                    break;
                }
                // Found a node with the same key: stop searching
                if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                // Move to the next node in the list
                p = e;
            }
        }
        // Replace the old value of an existing node with the new value
        if (e != null) { // existing mapping for key
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            // Empty by default; a hook for LinkedHashMap to maintain its ordering
            afterNodeAccess(e);
            return oldValue;
        }
    }
    // One more structural modification, used for fail-fast iteration
    ++modCount;
    // Resize if the number of nodes exceeds the threshold
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}
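The return value and the onlyIfAbsent flag of putVal are visible through the public API (class name below is my own): put returns the value it replaced, and putIfAbsent only writes when the key is absent or mapped to null.

```java
import java.util.HashMap;
import java.util.Map;

public class PutDemo {
    public static void main(String[] args) {
        Map<String, String> m = new HashMap<>();
        // First insertion: no previous mapping, so put returns null
        System.out.println(m.put("k", "v1"));         // null
        // Same key again: the old value is replaced and returned
        System.out.println(m.put("k", "v2"));         // v1
        // putIfAbsent (onlyIfAbsent == true) does not overwrite a non-null value
        System.out.println(m.putIfAbsent("k", "v3")); // v2
        System.out.println(m.get("k"));               // v2
    }
}
```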
final Node<K, V>[] resize() {
    // The table before resizing
    Node<K, V>[] oldTab = table;
    // The old table length
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    // The resize threshold before resizing
    int oldThr = threshold;
    int newCap, newThr = 0;
    if (oldCap > 0) {
        // If the old capacity has already reached the maximum, give up and return the old table
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return oldTab;
        }
        // Otherwise double the capacity; if the old capacity is at least the default,
        // the new threshold is simply double the old threshold
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY && oldCap >= DEFAULT_INITIAL_CAPACITY)
            newThr = oldThr << 1; // double threshold
    } else if (oldThr > 0)
        // The table is not created yet, but the constructor stored an initial capacity
        // in threshold: use it as the new capacity
        newCap = oldThr;
    else { // zero initial threshold signifies using defaults
        // Default initialization
        newCap = DEFAULT_INITIAL_CAPACITY;
        // The new threshold is the default capacity * the default load factor
        newThr = (int) (DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    // If the new threshold has not been set yet, compute it as newCap * loadFactor,
    // capped at Integer.MAX_VALUE once the capacity reaches the maximum
    if (newThr == 0) {
        float ft = (float) newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float) MAXIMUM_CAPACITY ? (int) ft : Integer.MAX_VALUE);
    }
    // Install the new threshold
    threshold = newThr;
    // Create the new table and install it
    @SuppressWarnings({"rawtypes", "unchecked"})
    Node<K, V>[] newTab = (Node<K, V>[]) new Node[newCap];
    table = newTab;
    if (oldTab != null) {
        // Move every bucket into the new table
        for (int j = 0; j < oldCap; ++j) {
            Node<K, V> e;
            if ((e = oldTab[j]) != null) {
                // Clear the old slot so it can be garbage collected
                oldTab[j] = null;
                // A single node: just recompute its index in the new table
                if (e.next == null)
                    newTab[e.hash & (newCap - 1)] = e;
                // A tree bucket: split it, untreeifying the halves if they shrink enough
                else if (e instanceof TreeNode)
                    ((TreeNode<K, V>) e).split(this, newTab, j, oldCap);
                else {
                    // A linked list: split it into a "low" list that stays at index j
                    // and a "high" list that moves to index j + oldCap
                    Node<K, V> loHead = null, loTail = null;
                    Node<K, V> hiHead = null, hiTail = null;
                    Node<K, V> next;
                    do {
                        next = e.next;
                        // (e.hash & oldCap) == 0: the node keeps its original index
                        if ((e.hash & oldCap) == 0) {
                            if (loTail == null)
                                loHead = e;
                            else
                                // Tail insertion preserves the original order
                                loTail.next = e;
                            // Advance the tail pointer
                            loTail = e;
                        } else {
                            // The node moves to index j + oldCap
                            if (hiTail == null)
                                hiHead = e;
                            else
                                hiTail.next = e;
                            // Advance the tail pointer
                            hiTail = e;
                        }
                    } while ((e = next) != null);
                    // Place the low list at the original index
                    if (loTail != null) {
                        loTail.next = null;
                        newTab[j] = loHead;
                    }
                    // Place the high list at index j + oldCap
                    if (hiTail != null) {
                        hiTail.next = null;
                        newTab[j + oldCap] = hiHead;
                    }
                }
            }
        }
    }
    return newTab;
}
Six, comparison with earlier versions (JDK 1.7 vs JDK 1.8)
1. Hash value calculation method
// JDK 1.7 (simplified; the real method also mixes in hashSeed)
final int hash(Object k) {
    int h = 0;
    h ^= k.hashCode();
    h ^= (h >>> 20) ^ (h >>> 12);
    return h ^ (h >>> 7) ^ (h >>> 4);
}
// JDK 1.8
static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
As shown above, 1.7 performs 5 XORs and 4 unsigned right shifts in total, while 1.8 performs only 1 right shift and 1 XOR on top of hashCode(), so 1.8 computes the hash more cheaply
2. Different data structures are stored
1.7 resolves hash conflicts with array + linked list, and inserts new nodes at the head of the list
1.8 resolves hash conflicts with array + linked list + red-black tree, and inserts new nodes at the tail of the list
The 1.7 implementation can also create a circular linked list
When the problem occurs:
When multiple threads put at the same time and concurrently trigger resize(), the rehash can produce a circular linked list, and any later get() on that bucket then spins in an infinite loop
Why the problem occurs:
The 1.7 rehash moves nodes with head insertion, which reverses each list. If one thread is suspended in the middle of the transfer while another thread completes the resize, the suspended thread resumes with stale next pointers and can link an already-moved node back to its predecessor, forming a cycle.
3. Capacity expansion mechanisms are different
1.7 recalculates the index of every node from scratch during a resize
1.8 exploits the power-of-two capacity: a node either stays at its current index (when (hash & oldCap) == 0) or moves to current index + oldCap, so no full recalculation is needed
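The stay-or-move rule can be checked numerically (class name is my own): when the capacity doubles, the new index mask gains exactly one bit, so (hash & (newCap - 1)) equals either the old index or the old index + oldCap, decided by the single bit (hash & oldCap).

```java
public class SplitDemo {
    public static void main(String[] args) {
        int oldCap = 16, newCap = 32;
        for (int hash : new int[]{5, 21, 37, 100}) {
            int oldIdx = hash & (oldCap - 1);
            int newIdx = hash & (newCap - 1);
            // The new index depends only on one extra bit of the hash
            int predicted = ((hash & oldCap) == 0) ? oldIdx : oldIdx + oldCap;
            // newIdx always equals predicted: e.g. 21 -> 5 + 16 = 21, 37 -> stays at 5
            System.out.println(hash + ": " + oldIdx + " -> " + newIdx + " (predicted " + predicted + ")");
        }
    }
}
```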