WeChat official account: byte array. Hope this helps you 🤣🤣

This series of articles walks through several common containers in the Java and Android collection frameworks (JDK 1.8, Android SDK 30), using the source code to explain each container's underlying data structure, typical application scenarios, and relative strengths and weaknesses. Hope this helps you 🤣🤣

I. HashMap

HashMap is a container for storing key-value pairs. It is a non-synchronized (not thread-safe) implementation of the Map interface based on a hash table. A null key is allowed, duplicate keys are not (inserting with an existing key overwrites the old value), and duplicate values are allowed
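A minimal usage sketch of these rules (class and key names are mine):

    import java.util.HashMap;
    import java.util.Map;

    public class HashMapBasics {
        public static void main(String[] args) {
            Map<String, String> map = new HashMap<>();
            map.put(null, "a");            // a null key is allowed
            map.put("language", "Java");
            map.put("language", "Kotlin"); // duplicate key: the old value is overwritten
            map.put("platform", "Kotlin"); // duplicate values are allowed
            System.out.println(map.get("language")); // Kotlin
            System.out.println(map.size());          // 3
        }
    }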

A HashMap is actually a combination of an array, linked lists, and red-black trees. The underlying structure is an array in which each element can take four forms: null, a single node, a linked list, or a red-black tree (red-black trees were introduced in JDK 1.8 to speed up lookups). When an element is put into a HashMap, the hash value of its key determines its position in the array (the array index). If other elements are already stored at that position, the elements at that position are kept as a linked list or a red-black tree; otherwise the element is stored directly at that position. HashMap therefore requires keys to be immutable, that is, the hash value of a key must not change, otherwise the key cannot be located on subsequent accesses

1. Hash

A hash algorithm transforms an input of arbitrary size into a fixed-length output, called the hash value. Different inputs may hash to the same output, so a unique input cannot be recovered from a hash value, but the hash value can be used as a characteristic (a fingerprint) of the input object

The role of hashing can be illustrated with an example. Suppose you have a thousand words and need to find the index of the word "hello". The most intuitive approach is to store the words in an array of length one thousand and iterate over it; in the worst case that takes a thousand comparisons, and the more words there are, the more array space and the more comparisons are needed on average. To save memory and reduce traversals, we can instead compute the hash value of each word, map the hash value to an index in an array of length one hundred, and store the word at that index. If the hash algorithm is good enough, the hash values of different words are very random, so a thousand words spread evenly over the array: ideally each array slot holds only ten words, and those ten words can be chained together in a linked list or another data structure. To find "hello" we then only compute its index and traverse the roughly ten words at that position. If the array is large enough and the hash algorithm yields sufficiently uniform indexes, the best case needs only a single lookup and the worst case only searches the words at one position, greatly reducing the number of traversals
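A rough sketch of this idea (the class, bucket count, and method names are all hypothetical, not from the JDK):

    import java.util.ArrayList;
    import java.util.List;

    public class WordBuckets {

        private static final int BUCKET_COUNT = 100;

        // one bucket of words per array index
        private final List<List<String>> buckets = new ArrayList<>();

        public WordBuckets() {
            for (int i = 0; i < BUCKET_COUNT; i++) {
                buckets.add(new ArrayList<>());
            }
        }

        // map the word's hash value to an index in [0, BUCKET_COUNT)
        private int indexFor(String word) {
            return Math.floorMod(word.hashCode(), BUCKET_COUNT);
        }

        public void add(String word) {
            buckets.get(indexFor(word)).add(word);
        }

        // only the words in one bucket need to be traversed
        public boolean contains(String word) {
            for (String w : buckets.get(indexFor(word))) {
                if (w.equals(word)) return true;
            }
            return false;
        }
    }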

Inside a HashMap, a hash algorithm is used to position elements. Since the hash algorithm may produce the same output for different inputs, and the array cannot be infinitely large, it is inevitable that multiple elements end up at the same array position; this is called a hash collision. In addition, a HashMap guarantees neither storage order nor iteration order: it rehashes its elements as needed, reshuffling their positions, so both orders can change over time. Finally, HashMap is not thread-safe; concurrent writes from multiple threads can corrupt data and even deadlock threads
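If thread safety is required, wrap the map or use a concurrent implementation instead; a minimal sketch:

    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    public class SafeMaps {
        public static void main(String[] args) {
            // every method synchronized on a single lock
            Map<String, Integer> synced = Collections.synchronizedMap(new HashMap<>());
            // finer-grained internal locking, usually the better choice under contention
            Map<String, Integer> concurrent = new ConcurrentHashMap<>();
            synced.put("a", 1);
            concurrent.put("b", 2);
        }
    }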

2. Class declaration

    public class HashMap<K,V> extends AbstractMap<K,V>
            implements Map<K,V>, Cloneable, Serializable

3. Constants

The main global constants in HashMap are the following

    // Default capacity of the hash bucket array
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4;

    // Maximum capacity of the hash bucket array
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Default load factor
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    // For efficiency, when the length of a linked list exceeds this value, the list is converted to a red-black tree
    static final int TREEIFY_THRESHOLD = 8;

    // When the size of a red-black tree falls below this value, the tree is converted back to a linked list
    static final int UNTREEIFY_THRESHOLD = 6;

The load factor specifies the maximum proportion of the array capacity that may be occupied before the array is automatically expanded: when the number of stored entries reaches this proportion of the capacity, the array grows. The load factor thus measures how full the hash table is allowed to get. A larger load factor means a more heavily loaded table: space utilization is higher, but for buckets that use linked lists the average lookup time is O(1+a), where a is the average bucket length, so lookups get slower. A smaller load factor leaves the array sparser: space utilization drops while lookup efficiency improves

The default load factor is DEFAULT_LOAD_FACTOR, i.e. 0.75, which is the result of balancing space utilization against lookup efficiency. In practice, if memory is plentiful and time efficiency matters more, you can lower the load factor; if memory is tight and time efficiency matters less, you can raise it
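As a quick illustration (class and variable names are mine), with the defaults the threshold works out to 16 × 0.75 = 12, and the two-argument constructor lets you shift the trade-off either way:

    import java.util.HashMap;
    import java.util.Map;

    public class LoadFactorDemo {
        public static void main(String[] args) {
            // with the defaults, threshold = 16 * 0.75 = 12,
            // so inserting the 13th entry triggers a resize
            Map<String, Integer> byDefault = new HashMap<>();

            // trade time for space: a denser table that resizes later
            Map<String, Integer> dense = new HashMap<>(16, 0.9f);

            // trade space for time: a sparser table that resizes earlier
            Map<String, Integer> sparse = new HashMap<>(16, 0.5f);
        }
    }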

In addition, even with a well-designed load factor and hash algorithm, hash collisions can still make individual lists too long, which hurts HashMap performance. To mitigate this, JDK 1.8 introduced the red-black tree: when a list grows beyond TREEIFY_THRESHOLD it is converted to a red-black tree, whose properties are then used to improve lookup performance

4. Variables

    // Hash bucket array, initialized on first use
    // Its capacity is always a power of 2
    transient Node<K, V>[] table;

    // Holds the cached entrySet(); note that AbstractMap fields are used for keySet() and values()
    transient Set<Map.Entry<K, V>> entrySet;

    // The number of key-value pairs in the Map
    transient int size;

    // Incremented on every structural change to the Map
    // Iterators check this value while iterating over the Map;
    // if it changes, the Map was structurally modified during iteration, and an exception is thrown
    transient int modCount;

    // When size reaches this value, the array is expanded
    // Computed as: current capacity x load factor
    int threshold;

    // The load factor in use
    final float loadFactor;

5. Constructors

    // Specify both the initial capacity and the load factor
    public HashMap(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " + initialCapacity);
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " + loadFactor);
        this.loadFactor = loadFactor;
        this.threshold = tableSizeFor(initialCapacity);
    }

    // Specify the initial capacity only
    public HashMap(int initialCapacity) {
        this(initialCapacity, DEFAULT_LOAD_FACTOR);
    }

    // Use the default values
    public HashMap() {
        this.loadFactor = DEFAULT_LOAD_FACTOR;
    }

    // Initialize with existing data
    public HashMap(Map<? extends K, ? extends V> m) {
        this.loadFactor = DEFAULT_LOAD_FACTOR;
        putMapEntries(m, false);
    }

6. Insert key-value pairs

As mentioned above, a HashMap combines an array, linked lists, and red-black trees. Each element of the array takes one of four forms: null, a single node, a linked list, or a red-black tree

Each key-value pair to be inserted is wrapped as a Node object, whose position in the array is determined by the hash value of the key. If the computed position holds no value, the Node is placed there directly. If it does hold a value, a hash collision has occurred and the Node must be inserted into the linked list or red-black tree at that position. If the key equals the key of an existing Node in that list or tree (the hash values are equal and equals() returns true), the new value overwrites the old one

The more dispersed and uniform the results of the hash algorithm, the lower the probability of hash collisions and the higher the access efficiency of the HashMap

The Node class is declared as follows

    static class Node<K,V> implements Map.Entry<K,V> {

        // Hash value of the key
        final int hash;
        final K key;
        V value;
        // The next node
        Node<K,V> next;

        Node(int hash, K key, V value, Node<K,V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }

        public final K getKey()        { return key; }
        public final V getValue()      { return value; }
        public final String toString() { return key + "=" + value; }

        public final int hashCode() {
            return Objects.hashCode(key) ^ Objects.hashCode(value);
        }

        public final V setValue(V newValue) {
            V oldValue = value;
            value = newValue;
            return oldValue;
        }

        public final boolean equals(Object o) {
            if (o == this)
                return true;
            if (o instanceof Map.Entry) {
                Map.Entry<?,?> e = (Map.Entry<?,?>) o;
                if (Objects.equals(key, e.getKey()) &&
                    Objects.equals(value, e.getValue()))
                    return true;
            }
            return false;
        }
    }

The method for inserting a key-value pair is put(K key, V value)

    public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);
    }

    // Compute the hash value of the key
    static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

The putVal method is fairly complicated because it has to handle all of the following situations:

  1. If the table has not been initialized or its capacity is 0, initialize and expand the table
  2. Determine whether there is a hash collision
  3. If there is no hash collision, store the key-value pair directly at the computed position
  4. If there is a hash collision, add the key-value pair to the red-black tree or linked list at that position, converting the list to a red-black tree once it reaches the treeify threshold
  5. If a node with an equal key already exists, decide whether to overwrite the old value
  6. Invoke the hook methods reserved for LinkedHashMap
  7. After the key-value pair is stored, expand the table if necessary
    /**
     * @param hash         hash for key
     * @param key          the key
     * @param value        the value to put
     * @param onlyIfAbsent if true, an existing non-null value for an equal key is not overwritten; otherwise the original value is overwritten
     * @param evict        if false, the table is in creation mode
     * @return previous value, or null if none
     */
    final V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) {
        Node<K, V>[] tab;
        Node<K, V> p;
        int n, i;
        // If the table is not initialized or has a capacity of 0, call resize to initialize it
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;

        // Determine whether the key to be stored has a hash collision
        // p points to the array slot where the key-value pair is expected to be stored
        // If p is null, there is no collision
        if ((p = tab[i = (n - 1) & hash]) == null)
            // Build a node containing the element to be stored directly at index i
            tab[i] = newNode(hash, key, value, null);

        else { // Entering this branch means the key to be stored has a hash collision

            Node<K, V> e;
            K k;
            // p was assigned in the previous if statement, so the key equality of the head node can be checked directly
            if (p.hash == hash && ((k = p.key) == key || (key != null && key.equals(k))))
                // Reaching here means the key of node p equals the key to be stored
                // This slot may hold a single node, a red-black tree, or a linked list;
                // e now points to the conflicting node,
                // i.e. the position where the key is stored has been found
                e = p;

            // The head node's key is not equal and the head node is a TreeNode,
            // so this slot uses a red-black tree to handle hash collisions
            else if (p instanceof TreeNode)
                // If no equal key exists in the tree, insert and return null; otherwise do not insert and return the node with the equal key
                e = ((TreeNode<K, V>) p).putTreeVal(this, tab, hash, key, value);

            else { // This slot currently uses a linked list to handle hash collisions
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        // Reaching here means the end of the list was reached and no node in the list has an equal key
                        // Append the new node to the end of the list
                        p.next = newNode(hash, key, value, null);
                        // If the list length has reached the treeify threshold, convert the list to a red-black tree
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k))))
                        // Found the node with the equal key: e
                        break;
                    p = e;
                }
            }

            // e != null means a key-value pair with an equal key already exists;
            // decide whether to overwrite its value
            if (e != null) {
                V oldValue = e.value;
                // Overwrite the original value if onlyIfAbsent is false or oldValue is null
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;

                // Hook for LinkedHashMap; empty implementation in HashMap
                afterNodeAccess(e);
                return oldValue;
            }
        }

        ++modCount;

        // Check whether expansion is needed
        if (++size > threshold)
            resize();

        // Hook for LinkedHashMap; empty implementation in HashMap
        afterNodeInsertion(evict);
        return null;
    }

7. Obtain value

The corresponding method for getting a value is get(Object key)

    public V get(Object key) {
        Node<K, V> e;
        return (e = getNode(hash(key), key)) == null ? null : e.value;
    }

    // Get the node by key
    final Node<K, V> getNode(int hash, Object key) {
        Node<K, V>[] tab;
        Node<K, V> first, e;
        int n;
        K k;
        // The key can only exist if the table is not empty and the slot for the hash is not null
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (first = tab[(n - 1) & hash]) != null) {
            if (first.hash == hash && ((k = first.key) == key || (key != null && key.equals(k))))
                // The key equals the head node's key, so the corresponding node is found
                return first;
            // e != null means a linked list or a red-black tree exists at this slot, so search it
            if ((e = first.next) != null) {
                if (first instanceof TreeNode) // red-black tree
                    return ((TreeNode<K, V>) first).getTreeNode(hash, key);
                do { // linked list
                    if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k))))
                        return e;
                } while ((e = e.next) != null);
            }
        }
        return null;
    }

8. Remove the node

Removing a key-value pair from a Map means removing the reference to the corresponding Node object from the underlying data structure, which may be the array itself, a red-black tree, or a linked list

    // If the key exists, return the corresponding value, otherwise return null
    public V remove(Object key) {
        Node<K, V> e;
        return (e = removeNode(hash(key), key, null, false, true)) == null ?
                null : e.value;
    }

    /**
     * @param value      the value expected for the key; only checked when matchValue is true, otherwise ignored
     * @param matchValue if true, remove the node only if both key and value match; otherwise an equal key is enough
     * @param movable    if false, do not move other nodes while removing
     * @return the node, or null if none
     */
    final Node<K, V> removeNode(int hash, Object key, Object value,
                                boolean matchValue, boolean movable) {
        Node<K, V>[] tab;
        Node<K, V> p;
        int n, index;
        // The key can only exist if the table is not empty and the slot for the hash is not null
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (p = tab[index = (n - 1) & hash]) != null) {
            Node<K, V> node = null, e;
            K k;
            V v;
            if (p.hash == hash && ((k = p.key) == key || (key != null && key.equals(k))))
                // The key equals the head node p, so the target node is found
                node = p;
            else if ((e = p.next) != null) { // a red-black tree or linked list exists
                if (p instanceof TreeNode) // red-black tree
                    node = ((TreeNode<K, V>) p).getTreeNode(hash, key);
                else { // linked list
                    do {
                        if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k)))) {
                            node = e;
                            break;
                        }
                        p = e;
                    } while ((e = e.next) != null);
                }
            }
            // node != null means a node with the given key exists
            // If matchValue is false, the node can be removed directly
            // If matchValue is true, the node is removed only when the values are also equal
            if (node != null && (!matchValue || (v = node.value) == value ||
                                 (value != null && value.equals(v)))) {
                if (node instanceof TreeNode) // red-black tree
                    ((TreeNode<K, V>) node).removeTreeNode(this, tab, movable);
                else if (node == p) // the target is the head node, so shift the slot to the next node
                    tab[index] = node.next;
                else // linked list
                    p.next = node.next;
                ++modCount;
                --size;
                // Hook for LinkedHashMap; empty implementation in HashMap
                afterNodeRemoval(node);
                return node;
            }
        }
        return null;
    }

9. Hash algorithm

When inserting, querying, or removing key-value pairs, locating the corresponding position in the hash bucket array is the crucial first step. Only when the elements of a HashMap are evenly distributed can every array slot be used effectively, avoiding frequent construction and traversal of linked lists or red-black trees, and that depends on a good hash algorithm

Here is how a HashMap computes the hash value of a key and derives the key's position in the hash bucket array from it

    static final int hash(Object key) {
        int h;
        // the high bits take part in the computation
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Get the value by key
    public V get(Object key) {
        Node<K, V> e;
        return (e = getNode(hash(key), key)) == null ? null : e.value;
    }

    // Find the specified node
    final Node<K, V> getNode(int hash, Object key) {
        ...
        // The element can only exist if the table is not empty and the slot for the hash is not null
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (first = tab[(n - 1) & hash]) != null) {
            ...
        }
        return null;
    }

(h = key.hashCode()) ^ (h >>> 16) can be broken down into three steps:

  • Obtain the key's hashCode via key.hashCode(), called h
  • Shift the high 16 bits of h into the low 16 bits via h >>> 16; the high 16 bits all become 0
  • XOR the two values: the high 16 bits of the result equal the high 16 bits of h, and the low 16 bits are the XOR of the high 16 bits and the low 16 bits of h

The index of the key's position in the hash bucket array is computed as (n - 1) & hash, where n is the capacity of the array. HashMap requires the array capacity to be a power of 2, i.e. n is 16, 32, 64, 128, and so on, so the binary representation of n - 1 is:

  • n is 16, n - 1 is 1111
  • n is 32, n - 1 is 11111
  • n is 64, n - 1 is 111111
  • n is 128, n - 1 is 1111111

As you can see, no matter what the hash value is, the index computed by (n - 1) & hash is always between 0 and n - 1 inclusive, which matches the valid range of array indexes. Moreover, because the hash value mixes in both the high and the low 16 bits of hashCode, it adds randomness on top of hashCode, making the index computed by (n - 1) & hash more random as well. Elements are thus distributed more evenly across the hash bucket array, and the probability of hash collisions falls
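Putting the two steps together, a minimal sketch (class name and sample key are mine) of how a key is mapped to an index:

    public class HashDemo {
        public static void main(String[] args) {
            int h = "hello".hashCode();   // 99162322
            int hash = h ^ (h >>> 16);    // mix the high 16 bits into the low 16 bits
            int n = 16;                   // array capacity, a power of 2
            int index = (n - 1) & hash;   // keeps only the low 4 bits, i.e. an index in [0, 15]
            System.out.println(index);
        }
    }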

10. Expansion

If the hash bucket array is large, elements stay well dispersed even with a poor hash algorithm; if it is very small, even a good hash algorithm produces many collisions. So there is a balance to strike between space cost and time cost: besides designing a good hash algorithm to reduce collisions, the hash bucket array also needs to be expanded at the right time

As the number of elements in a HashMap grows while the array capacity stays fixed, the probability of hash collisions rises. To keep the map efficient, the array must be expanded, and the data in the original array must be rehashed and migrated to the new array

When is expansion triggered? The array is expanded when the number of elements in the HashMap exceeds the threshold (array capacity multiplied by loadFactor). For example, with an array capacity of 16 and a loadFactor of 0.75, expansion is triggered automatically when the number of elements reaches 12: the array doubles to 2 × 16 = 32, and the position of every element in the new array is recomputed. This is a very performance-intensive operation, so if you already know how much data a HashMap will hold, specifying the initial capacity when constructing it is more efficient

By default the hash bucket array capacity is 16 and the loadFactor is 0.75, which balances space utilization against time efficiency
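For example, to hold roughly 1000 entries without any intermediate resize, a common idiom (the same one used by the HashSet constructor shown later) is to divide the expected size by the load factor; the class name and numbers here are mine:

    import java.util.HashMap;
    import java.util.Map;

    public class PresizedMap {
        public static void main(String[] args) {
            int expected = 1000;
            // pick a capacity large enough that expected <= capacity * 0.75,
            // so no resize happens while filling the map
            Map<Integer, String> map = new HashMap<>((int) (expected / 0.75f) + 1);
            for (int i = 0; i < expected; i++) {
                map.put(i, "value-" + i);
            }
        }
    }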

Both initializing and expanding the array are handled by the resize() method

    final Node<K, V>[] resize() {
        // The array before expansion
        Node<K, V>[] oldTab = table;
        // The capacity before expansion
        int oldCap = (oldTab == null) ? 0 : oldTab.length;
        // The current expansion threshold
        int oldThr = threshold;
        // The capacity and threshold after expansion
        int newCap, newThr = 0;
        if (oldCap > 0) {
            // oldCap > 0 means the table has been initialized; check whether it can be expanded

            // If the array has reached its maximum capacity, don't expand; just raise the threshold to Integer.MAX_VALUE
            if (oldCap >= MAXIMUM_CAPACITY) {
                threshold = Integer.MAX_VALUE;
                return oldTab;
            } else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY && oldCap >= DEFAULT_INITIAL_CAPACITY) {
                // Doubling the existing capacity stays below MAXIMUM_CAPACITY and the existing capacity is at least DEFAULT_INITIAL_CAPACITY
                // Double both the capacity and the threshold
                newThr = oldThr << 1;
            }

            // There is an implicit third case here:
            // doubling the existing capacity reaches MAXIMUM_CAPACITY (or oldCap is
            // smaller than DEFAULT_INITIAL_CAPACITY), so newThr stays 0 while newCap
            // keeps the doubled value, and newThr is computed further below.
            // Once the capacity reaches MAXIMUM_CAPACITY, the branch above prevents
            // any further expansion
        } else if (oldThr > 0) {
            // oldCap <= 0 && oldThr > 0
            // The table is not initialized yet and a constructor taking initialCapacity or a Map was called
            // In this case use threshold directly as the new capacity and recompute the expansion threshold below
            newCap = oldThr;
        } else {
            // oldCap <= 0 && oldThr <= 0
            // The table is not initialized yet and the no-argument constructor was called
            // Use the default capacity and compute the threshold with the default load factor
            newCap = DEFAULT_INITIAL_CAPACITY;
            newThr = (int) (DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
        }
        if (newThr == 0) {
            float ft = (float) newCap * loadFactor;
            // Compute the new threshold after expansion
            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float) MAXIMUM_CAPACITY ? (int) ft : Integer.MAX_VALUE);
        }
        threshold = newThr;
        @SuppressWarnings({"rawtypes", "unchecked"})
        Node<K, V>[] newTab = (Node<K, V>[]) new Node[newCap];
        table = newTab;
        // If the old array holds values, the data must be migrated to the new array
        if (oldTab != null) {
            for (int j = 0; j < oldCap; ++j) {
                Node<K, V> e;
                if ((e = oldTab[j]) != null) {
                    oldTab[j] = null;
                    // e.next == null means element e has no hash collision, so it can be moved directly
                    if (e.next == null)
                        // Compute the position of element e in the new array
                        newTab[e.hash & (newCap - 1)] = e;
                    else if (e instanceof TreeNode) // hash collisions exist, handled with a red-black tree
                        ((TreeNode<K, V>) e).split(this, newTab, j, oldCap);
                    else { // hash collisions exist, handled with a linked list
                        Node<K, V> loHead = null, loTail = null;
                        Node<K, V> hiHead = null, hiTail = null;
                        Node<K, V> next;
                        do {
                            next = e.next;
                            if ((e.hash & oldCap) == 0) {
                                if (loTail == null)
                                    loHead = e;
                                else
                                    loTail.next = e;
                                loTail = e;
                            } else {
                                if (hiTail == null)
                                    hiHead = e;
                                else
                                    hiTail.next = e;
                                hiTail = e;
                            }
                        } while ((e = next) != null);
                        if (loTail != null) {
                            loTail.next = null;
                            newTab[j] = loHead;
                        }
                        if (hiTail != null) {
                            hiTail.next = null;
                            newTab[j + oldCap] = hiHead;
                        }
                    }
                }
            }
        }
        return newTab;
    }

11. Efficiency test

Here we test how different initial capacities and different hashCode values affect the running efficiency of a HashMap

First, a Key class whose hashCode() method simply returns its value field

    public class Key {

        private int value;

        public Key(int value) {
            this.value = value;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            Key key = (Key) o;
            return value == key.value;
        }

        @Override
        public int hashCode() {
            return value;
        }
    }

The initial capacity grows by a factor of 10 from 20 to 20000; the same amount of data is stored into each HashMap, and the time taken is measured

    import java.util.HashMap;
    import java.util.Map;

    public class Test {

        private static final int MAX_KEY = 20000;

        private static final Key[] KEYS = new Key[MAX_KEY];

        static {
            for (int i = 0; i < MAX_KEY; i++) {
                KEYS[i] = new Key(i);
            }
        }

        private static void test(int size) {
            long startTime = System.currentTimeMillis();
            Map<Key, Integer> map = new HashMap<>(size);
            for (int i = 0; i < MAX_KEY; i++) {
                map.put(KEYS[i], i);
            }
            long endTime = System.currentTimeMillis();
            System.out.println("Initial capacity: " + size + ", time: " + (endTime - startTime) + " ms");
        }

        public static void main(String[] args) {
            for (int i = 20; i <= MAX_KEY; i *= 10) {
                test(i);
            }
        }
    }

In the example above, every Key object has a distinct hash value, so the key-value pairs are distributed very evenly across the hash bucket array. In this case the expansion mechanism is the main factor affecting HashMap performance, and as the log shows, the initial capacity has little impact

    Initial capacity: 20, time: 4 ms
    Initial capacity: 200, time: 3 ms
    Initial capacity: 2000, time: 4 ms
    Initial capacity: 20000, time: 2 ms

If the hashCode() method of the Key class is changed to always return 100, then every Key object collides inside the HashMap

    @Override
    public int hashCode() {
        return 100;
    }

Now the time needed to store the same amount of data grows enormously, showing how badly massive hash collisions hurt a HashMap

    Initial capacity: 20, time: 2056 ms
    Initial capacity: 200, time: 1902 ms
    Initial capacity: 2000, time: 1892 ms
    Initial capacity: 20000, time: 1865 ms

II. LinkedHashMap

A HashMap does not guarantee that its iteration order matches the order in which elements were stored; in other words, a HashMap is unordered. To address this, Java provides LinkedHashMap, an ordered HashMap

1. Class declaration

LinkedHashMap, a subclass of HashMap, preserves the order of its elements. It maintains a doubly linked list threading through all of its entries, ordered either by insertion (the default) or by access. Ordering by access can be used to implement the LRU cache eviction algorithm, since each access moves the element to the end of the list

    public class LinkedHashMap<K,V> extends HashMap<K,V>
            implements Map<K,V>

2. Node classes

Each key-value pair stored in a HashMap is wrapped as a Node object; in a LinkedHashMap it is wrapped as an Entry object, as the newNode method shows. The Entry class extends Node with two new member variables, before and after, which are the key to LinkedHashMap's ordered access: each time a new key-value pair is stored, the Entry is appended to the tail of the linked list via these two variables, preserving the order of the key-value pairs

Whether the HashMap resolves hash collisions at an Entry's slot with a linked list or with a red-black tree, these two variables are unaffected by changes in that structure. This points to a clever design aspect of the collection framework: instead of building a new map type from scratch to maintain insertion order, LinkedHashMap extends its parent class's functionality

    static class Entry<K,V> extends HashMap.Node<K,V> {
        // point to the previous and the next entry respectively
        Entry<K,V> before, after;
        Entry(int hash, K key, V value, Node<K,V> next) {
            super(hash, key, value, next);
        }
    }

    Node<K,V> newNode(int hash, K key, V value, Node<K,V> e) {
        LinkedHashMap.Entry<K,V> p = new LinkedHashMap.Entry<K,V>(hash, key, value, e);
        linkNodeLast(p);
        return p;
    }

    // The head (eldest) of the doubly linked list
    transient LinkedHashMap.Entry<K,V> head;

    // The tail (youngest) of the doubly linked list
    transient LinkedHashMap.Entry<K,V> tail;

    // Link the new entry at the end of the list
    private void linkNodeLast(LinkedHashMap.Entry<K,V> p) {
        LinkedHashMap.Entry<K,V> last = tail;
        tail = p;
        if (last == null)
            head = p;
        else {
            p.before = last;
            last.after = p;
        }
    }

3. Variables

The accessOrder variable determines how the elements of a LinkedHashMap are ordered: by access order if true, by insertion order if false

    // Serialization ID
    private static final long serialVersionUID = 3801124242820219131L;

    // Points to the head of the doubly linked list
    transient LinkedHashMap.Entry<K,V> head;

    // Points to the tail of the doubly linked list (the most recently inserted or accessed node)
    transient LinkedHashMap.Entry<K,V> tail;

    final boolean accessOrder;

4. Constructors

By default, a LinkedHashMap orders its elements by insertion order

    public LinkedHashMap(int initialCapacity, float loadFactor) {
        super(initialCapacity, loadFactor);
        accessOrder = false;
    }

    public LinkedHashMap(int initialCapacity) {
        super(initialCapacity);
        accessOrder = false;
    }

    public LinkedHashMap() {
        super();
        accessOrder = false;
    }

    public LinkedHashMap(Map<? extends K, ? extends V> m) {
        super();
        accessOrder = false;
        putMapEntries(m, false);
    }

    public LinkedHashMap(int initialCapacity, float loadFactor, boolean accessOrder) {
        super(initialCapacity, loadFactor);
        this.accessOrder = accessOrder;
    }
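A small sketch (class and key names are mine) contrasting the two orderings:

    import java.util.LinkedHashMap;
    import java.util.Map;

    public class OrderDemo {
        public static void main(String[] args) {
            // accessOrder = true: iteration follows access order
            Map<String, Integer> accessOrdered = new LinkedHashMap<>(16, 0.75f, true);
            accessOrdered.put("a", 1);
            accessOrdered.put("b", 2);
            accessOrdered.put("c", 3);
            accessOrdered.get("a"); // moves "a" to the tail of the list
            System.out.println(accessOrdered.keySet()); // [b, c, a]

            // default: iteration follows insertion order
            Map<String, Integer> insertionOrdered = new LinkedHashMap<>();
            insertionOrdered.put("a", 1);
            insertionOrdered.put("b", 2);
            insertionOrdered.put("c", 3);
            insertionOrdered.get("a"); // has no effect on the order
            System.out.println(insertionOrdered.keySet()); // [a, b, c]
        }
    }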

5. Reserved methods

HashMap reserves three empty methods, and its source comments state that they exist for LinkedHashMap

    // Callbacks to allow LinkedHashMap post-actions
    void afterNodeAccess(Node<K,V> p) { }
    void afterNodeInsertion(boolean evict) { }
    void afterNodeRemoval(Node<K,V> p) { }

afterNodeAccess is called when a node in the map is accessed (for example via the get method) and accessOrder is true; it moves the just-accessed entry to the end of the linked list. The operation is quite lightweight, since changing a node's position in a linked list only requires updating a few references

    public V get(Object key) {
        Node<K,V> e;
        if ((e = getNode(hash(key), key)) == null)
            return null;
        if (accessOrder)
            afterNodeAccess(e);
        return e.value;
    }

    // Called when node e is accessed
    // Moves node e to the end of the list
    void afterNodeAccess(Node<K,V> e) {
        // last points to the tail of the list
        LinkedHashMap.Entry<K,V> last;
        // The following steps are only needed when last and e differ; if they are equal, e is already the tail
        if (accessOrder && (last = tail) != e) {
            LinkedHashMap.Entry<K,V> p = (LinkedHashMap.Entry<K,V>) e, b = p.before, a = p.after;
            // p will become the tail, so its after is set to null
            p.after = null;
            // If b == null, p was the head of the list, and a becomes the new head
            // If b != null, remove b's reference to p and link it to a
            if (b == null)
                head = a;
            else
                b.after = a;
            // If a != null, p was not the tail; remove a's reference to p and link it to b
            // If a == null, p was the tail, so b becomes the new tail candidate
            if (a != null)
                a.before = b;
            else
                last = b;
            // If last == null, the list holds nothing but p, so p becomes the head
            // If last != null, append p after last
            if (last == null)
                head = p;
            else {
                p.before = last;
                last.after = p;
            }
            // The tail now references p
            tail = p;
            ++modCount;
        }
    }

When the put method is called, afterNodeInsertion is invoked; it decides whether to remove the least recently used entry, which is what makes building an LRU cache possible

    // This method supports removing the least recently used element, as the LRU cache algorithm requires
    void afterNodeInsertion(boolean evict) {
        LinkedHashMap.Entry<K,V> first;
        if (evict && (first = head) != null && removeEldestEntry(first)) {
            K key = first.key;
            removeNode(hash(key), key, null, false, true);
        }
    }

    // Determines whether to remove the eldest entry; returns false by default
    // Override this method to evict old data according to specific rules
    protected boolean removeEldestEntry(Map.Entry<K,V> eldest) {
        return false;
    }

When a node is removed from the HashMap, LinkedHashMap removes the node's references from the linked list it maintains via afterNodeRemoval

    // Called after node e is removed
    void afterNodeRemoval(Node<K,V> e) {
        LinkedHashMap.Entry<K,V> p = (LinkedHashMap.Entry<K,V>) e, b = p.before, a = p.after;
        // Remove p's references to its neighbouring nodes
        p.before = p.after = null;
        // If b == null, p was the head of the list, and a becomes the new head
        // If b != null, update the references between the nodes
        if (b == null)
            head = a;
        else
            b.after = a;
        // If a == null, p was the tail, so b (the second-to-last node) becomes the new tail
        // If a != null, update the references between the nodes
        if (a == null)
            tail = b;
        else
            a.before = b;
    }

6. LRUCache

In Android development, the LRU (least recently used) cache algorithm is very common. A typical use is caching Bitmaps in memory: decoding a Bitmap from an IO stream consumes a lot of resources, so to avoid reading the same image from disk repeatedly, Bitmaps are cached in memory. But memory is limited too, so not every image can be cached; only a certain number are selectively kept, and LRUCache is one of the most common eviction schemes

Here we take advantage of LinkedHashMap's ability to order elements by access to implement a cache with the LRU eviction policy

    import java.util.LinkedHashMap;
    import java.util.Map;
    import java.util.Set;

    public class LRUCache {

        private static class LRUCacheMap<K,V> extends LinkedHashMap<K,V> {

            // Maximum number of cached entries
            private final int maxCacheSize;

            public LRUCacheMap(int maxCacheSize) {
                super(16, 0.75f, true);
                this.maxCacheSize = maxCacheSize;
            }

            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxCacheSize;
            }
        }

        public static void main(String[] args) {
            // Cache at most 5 entries
            LRUCacheMap<String, Integer> map = new LRUCacheMap<>(5);
            map.put("Java", 1);
            map.put("Jetpack", 2);
            map.put("Kotlin", 3);
            map.put("Yip Chi Chan", 4);
            map.put("Byte array", 5);
            map.put("leaveC", 6);

            System.out.println();
            Set<String> keySet = map.keySet();
            // Output: Jetpack Kotlin Yip Chi Chan Byte array leaveC
            keySet.forEach(key -> System.out.print(key + " "));

            // Access the value at the head of the list, which moves it to the tail
            map.get("Jetpack");
            System.out.println();
            keySet = map.keySet();
            // Output: Kotlin Yip Chi Chan Byte array leaveC Jetpack
            keySet.forEach(key -> System.out.print(key + " "));

            // Add a new element, which evicts the eldest entry
            map.put("Dart", 5);
            System.out.println();
            // Output: Yip Chi Chan Byte array leaveC Jetpack Dart
            keySet.forEach(key -> System.out.print(key + " "));
        }
    }

III. HashSet

HashSet implements the Set interface: it does not allow duplicate elements, allows a null element, and does not guarantee iteration order. After walking through the HashMap source above, reading HashSet feels almost effortless

As we know, when a key-value pair with an existing key is put into a HashMap, the stored key is left unchanged and only the value may be overwritten. HashSet relies on exactly this behavior for its uniqueness guarantee. A HashSet contains a HashMap internally, and every value added to the HashSet is wrapped as a key-value pair stored in that HashMap: the key is the externally supplied value, and the value is a dummy object provided by the HashSet. If the key is new, the pair is stored normally; if it repeats, only the value is replaced, which yields HashSet's no-duplicates semantics
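From the caller's point of view this behaves as follows (a minimal sketch, class name mine):

    import java.util.HashSet;
    import java.util.Set;

    public class HashSetDemo {
        public static void main(String[] args) {
            Set<String> set = new HashSet<>();
            System.out.println(set.add("Java")); // true: a new element
            System.out.println(set.add("Java")); // false: the underlying put returned a non-null old value
            System.out.println(set.add(null));   // true: one null element is allowed
            System.out.println(set.size());      // 2
        }
    }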

Here is the source code in full

    public class HashSet<E> extends AbstractSet<E>
            implements Set<E>, Cloneable, java.io.Serializable {

        static final long serialVersionUID = -5024744406713321676L;

        // A HashMap is used to store the data
        // The key is the value passed in from outside; the value is maintained internally by HashSet
        private transient HashMap<E,Object> map;

        // All key-value pairs stored in the HashMap share this same object as their value
        private static final Object PRESENT = new Object();

        public HashSet() {
            map = new HashMap<>();
        }

        // Compute the HashMap's initial capacity from the collection size and the default load factor
        // The +1 compensates for the precision lost in the integer conversion
        public HashSet(Collection<? extends E> c) {
            map = new HashMap<>(Math.max((int) (c.size()/.75f) + 1, 16));
            addAll(c);
        }

        public HashSet(int initialCapacity, float loadFactor) {
            map = new HashMap<>(initialCapacity, loadFactor);
        }

        public HashSet(int initialCapacity) {
            map = new HashMap<>(initialCapacity);
        }

        // This constructor is package-private and exists only to support LinkedHashSet
        HashSet(int initialCapacity, float loadFactor, boolean dummy) {
            map = new LinkedHashMap<>(initialCapacity, loadFactor);
        }

        // Iterating over a HashSet becomes iterating over the HashMap's keys
        public Iterator<E> iterator() {
            return map.keySet().iterator();
        }

        public int size() {
            return map.size();
        }

        public boolean isEmpty() {
            return map.isEmpty();
        }

        public boolean contains(Object o) {
            return map.containsKey(o);
        }

        // If the HashMap contains no pair whose key is e, the element is added and true is returned
        // If it does, only the value is overwritten (the key is untouched) and false is returned
        // This is what makes HashSet elements unique
        public boolean add(E e) {
            return map.put(e, PRESENT) == null;
        }

        public boolean remove(Object o) {
            return map.remove(o) == PRESENT;
        }

        public void clear() {
            map.clear();
        }
    }

IV. LinkedHashSet

LinkedHashSet, whose source code amounts to only a few dozen lines, is a subclass of HashSet, as its name suggests, and relies on a linked list to implement an ordered HashSet

HashSet reserves a constructor for LinkedHashSet whose dummy parameter carries no real meaning; it only distinguishes this constructor from the others. While the other constructors initialize the map variable as a HashMap, the reserved constructor initializes it as a LinkedHashMap, so that a LinkedHashSet gets both element ordering, via the doubly linked list inside the LinkedHashMap, and element uniqueness

    public class HashSet<E> extends AbstractSet<E>
            implements Set<E>, Cloneable, java.io.Serializable {

        private transient HashMap<E,Object> map;

        HashSet(int initialCapacity, float loadFactor, boolean dummy) {
            map = new LinkedHashMap<>(initialCapacity, loadFactor);
        }
    }

    public class LinkedHashSet<E> extends HashSet<E>
            implements Set<E>, Cloneable, java.io.Serializable {

        private static final long serialVersionUID = -2851667679971038690L;

        public LinkedHashSet(int initialCapacity, float loadFactor) {
            super(initialCapacity, loadFactor, true);
        }

        public LinkedHashSet(int initialCapacity) {
            super(initialCapacity, .75f, true);
        }

        // Use the default initial capacity and load factor
        public LinkedHashSet() {
            super(16, .75f, true);
        }

        public LinkedHashSet(Collection<? extends E> c) {
            super(Math.max(2*c.size(), 11), .75f, true);
            addAll(c);
        }

        @Override
        public Spliterator<E> spliterator() {
            return Spliterators.spliterator(this, Spliterator.DISTINCT | Spliterator.ORDERED);
        }
    }