Official account: Byte Array. Hope it helps 🤣🤣
This series of articles walks through several common containers in the Java and Android collection frameworks (JDK 1.8, Android SDK 30), using the source code to explain each container's underlying data structure, typical application scenarios, and relative strengths and weaknesses. I hope it helps 🤣🤣
I. HashMap
A HashMap is a container for storing key-value pairs. It is a non-synchronized, hash-table-based implementation of the Map interface. Keys may be null, duplicate keys are not allowed, and duplicate values are allowed
Internally, a HashMap combines an array, linked lists, and red-black trees. The underlying structure is an array in which each element can be one of four things: null, a single node, a linked list, or a red-black tree (red-black trees were introduced in JDK 1.8 to speed up lookups). When an element is stored into the HashMap, the hash value of its key determines its position in the array (the array index). If another element is already stored at that position, the elements at that position are organized as a linked list or a red-black tree; if the position is empty, the element is stored there directly. For this reason, a HashMap requires keys to be immutable, that is, a key's hash value must not change; otherwise the entry cannot be located on subsequent accesses
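To see why mutable keys are problematic, here is a small sketch (the MutableKey class is purely hypothetical, written only for this demonstration): once the key's state changes after insertion, its hash value changes, and the entry can no longer be located.

import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class MutableKeyDemo {
    // A deliberately mutable key type, for demonstration only
    static class MutableKey {
        int id;
        MutableKey(int id) { this.id = id; }
        @Override public boolean equals(Object o) {
            return o instanceof MutableKey && ((MutableKey) o).id == id;
        }
        @Override public int hashCode() { return Objects.hash(id); }
    }

    public static void main(String[] args) {
        Map<MutableKey, String> map = new HashMap<>();
        MutableKey key = new MutableKey(1);
        map.put(key, "value");
        System.out.println(map.get(key)); // value

        key.id = 2; // mutate the key: its hash value changes
        // The entry is still inside the map, but it can no longer be located
        System.out.println(map.get(key));               // null
        System.out.println(map.get(new MutableKey(1))); // null as well
    }
}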
1. Hash
Hash (often translated as 散列) refers to a hashing algorithm that transforms an input of arbitrary length into a fixed-length output, the hash value. Different inputs may produce the same output, so the original input cannot be uniquely determined from a hash value, but the hash value can be used as a fingerprint of the object
The role of hashing can be illustrated with an example. Suppose you have a thousand words and need to find the location of the word "hello". The most straightforward approach is to store the words in an array of length one thousand and scan it; in the worst case that takes a thousand comparisons, and the more words there are, the more array space and the more comparisons are needed on average. To save memory and reduce the number of comparisons, we can instead compute a hash value for each word, map that hash value to an index in an array of length one hundred, and store the word at that index. If the hash algorithm is good enough, the hash values of different words are close to random, so a thousand words are spread evenly across the array: on average each array position holds only ten words, which can be chained together in a linked list or some other data structure. To find "hello" we then only need to compute its index and scan the roughly ten words at that position. If the array is large enough and the hash algorithm spreads the indices sufficiently evenly, the best case needs only a single lookup, and the worst case only scans the words at that one position, greatly reducing the number of comparisons
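A minimal sketch of the idea described above, with made-up numbers (100 buckets, a simple modulo mapping) purely for illustration; this is not how HashMap itself is implemented:

import java.util.ArrayList;
import java.util.List;

public class WordBuckets {
    private static final int BUCKET_COUNT = 100;
    private final List<List<String>> buckets = new ArrayList<>();

    public WordBuckets() {
        for (int i = 0; i < BUCKET_COUNT; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    // Map a word to a bucket index in [0, BUCKET_COUNT)
    private int indexFor(String word) {
        return (word.hashCode() & 0x7fffffff) % BUCKET_COUNT;
    }

    public void add(String word) {
        buckets.get(indexFor(word)).add(word);
    }

    // Only the words in one bucket need to be scanned, not the whole collection
    public boolean contains(String word) {
        for (String candidate : buckets.get(indexFor(word))) {
            if (candidate.equals(word)) {
                return true;
            }
        }
        return false;
    }
}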
Internally, a HashMap uses a hash algorithm to store its elements. Because a hash algorithm may produce the same output for different inputs, and the array cannot be infinitely large, it is inevitable that multiple elements end up at the same array position; this is called a hash collision. In addition, a HashMap does not guarantee the order in which elements are stored or iterated: it rehashes elements when needed and their positions are reshuffled, so the storage and iteration order can differ over time. Finally, HashMap is not thread-safe; concurrent writes from multiple threads can corrupt data and even cause threads to hang
2. Class declaration
public class HashMap<K,V> extends AbstractMap<K,V>
        implements Map<K,V>, Cloneable, Serializable
3. Constants
The most important global constants in HashMap are the following
// Default capacity of the hash bucket array
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4;
// Maximum capacity of the hash bucket array
static final int MAXIMUM_CAPACITY = 1 << 30;
// Default load factor
static final float DEFAULT_LOAD_FACTOR = 0.75f;
// For efficiency, when the length of a list exceeds this value, the list is converted to a red-black tree
static final int TREEIFY_THRESHOLD = 8;
// When the size of a red-black tree falls below this value, the tree is converted back to a linked list
static final int UNTREEIFY_THRESHOLD = 6;
The load factor specifies how full the array may get before it is automatically expanded: when the number of stored entries reaches this proportion of the array capacity, the array is resized. The load factor therefore measures how densely the hash table is used; a larger load factor means the table is allowed to become fuller, and vice versa. For a hash table that resolves collisions with linked lists, the average time to find an element is O(1 + α), so a larger load factor gives higher space utilization but lower lookup efficiency, while a load factor that is too small leaves the array sparse, wasting space but making lookups faster
The default load factor, DEFAULT_LOAD_FACTOR, is 0.75, which is a balance between space utilization and lookup efficiency. In practice, if memory is plentiful and lookup speed matters more, you can lower the load factor; if memory is tight and lookup speed matters less, you can raise it
Even with a well-chosen load factor and hash algorithm, hash collisions can still make some lists too long, which hurts HashMap performance. To mitigate this, JDK 1.8 introduced red-black trees: when a list grows beyond TREEIFY_THRESHOLD, it is converted to a red-black tree, whose properties keep lookups fast
4. Variables
// The hash bucket array, initialized on first use
// Its capacity is always a power of two
transient Node<K,V>[] table;

// Holds the cached entrySet(). Note that AbstractMap fields are used for keySet() and values().
transient Set<Map.Entry<K,V>> entrySet;

// The number of key-value pairs in the map
transient int size;

// Incremented every time the structure of the map changes
// Iterators check this value while iterating over the map
// If it changes, the map was structurally modified during iteration and an exception is thrown
transient int modCount;

// The map is resized when size reaches this value
// Calculated as: current capacity x load factor
int threshold;

// The load factor in use
final float loadFactor;
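The modCount check described in the comments above can be observed directly: structurally modifying the map while iterating over it makes the iterator fail fast. A small sketch:

import java.util.HashMap;
import java.util.Map;

public class FailFastDemo {
    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        try {
            for (String key : map.keySet()) {
                // Structural modification during iteration changes modCount,
                // so the iterator throws on its next operation
                map.remove(key);
            }
        } catch (java.util.ConcurrentModificationException e) {
            System.out.println("fail-fast: " + e);
        }
    }
}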
5. Constructors
// Specify the map's initial capacity and load factor
public HashMap(int initialCapacity, float loadFactor) {
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal initial capacity: " + initialCapacity);
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal load factor: " + loadFactor);
    this.loadFactor = loadFactor;
    this.threshold = tableSizeFor(initialCapacity);
}

// Specify the initial capacity only
public HashMap(int initialCapacity) {
    this(initialCapacity, DEFAULT_LOAD_FACTOR);
}

// Use the default values
public HashMap() {
    this.loadFactor = DEFAULT_LOAD_FACTOR;
}

// Initialize with the contents of another map
public HashMap(Map<? extends K, ? extends V> m) {
    this.loadFactor = DEFAULT_LOAD_FACTOR;
    putMapEntries(m, false);
}
6. Insert key-value pairs
As mentioned above, a HashMap is a combination of an array + a linked list + a red-black tree. Each element in the array has four possible types: NULL, a single node, a linked list, and a red-black tree
Each key-value pair to be inserted is wrapped in a Node object, and the Node's position in the array is determined by the hash of the key. If the computed position is empty, the Node is simply placed there; if the position already holds a value, a hash collision has occurred and the Node must be inserted into the linked list or red-black tree at that position. If the key is equal to the key of an existing Node in that list or tree (the hash values are equal and equals() returns true), the new value overwrites the original one
The more dispersed and uniform the results of the hash algorithm are, the lower the probability of hash collisions and the better the access performance of the HashMap
The Node class is declared as follows
static class Node<K,V> implements Map.Entry<K,V> {
    // Hash of the key
    final int hash;
    final K key;
    V value;
    // The next node
    Node<K,V> next;

    Node(int hash, K key, V value, Node<K,V> next) {
        this.hash = hash;
        this.key = key;
        this.value = value;
        this.next = next;
    }

    public final K getKey()        { return key; }
    public final V getValue()      { return value; }
    public final String toString() { return key + "=" + value; }

    public final int hashCode() {
        return Objects.hashCode(key) ^ Objects.hashCode(value);
    }

    public final V setValue(V newValue) {
        V oldValue = value;
        value = newValue;
        return oldValue;
    }

    public final boolean equals(Object o) {
        if (o == this)
            return true;
        if (o instanceof Map.Entry) {
            Map.Entry<?,?> e = (Map.Entry<?,?>) o;
            if (Objects.equals(key, e.getKey()) &&
                Objects.equals(value, e.getValue()))
                return true;
        }
        return false;
    }
}
The way to insert a key-value pair is put(K key, V value)
public V put(K key, V value) {
    return putVal(hash(key), key, value, false, true);
}

// Compute the hash value of the key
static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
The putVal method is complicated because it takes into account the following situations:
- If the table is not initialized or its capacity is 0, initialize and expand the table
- Determine if there is a hash conflict
- If there are no hash conflicts, the key-value pair is stored directly to the computed location
- If there is a hash conflict, the key-value pair is added to the red-black tree or linked list at that location, and the list is converted to a red-black tree when its length reaches the treeify threshold
- If there are nodes with the same key, determine whether to overwrite the old value
- Call the empty hook methods reserved for LinkedHashMap
- After the key-value pair is saved, perform the necessary expansion
/**
 * @param hash hash for key
 * @param key the key
 * @param value the value to put
 * @param onlyIfAbsent if true, don't overwrite an existing value for the same key; otherwise the original value is overwritten
 * @param evict if false, the table is in creation mode
 * @return previous value, or null if none
 */
final V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) {
    Node<K,V>[] tab;
    Node<K,V> p;
    int n, i;
    // If the table has not been initialized or has a capacity of 0, call resize to initialize it
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    // Determine whether the key to be stored has a hash conflict
    // p points to the position in the array where the key-value pair is expected to be stored
    // If p is null, there is no conflict
    if ((p = tab[i = (n - 1) & hash]) == null)
        // Build the node containing the element to be stored directly at index i
        tab[i] = newNode(hash, key, value, null);
    else { // Entering this branch means the key to be stored has a hash conflict
        Node<K,V> e;
        K k;
        // p was assigned in the previous if statement, so the node keys can be compared directly
        if (p.hash == hash && ((k = p.key) == key || (key != null && key.equals(k))))
            // Reaching here means the key of node p equals the key to be stored
            // This position may hold a single node, a red-black tree, or a linked list;
            // e now points to the conflicting node, i.e. the location where the key is stored has been found
            e = p;
        // If the keys are not equal and the head node is a TreeNode, this position uses a red-black tree to handle collisions
        else if (p instanceof TreeNode)
            // If the same key does not exist in the tree, insert and return null; otherwise do not insert and return the node with the same key
            e = ((TreeNode<K,V>) p).putTreeVal(this, tab, hash, key, value);
        else { // This position uses a linked list to handle collisions
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) {
                    // Reaching here means the end of the list has been reached and no node has an equal key,
                    // so append the new node to the end of the list
                    p.next = newNode(hash, key, value, null);
                    // If the list length has reached the treeify threshold, convert the list to a red-black tree
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash);
                    break;
                }
                if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k))))
                    // Found the node with the same key: e
                    break;
                p = e;
            }
        }
        // e != null means a key-value pair with the same key already exists
        // Decide whether the old value needs to be overwritten
        if (e != null) {
            V oldValue = e.value;
            // If onlyIfAbsent is false or oldValue is null, overwrite the original value
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            // Hook for LinkedHashMap, empty implementation in HashMap
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    // Determine whether the table needs to be resized
    if (++size > threshold)
        resize();
    // Hook for LinkedHashMap, empty implementation in HashMap
    afterNodeInsertion(evict);
    return null;
}
7. Obtain value
The corresponding method to obtain a value is get(Object key)
public V get(Object key) {
    Node<K,V> e;
    return (e = getNode(hash(key), key)) == null ? null : e.value;
}

// Look up the node by key
final Node<K,V> getNode(int hash, Object key) {
    Node<K,V>[] tab;
    Node<K,V> first, e;
    int n;
    K k;
    // The key can exist only if the table is not empty and the position for this hash is not null
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (first = tab[(n - 1) & hash]) != null) {
        if (first.hash == hash && ((k = first.key) == key || (key != null && key.equals(k))))
            // The key equals the head node, so the corresponding node has been found
            return first;
        // e != null means there is a linked list or a red-black tree at this position, so search it
        if ((e = first.next) != null) {
            if (first instanceof TreeNode) // Red-black tree
                return ((TreeNode<K,V>) first).getTreeNode(hash, key);
            do { // Linked list
                if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k))))
                    return e;
            } while ((e = e.next) != null);
        }
    }
    return null;
}
8. Remove the node
Removing a key-value pair from the Map means removing the reference to the corresponding Node object from the underlying data structure, which may be the array, a red-black tree, or a linked list
// If the key exists, return the corresponding value, otherwise return null
public V remove(Object key) {
    Node<K,V> e;
    return (e = removeNode(hash(key), key, null, false, true)) == null ?
        null : e.value;
}

/**
 * @param value the value expected for the key; only checked when matchValue is true, otherwise ignored
 * @param matchValue if true, remove the node only when both key and value match; otherwise remove it as long as the key matches
 * @param movable if false, do not move other nodes while removing
 * @return the node, or null if none
 */
final Node<K,V> removeNode(int hash, Object key, Object value,
                           boolean matchValue, boolean movable) {
    Node<K,V>[] tab;
    Node<K,V> p;
    int n, index;
    // The key can exist only if the table is not empty and the position for this hash is not null
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (p = tab[index = (n - 1) & hash]) != null) {
        Node<K,V> node = null, e;
        K k;
        V v;
        if (p.hash == hash && ((k = p.key) == key || (key != null && key.equals(k))))
            // The key equals the head node p, so the target node has been found
            node = p;
        else if ((e = p.next) != null) { // There is a red-black tree or a linked list at this position
            if (p instanceof TreeNode) // Red-black tree
                node = ((TreeNode<K,V>) p).getTreeNode(hash, key);
            else { // Linked list
                do {
                    if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k)))) {
                        node = e;
                        break;
                    }
                    p = e;
                } while ((e = e.next) != null);
            }
        }
        // node != null means a node corresponding to the key exists
        // If matchValue is false, the node can be removed directly
        // If matchValue is true, the node is removed only when the values are also equal
        if (node != null && (!matchValue || (v = node.value) == value ||
                             (value != null && value.equals(v)))) {
            if (node instanceof TreeNode) // Red-black tree
                ((TreeNode<K,V>) node).removeTreeNode(this, tab, movable);
            else if (node == p) // The key equals the head node, so move the head pointer down one position
                tab[index] = node.next;
            else // Linked list
                p.next = node.next;
            ++modCount;
            --size;
            // Hook for LinkedHashMap, empty implementation in HashMap
            afterNodeRemoval(node);
            return node;
        }
    }
    return null;
}
9. Hash algorithm
When inserting, querying, and removing key-value pairs, locating the position in the hash bucket array is the crucial first step. Only when the elements of the HashMap are evenly distributed can every array slot be used as fully as possible, avoiding frequent construction and traversal of linked lists or red-black trees. That depends on a good hash algorithm
Here’s how to compute the hash value of a key in a HashMap and get its position in the hash bucket array based on the hash value
static final int hash(Object key) {
    int h;
    // The high bits take part in the computation
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

// Get the value by key
public V get(Object key) {
    Node<K,V> e;
    return (e = getNode(hash(key), key)) == null ? null : e.value;
}

// Find the specified node
final Node<K,V> getNode(int hash, Object key) {
    ...
    // The key can exist only if the table is not empty and the position for this hash is not null
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (first = tab[(n - 1) & hash]) != null) {
        ...
    }
    return null;
}
The hash value of a key is computed as (h = key.hashCode()) ^ (h >>> 16), which can be broken down into three steps:
- Call key.hashCode() to obtain the key's hashCode, denoted h
- Shift h right unsigned by 16 bits (h >>> 16), moving the high 16 bits into the low 16 bits and filling the high 16 bits with 0
- XOR the two values: the high 16 bits of the result equal the high 16 bits of h, and the low 16 bits are the XOR of h's high 16 bits with its low 16 bits
The index of the key's position in the hash bucket array is then computed as (n - 1) & hash, where n is the capacity of the hash bucket array. HashMap requires the capacity to be a power of 2, i.e. n is 16, 32, 64, 128, and so on; the corresponding binary values of n - 1 are:
- When n is 16, n - 1 is 1111
- When n is 32, n - 1 is 11111
- When n is 64, n - 1 is 111111
- When n is 128, n - 1 is 1111111
As you can see, no matter what the hash value is, the index computed by (n - 1) & hash is never larger than n - 1 and never smaller than 0, which matches the valid range of array indexes. In addition, the hash value mixes the high 16 bits of hashCode into its low 16 bits, adding randomness on top of hashCode so that the index produced by (n - 1) & hash is spread out more evenly. This distributes elements uniformly across the hash bucket array and reduces the probability of hash collisions
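A small sketch that replays the two steps above on a concrete key (the sample key and capacity are arbitrary):

public class HashIndexDemo {
    // Same bit-mixing as HashMap.hash(): xor the high 16 bits into the low 16 bits
    static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    public static void main(String[] args) {
        String key = "hello";
        int h = hash(key);
        int n = 16; // hash bucket array capacity, always a power of two
        int index = (n - 1) & h; // equivalent to h % n when n is a power of two

        System.out.println("hashCode = " + Integer.toBinaryString(key.hashCode()));
        System.out.println("hash     = " + Integer.toBinaryString(h));
        System.out.println("index    = " + index); // always in [0, n - 1]
    }
}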
10. Capacity expansion
If the hash bucket array is large, elements will be well spread out even with a poor hash algorithm; if it is very small, even a good hash algorithm will produce many collisions. So a balance must be struck between space cost and time cost: besides designing a good hash algorithm to reduce collisions, the hash bucket array also needs to be expanded at the right time
As the number of elements in a HashMap grows while the array capacity stays fixed, the probability of hash collisions increases. To keep the map efficient, the array has to be expanded, and the data in the old array must be rehashed and migrated to the new array
When is the expansion operation triggered? The array is expanded when the number of elements in the HashMap exceeds the threshold (array capacity multiplied by loadFactor). For example, with an array capacity of 16 and a loadFactor of 0.75, the expansion is triggered when the HashMap reaches 12 elements: the array is doubled to 2 × 16 = 32 and the position of every element in the new array is recomputed. This is a very expensive operation, so if you already know how much data will be stored in the HashMap, it is more efficient to specify the initial capacity when creating it
By default, the hash array size is 16 and the loadFactor is 0.75, which is the result of balancing space utilization and time efficiency
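If the number of entries is known in advance, the initial capacity can be chosen so that no resize happens while the map is filled. A rough sketch of that calculation (the capacityFor helper is illustrative, not part of HashMap's API):

import java.util.HashMap;
import java.util.Map;

public class PresizeDemo {
    // A capacity large enough that expectedSize entries never exceed the threshold
    // (threshold = capacity * 0.75); HashMap rounds this up to a power of two itself
    static int capacityFor(int expectedSize) {
        return (int) (expectedSize / 0.75f) + 1;
    }

    public static void main(String[] args) {
        int expectedSize = 1000;
        Map<Integer, String> map = new HashMap<>(capacityFor(expectedSize));
        for (int i = 0; i < expectedSize; i++) {
            map.put(i, "value-" + i);
        }
        System.out.println(map.size());
    }
}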
The operations that initialize and expand arrays correspond to the resize() method
final Node<K,V>[] resize() {
    // The array before resizing
    Node<K,V>[] oldTab = table;
    // The capacity of the array before resizing
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    // The current resize threshold
    int oldThr = threshold;
    // The capacity and threshold after resizing
    int newCap, newThr = 0;
    if (oldCap > 0) {
        // oldCap > 0 means the table has already been initialized, so decide whether it needs to grow
        // If the array has already reached its maximum capacity, do not expand it and raise the threshold to Integer.MAX_VALUE
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return oldTab;
        } else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY && oldCap >= DEFAULT_INITIAL_CAPACITY) {
            // Doubling the current capacity is still below MAXIMUM_CAPACITY and the current capacity
            // is at least DEFAULT_INITIAL_CAPACITY, so double both the capacity and the threshold
            newThr = oldThr << 1;
        }
        // There is an implicit third case here:
        // doubling the current capacity reaches or exceeds MAXIMUM_CAPACITY,
        // in which case newThr stays 0 and newCap is twice oldCap;
        // newCap is not clamped back, so the capacity is allowed to exceed MAXIMUM_CAPACITY,
        // but once it has exceeded MAXIMUM_CAPACITY it will not be expanded again
    } else if (oldThr > 0) {
        // oldCap <= 0 && oldThr > 0
        // The table has not been initialized yet, and initialCapacity or a Map was passed to the constructor;
        // in this case use threshold directly as the capacity and recompute the resize threshold below
        newCap = oldThr;
    } else {
        // oldCap <= 0 && oldThr <= 0
        // The table has not been initialized yet and the no-argument constructor was used:
        // use the default capacity and compute the threshold from the default load factor
        newCap = DEFAULT_INITIAL_CAPACITY;
        newThr = (int) (DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    if (newThr == 0) {
        float ft = (float) newCap * loadFactor;
        // Compute the new threshold after resizing
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float) MAXIMUM_CAPACITY ?
                  (int) ft : Integer.MAX_VALUE);
    }
    threshold = newThr;
    @SuppressWarnings({"rawtypes", "unchecked"})
    Node<K,V>[] newTab = (Node<K,V>[]) new Node[newCap];
    table = newTab;
    // If the old array contains values, the data needs to be migrated to the new array
    if (oldTab != null) {
        for (int j = 0; j < oldCap; ++j) {
            Node<K,V> e;
            if ((e = oldTab[j]) != null) {
                oldTab[j] = null;
                // e.next == null means element e has no hash conflict, so it can be moved directly
                if (e.next == null)
                    // Compute the position of element e in the new array
                    newTab[e.hash & (newCap - 1)] = e;
                else if (e instanceof TreeNode) // There is a hash conflict and a red-black tree is used
                    ((TreeNode<K,V>) e).split(this, newTab, j, oldCap);
                else { // There is a hash conflict and a linked list is used
                    Node<K,V> loHead = null, loTail = null;
                    Node<K,V> hiHead = null, hiTail = null;
                    Node<K,V> next;
                    do {
                        next = e.next;
                        if ((e.hash & oldCap) == 0) {
                            if (loTail == null)
                                loHead = e;
                            else
                                loTail.next = e;
                            loTail = e;
                        } else {
                            if (hiTail == null)
                                hiHead = e;
                            else
                                hiTail.next = e;
                            hiTail = e;
                        }
                    } while ((e = next) != null);
                    if (loTail != null) {
                        loTail.next = null;
                        newTab[j] = loHead;
                    }
                    if (hiTail != null) {
                        hiTail.next = null;
                        newTab[j + oldCap] = hiHead;
                    }
                }
            }
        }
    }
    return newTab;
}
11. Efficiency test
Here we test how different initial capacities and different hashCode values affect the running efficiency of a HashMap
The Key class used for testing is shown below; its hashCode() method returns its value property directly
public class Key {

    private int value;

    public Key(int value) {
        this.value = value;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Key key = (Key) o;
        return value == key.value;
    }

    @Override
    public int hashCode() {
        return value;
    }
}
The initial capacity is increased by a factor of 10 each time, from 20 up to 20000; the same amount of data is stored into each HashMap and the time taken is measured
public class Test {

    private static final int MAX_KEY = 20000;

    private static final Key[] KEYS = new Key[MAX_KEY];

    static {
        for (int i = 0; i < MAX_KEY; i++) {
            KEYS[i] = new Key(i);
        }
    }

    private static void test(int size) {
        long startTime = System.currentTimeMillis();
        Map<Key, Integer> map = new HashMap<>(size);
        for (int i = 0; i < MAX_KEY; i++) {
            map.put(KEYS[i], i);
        }
        long endTime = System.currentTimeMillis();
        System.out.println("Initial capacity: " + size + ", time: " + (endTime - startTime) + " ms");
    }

    public static void main(String[] args) {
        for (int i = 20; i <= MAX_KEY; i *= 10) {
            test(i);
        }
    }
}
In the example above, every Key object has a different hash value, so the key-value pairs are distributed very evenly across the hash bucket array. In this case, the resizing mechanism is the main factor affecting HashMap performance, and as the log shows, the different initial capacities have little impact
Initial capacity: 20, time: 4 ms
Initial capacity: 200, time: 3 ms
Initial capacity: 2000, time: 4 ms
Initial capacity: 20000, time: 2 ms
If the hashCode() method of the Key class is changed to always return 100, then every Key object stored in the HashMap is guaranteed to collide
@Override
public int hashCode() {
    return 100;
}
As the results show, the time needed to store the same amount of data increases by orders of magnitude, which demonstrates that heavy hash collisions have a large impact on HashMap performance
Initial capacity: 20, time: 2056 ms
Initial capacity: 200, time: 1902 ms
Initial capacity: 2000, time: 1892 ms
Initial capacity: 20000, time: 1865 ms
II. LinkedHashMap
A HashMap does not guarantee that the iteration order of elements matches the order in which they were stored; in other words, HashMap itself is unordered. To solve this, Java provides LinkedHashMap, an ordered HashMap
1. Class declaration
LinkedHashMap is a subclass of HashMap that, like ArrayList, preserves the insertion order of its elements. It maintains a doubly linked list over all entries, ordered either by insertion order (the default) or by access order. Ordering by access order moves an entry to the end of the list each time it is accessed, which makes it possible to implement the LRU cache algorithm
public class LinkedHashMap<K,V> extends HashMap<K,V>
        implements Map<K,V>
2. Node classes
In a HashMap every stored key-value pair is wrapped in a Node object; in a LinkedHashMap it is wrapped in an Entry object instead, as can be seen from the newNode method. The Entry class extends Node with two new member variables, before and after, which are the key to LinkedHashMap's ordered access: each time a new key-value pair is stored, the Entry is linked to the previous one through these two variables and appended to the end of the linked list, thereby preserving the order of the key-value pairs
Regardless of whether the Entry sits in a linked list or a red-black tree inside the HashMap to resolve hash collisions, the pointers held by these two variables are unaffected by changes in that data structure. This also shows a clever aspect of the collection framework's design: instead of writing a brand-new map to maintain the insertion order of elements, LinkedHashMap extends the functionality of its parent class
static class Entry<K,V> extends HashMap.Node<K,V> {
    // The two pointers used to maintain the element order
    Entry<K,V> before, after;
    Entry(int hash, K key, V value, Node<K,V> next) {
        super(hash, key, value, next);
    }
}

Node<K,V> newNode(int hash, K key, V value, Node<K,V> e) {
    LinkedHashMap.Entry<K,V> p = new LinkedHashMap.Entry<K,V>(hash, key, value, e);
    linkNodeLast(p);
    return p;
}

// The head (eldest) of the doubly linked list
transient LinkedHashMap.Entry<K,V> head;

// The tail (youngest) of the doubly linked list
transient LinkedHashMap.Entry<K,V> tail;

// Link the node at the end of the list
private void linkNodeLast(LinkedHashMap.Entry<K,V> p) {
    LinkedHashMap.Entry<K,V> last = tail;
    tail = p;
    if (last == null)
        head = p;
    else {
        p.before = last;
        last.after = p;
    }
}
3. Variables
The accessOrder variable determines how the elements in the LinkedHashMap are ordered: by access order if true, or by insertion order if false
// Serialization ID
private static final long serialVersionUID = 3801124242820219131L;

// Points to the head of the doubly linked list
transient LinkedHashMap.Entry<K,V> head;

// Points to the tail of the doubly linked list (the most recently accessed or inserted node)
transient LinkedHashMap.Entry<K,V> tail;

// The iteration ordering: true for access order, false for insertion order
final boolean accessOrder;
4. Constructors
By default, a LinkedHashMap orders its elements by insertion order
public LinkedHashMap(int initialCapacity, float loadFactor) {
    super(initialCapacity, loadFactor);
    accessOrder = false;
}

public LinkedHashMap(int initialCapacity) {
    super(initialCapacity);
    accessOrder = false;
}

public LinkedHashMap() {
    super();
    accessOrder = false;
}

public LinkedHashMap(Map<? extends K, ? extends V> m) {
    super();
    accessOrder = false;
    putMapEntries(m, false);
}

public LinkedHashMap(int initialCapacity, float loadFactor, boolean accessOrder) {
    super(initialCapacity, loadFactor);
    this.accessOrder = accessOrder;
}
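A short sketch of the difference between the two orderings, using the three-argument constructor with accessOrder set to true:

import java.util.LinkedHashMap;
import java.util.Map;

public class AccessOrderDemo {
    public static void main(String[] args) {
        // accessOrder = true: iteration order follows access order, least recently used first
        Map<String, Integer> map = new LinkedHashMap<>(16, 0.75f, true);
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);

        map.get("a"); // accessing "a" moves it to the end of the internal linked list

        // Prints: b c a
        map.keySet().forEach(key -> System.out.print(key + " "));
    }
}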
5. Reserved methods
HashMap reserves three empty methods, and the source comments state that they are hooks for LinkedHashMap
// Callbacks to allow LinkedHashMap post-actions
void afterNodeAccess(Node<K,V> p) { }
void afterNodeInsertion(boolean evict) { }
void afterNodeRemoval(Node<K,V> p) { }
afterNodeAccess is called when a node in the map is accessed (for example, via the get method) and accessOrder is true; it moves the newly accessed entry to the end of the linked list. This operation is quite lightweight, since only a few references need to change to move a node within a linked list
public V get(Object key) {
    Node<K,V> e;
    if ((e = getNode(hash(key), key)) == null)
        return null;
    if (accessOrder)
        afterNodeAccess(e);
    return e.value;
}

// Called when node e is accessed
// Moves node e to the end of the linked list
void afterNodeAccess(Node<K,V> e) {
    // last points to the tail of the list
    LinkedHashMap.Entry<K,V> last;
    // The following steps are needed only if last and e are not the same node; if they are, e is already at the tail
    if (accessOrder && (last = tail) != e) {
        LinkedHashMap.Entry<K,V> p = (LinkedHashMap.Entry<K,V>) e, b = p.before, a = p.after;
        // p will become the tail, so its after pointer is cleared
        p.after = null;
        // If b == null, p is currently the head of the list and a becomes the new head
        // If b != null, remove b's reference to p and link it to a
        if (b == null)
            head = a;
        else
            b.after = a;
        // If a != null, p is not the tail of the list, so remove a's reference to p and link it to b
        // If a == null, p is currently the tail, and b becomes the new tail
        if (a != null)
            a.before = b;
        else
            last = b;
        // If last == null, the list is empty and p becomes the head
        // If last != null, p is appended after last and becomes the new tail
        if (last == null)
            head = p;
        else {
            p.before = last;
            last.after = p;
        }
        // tail now references p
        tail = p;
        ++modCount;
    }
}
When the put method is called, afterNodeInsertion is invoked; it decides whether to remove the eldest (least recently used) element, which is what makes it possible to build an LRU cache
void afterNodeInsertion(boolean evict) {
    LinkedHashMap.Entry<K,V> first;
    if (evict && (first = head) != null && removeEldestEntry(first)) {
        K key = first.key;
        removeNode(hash(key), key, null, false, true);
    }
}

// Decides whether to remove the eldest entry; returns false by default
// Subclasses can override this method to evict old data according to their own rules
protected boolean removeEldestEntry(Map.Entry<K,V> eldest) {
    return false;
}
When a node is removed from the HashMap, LinkedHashMap removes that node's references from the linked list it maintains via afterNodeRemoval
// Called after node e is removed
void afterNodeRemoval(Node<K,V> e) {
    LinkedHashMap.Entry<K,V> p = (LinkedHashMap.Entry<K,V>) e, b = p.before, a = p.after;
    // Remove node p's references to its neighbours
    p.before = p.after = null;
    // If b == null, p was the head of the list and a becomes the new head
    // If b != null, update the references between the nodes
    if (b == null)
        head = a;
    else
        b.after = a;
    // If a == null, p was the tail of the list and b becomes the new tail
    // If a != null, update the references between the nodes
    if (a == null)
        tail = b;
    else
        a.before = b;
}
6. LRUCache
In Android development, the LRUCache algorithm (least recently used) is very common. A typical use is caching Bitmaps in memory: decoding a Bitmap from an IO stream is expensive, so to avoid reading the same image from disk repeatedly, Bitmaps are usually kept in memory. But memory is limited too, so not every image can be cached; only a certain number of images are kept, and LRUCache is one of the most common caching strategies
The following takes advantage of LinkedHashMap's ability to order elements by access order to implement a cache with the LRU eviction policy
public class LRUCache {

    private static class LRUCacheMap<K,V> extends LinkedHashMap<K,V> {

        // The maximum number of cached entries
        private final int maxCacheSize;

        public LRUCacheMap(int maxCacheSize) {
            super(16, 0.75F, true);
            this.maxCacheSize = maxCacheSize;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > maxCacheSize;
        }
    }

    public static void main(String[] args) {
        // The cache holds at most 5 entries
        LRUCacheMap<String, Integer> map = new LRUCacheMap<>(5);
        map.put("Java", 1);
        map.put("Jetpack", 2);
        map.put("Kotlin", 3);
        map.put("Yip Chi Chan", 4);
        map.put("Byte array", 5);
        map.put("leaveC", 6);

        System.out.println();
        Set<String> keySet = map.keySet();
        // Output: Jetpack Kotlin Yip Chi Chan Byte array leaveC
        keySet.forEach(key -> System.out.print(key + " "));

        // Access the value at the head of the list, which moves that entry to the end of the list
        map.get("Jetpack");
        System.out.println();
        keySet = map.keySet();
        // Output: Kotlin Yip Chi Chan Byte array leaveC Jetpack
        keySet.forEach(key -> System.out.print(key + " "));

        // Add a new element, which evicts the eldest entry
        map.put("Dart", 5);
        System.out.println();
        // Output: Yip Chi Chan Byte array leaveC Jetpack Dart
        keySet.forEach(key -> System.out.print(key + " "));
    }
}
III. HashSet
HashSet implements the Set interface; it does not allow duplicate elements, allows a null element, and does not guarantee the iteration order of its elements. After the HashMap source analysis above, reading the HashSet source feels almost self-evident
As we have seen, when a key-value pair with an existing key is inserted into a HashMap, the existing key is kept and only its value may be overwritten. HashSet relies on exactly this behaviour to guarantee uniqueness: it contains a HashMap, and every value added to the HashSet is wrapped as a key-value pair and stored in that HashMap, with the external value as the key and a value supplied by the HashSet itself. If the key is not already present it is stored normally; if it is, only the value is replaced, which gives HashSet its no-duplicates property
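This behaviour is easy to observe from the outside; a small usage sketch:

import java.util.HashSet;
import java.util.Set;

public class HashSetDemo {
    public static void main(String[] args) {
        Set<String> set = new HashSet<>();
        System.out.println(set.add("Java"));  // true, the element is new
        System.out.println(set.add("Java"));  // false, map.put returned the existing PRESENT value
        System.out.println(set.add(null));    // true, one null element is allowed
        System.out.println(set.size());       // 2
    }
}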
The relevant source code is shown directly below
public class HashSet<E> extends AbstractSet<E>
        implements Set<E>, Cloneable, java.io.Serializable {

    static final long serialVersionUID = -5024744406713321676L;

    // The data is stored in a HashMap
    // The key is the externally supplied value; the value is maintained internally by the HashSet
    private transient HashMap<E,Object> map;

    // All key-value pairs stored in the HashMap share this same object as their value
    private static final Object PRESENT = new Object();

    public HashSet() {
        map = new HashMap<>();
    }

    // Compute the HashMap's initial capacity from the collection size using the default load factor
    // The +1 compensates for the precision lost by the division
    public HashSet(Collection<? extends E> c) {
        map = new HashMap<>(Math.max((int) (c.size()/.75f) + 1, 16));
        addAll(c);
    }

    public HashSet(int initialCapacity, float loadFactor) {
        map = new HashMap<>(initialCapacity, loadFactor);
    }

    public HashSet(int initialCapacity) {
        map = new HashMap<>(initialCapacity);
    }

    // This constructor has package access and exists only to support LinkedHashSet
    HashSet(int initialCapacity, float loadFactor, boolean dummy) {
        map = new LinkedHashMap<>(initialCapacity, loadFactor);
    }

    // Iterating over the HashSet is delegated to iterating over the HashMap's keys
    public Iterator<E> iterator() {
        return map.keySet().iterator();
    }

    public int size() {
        return map.size();
    }

    public boolean isEmpty() {
        return map.isEmpty();
    }

    public boolean contains(Object o) {
        return map.containsKey(o);
    }

    // If the HashMap does not yet contain a key equal to e, the element is added and true is returned
    // If it does, only the value is overwritten (the key is untouched) and false is returned,
    // which is what makes HashSet elements unique
    public boolean add(E e) {
        return map.put(e, PRESENT) == null;
    }

    public boolean remove(Object o) {
        return map.remove(o) == PRESENT;
    }

    public void clear() {
        map.clear();
    }
}
IV. LinkedHashSet
LinkedHashSet, whose source code amounts to only a few dozen lines, is, as its name suggests, a subclass of HashSet that relies on a linked list to provide an ordered HashSet
HashSet reserves a constructor for LinkedHashSet; its dummy parameter has no real meaning and exists only to distinguish it from the other constructors. The other constructors initialize the map variable as a HashMap, while the reserved constructor initializes it as a LinkedHashMap, so LinkedHashSet gains ordered iteration from the doubly linked list inside LinkedHashMap while keeping the uniqueness of its elements
public class HashSet<E> extends AbstractSet<E>
        implements Set<E>, Cloneable, java.io.Serializable {

    private transient HashMap<E,Object> map;

    HashSet(int initialCapacity, float loadFactor, boolean dummy) {
        map = new LinkedHashMap<>(initialCapacity, loadFactor);
    }
}

public class LinkedHashSet<E> extends HashSet<E>
        implements Set<E>, Cloneable, java.io.Serializable {

    private static final long serialVersionUID = -2851667679971038690L;

    public LinkedHashSet(int initialCapacity, float loadFactor) {
        super(initialCapacity, loadFactor, true);
    }

    public LinkedHashSet(int initialCapacity) {
        super(initialCapacity, .75f, true);
    }

    // Use the default initial capacity and load factor
    public LinkedHashSet() {
        super(16, .75f, true);
    }

    public LinkedHashSet(Collection<? extends E> c) {
        super(Math.max(2*c.size(), 11), .75f, true);
        addAll(c);
    }

    @Override
    public Spliterator<E> spliterator() {
        return Spliterators.spliterator(this, Spliterator.DISTINCT | Spliterator.ORDERED);
    }
}
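To close, a small usage sketch contrasting the iteration order of LinkedHashSet with that of HashSet:

import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

public class LinkedHashSetDemo {
    public static void main(String[] args) {
        Set<String> hashSet = new HashSet<>();
        Set<String> linkedHashSet = new LinkedHashSet<>();
        for (String s : new String[]{"Java", "Kotlin", "Dart", "Swift"}) {
            hashSet.add(s);
            linkedHashSet.add(s);
        }
        // The iteration order of HashSet is not guaranteed
        System.out.println(hashSet);
        // LinkedHashSet always iterates in insertion order: [Java, Kotlin, Dart, Swift]
        System.out.println(linkedHashSet);
    }
}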