One, what is a HashMap
In Java, Map is the interface for collections that store key-value pairs <key, value>; a hash table is the classic way to implement it, and HashMap is the most commonly used implementation class of Map.
A hash table offers efficient storage and fast lookup. However, HashMap is not thread-safe, and keys in a Map cannot be duplicated.
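A minimal sketch of these basics (class and variable names are my own): duplicate keys overwrite the old value rather than creating a second entry, and a missing key returns null.

```java
import java.util.HashMap;
import java.util.Map;

public class BasicsDemo {
    public static void main(String[] args) {
        Map<String, Integer> ages = new HashMap<>();
        ages.put("alice", 30);
        ages.put("bob", 25);
        // Keys cannot be duplicated: putting the same key again overwrites the old value
        ages.put("alice", 31);

        System.out.println(ages.get("alice")); // 31
        System.out.println(ages.size());       // 2, not 3
        // Looking up a key that is not present returns null
        System.out.println(ages.get("carol")); // null
    }
}
```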
Two, implementation principle
HashMap is backed by an array of buckets. Each bucket stores its key-value pairs as a linked list of Node<K, V> entries; under certain conditions (see the thresholds below), a bucket is converted to a red-black tree of TreeNode<K, V> entries.
Three, basic attributes
// The default initial capacity - MUST be a power of two (16)
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4;
// The maximum capacity, used if a higher value is implicitly specified by either of the constructors with arguments; MUST be a power of two <= 1 << 30
static final int MAXIMUM_CAPACITY = 1 << 30;
// The load factor used when none is specified in the constructor
static final float DEFAULT_LOAD_FACTOR = 0.75f;
// Treeify threshold: when a bucket already holds at least this many nodes, adding another converts its linked list to a red-black tree
// Must be greater than 2 and should be at least 8 to mesh with the assumptions in tree removal about converting back to a plain list on shrinkage
static final int TREEIFY_THRESHOLD = 8;
// Untreeify threshold: if a (split) bucket shrinks to this many nodes during a resize, its tree is converted back to a linked list
// Should be less than TREEIFY_THRESHOLD, and at most 6
static final int UNTREEIFY_THRESHOLD = 6;
// The smallest table capacity for which buckets may be treeified; if the table is smaller than 64, an overlong bucket triggers a resize instead of treeification
static final int MIN_TREEIFY_CAPACITY = 64;
// The hash table itself: an array of bucket head nodes, allocated lazily
transient Node<K, V>[] table;
// The number of key-value mappings in the map
transient int size;
// The number of structural modifications to the HashMap; used for fail-fast iteration
transient int modCount;
// The next size value at which to resize (capacity * load factor)
int threshold;
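A quick worked example of the formula above (class name is my own): with the default capacity of 16 and the default load factor of 0.75, the map resizes on the insertion that pushes size past 12.

```java
public class ThresholdDemo {
    public static void main(String[] args) {
        int capacity = 16;        // DEFAULT_INITIAL_CAPACITY
        float loadFactor = 0.75f; // DEFAULT_LOAD_FACTOR
        // threshold = capacity * load factor
        int threshold = (int) (capacity * loadFactor);
        // 16 * 0.75 = 12: the table doubles to 32 once size exceeds 12
        System.out.println(threshold); // 12
    }
}
```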
// The load factor for the hash table; controls how full the table may get before it is resized
final float loadFactor;
Four, constructors
public HashMap(int initialCapacity, float loadFactor) {
if (initialCapacity < 0)
throw new IllegalArgumentException("Illegal initial capacity: " + initialCapacity);
if (initialCapacity > MAXIMUM_CAPACITY)
initialCapacity = MAXIMUM_CAPACITY;
if (loadFactor <= 0 || Float.isNaN(loadFactor))
throw new IllegalArgumentException("Illegal load factor: " + loadFactor);
this.loadFactor = loadFactor;
// Not the real threshold yet: tableSizeFor() stores the rounded-up capacity here, and resize() recomputes the threshold as capacity * loadFactor when the table is first created
this.threshold = tableSizeFor(initialCapacity);
}
Five, common methods
/**
 * Computes the hash of the key, which is later used to pick the bucket index.
 * The high 16 bits of hashCode() are XORed into the low 16 bits ("perturbation"),
 * so the high bits also influence the bucket index; this spreads hashes better and
 * reduces collisions, because the low bits of many hashCodes are not very random.
 * XOR is used instead of & or |: & would bias the result bits toward 0 and | would
 * bias them toward 1, while XOR keeps the bit distribution uniform.
 * Example:
 *   String s = "Hello World!";
 *   s.hashCode()        = 1100 0110 0011 1100 1011 0110 0001 1101
 *   s.hashCode() >>> 16 = 0000 0000 0000 0000 1100 0110 0011 1100
 *   XOR result          = 1100 0110 0011 1100 0111 0000 0010 0001
 */
static final int hash(Object key) {
int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
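The worked example above can be checked directly. This sketch (class name is my own) re-implements the same perturbation outside of HashMap:

```java
public class HashDemo {
    // Same perturbation as HashMap.hash(Object)
    static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    public static void main(String[] args) {
        String s = "Hello World!";
        // hashCode is 0xC63CB61D; the high 16 bits get XORed into the low 16 bits
        System.out.println(Integer.toHexString(s.hashCode())); // c63cb61d
        System.out.println(Integer.toHexString(hash(s)));      // c63c7021
        // A null key always hashes to 0, which is why it lands in bucket 0
        System.out.println(hash(null)); // 0
    }
}
```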
/**
 * Returns the smallest power of two >= cap. The table size is always a power of two because:
 * 1. If length is a power of two, its binary form is 100...000 and (length - 1) is 011...111.
 * 2. When length is a power of two, hash % length == hash & (length - 1), and the bitwise
 *    AND is much cheaper than the modulo.
 * 3. If length were odd, (length - 1) would be even, so hash & (length - 1) would always
 *    have a zero low bit: every index would be even and half of the buckets would be wasted.
 */
static final int tableSizeFor(int cap) {
int n = cap - 1;
n |= n >>> 1;
n |= n >>> 2;
n |= n >>> 4;
n |= n >>> 8;
n |= n >>> 16;
    return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
}
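A sketch that copies the bit-smearing trick above (class name is my own) to show the rounding behavior: every bit below the highest set bit is filled with 1s, so n + 1 is the next power of two.

```java
public class TableSizeDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Same logic as HashMap.tableSizeFor
    static int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        System.out.println(tableSizeFor(10)); // 16: rounds up to the next power of two
        System.out.println(tableSizeFor(16)); // 16: exact powers of two are kept as-is
        System.out.println(tableSizeFor(17)); // 32
    }
}
```

The initial `cap - 1` is what keeps exact powers of two from being doubled unnecessarily.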
Getting a node
final Node<K, V> getNode(int hash, Object key) {
    Node<K, V>[] tab;
    Node<K, V> first, e;
    int n;
    K k;
    if ((tab = table) != null && (n = tab.length) > 0 && (first = tab[(n - 1) & hash]) != null) {
        // Always check whether the first node is the one we want
        if (first.hash == hash && ((k = first.key) == key || (key != null && key.equals(k))))
            return first;
        if ((e = first.next) != null) {
            // If the first node is a tree node, search the red-black tree
            if (first instanceof TreeNode)
                return ((TreeNode<K, V>) first).getTreeNode(hash, key);
            do {
                // Otherwise walk the linked list looking for the key
                if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k))))
                    return e;
                // Move to the next node in the list
            } while ((e = e.next) != null);
        }
    }
    return null;
}
// Find a node in the tree
final TreeNode<K, V> getTreeNode(int h, Object k) {
    return ((parent != null) ? root() : this).find(h, k, null);
}

// Get the root node of the tree
final TreeNode<K, V> root() {
    for (TreeNode<K, V> r = this, p;;) {
        // Only the root node has a null parent
        if ((p = r.parent) == null)
            return r;
        // Move up to the parent node
        r = p;
    }
}

// Search the tree
final TreeNode<K, V> find(int h, Object k, Class<?> kc) {
    TreeNode<K, V> p = this; // Start from the current node
    do {
        int ph, dir; // Hash of the current node; comparison direction
        K pk;        // Key of the current node
        TreeNode<K, V> pl = p.left, pr = p.right, q; // Children of the current node; q holds a recursive result
        // The current node's hash is greater than the target hash: move into the left subtree
        if ((ph = p.hash) > h)
            p = pl;
        // The current node's hash is less than the target hash: move into the right subtree
        else if (ph < h)
            p = pr;
        // Same hash and equal key: found it
        else if ((pk = p.key) == k || (k != null && k.equals(pk)))
            return p;
        // Equal hash but unequal key: if the left subtree is empty, move right
        else if (pl == null)
            p = pr;
        // If the right subtree is empty, move left
        else if (pr == null)
            p = pl;
        // If the key is Comparable, compare it with the current node's key to pick a direction
        else if ((kc != null || (kc = comparableClassFor(k)) != null) && (dir = compareComparables(kc, k, pk)) != 0)
            p = (dir < 0) ? pl : pr; // dir < 0: go left; dir > 0: go right
        // We cannot decide a direction with Comparable, so recursively search the right subtree
        else if ((q = pr.find(h, k, kc)) != null)
            return q;
        else
            // Not found on the right: continue down the left subtree
            p = pl;
    } while (p != null);
    return null;
}
static int compareComparables(Class<?> kc, Object k, Object x) {
    return (x == null || x.getClass() != kc ? 0 : ((Comparable) k).compareTo(x));
}
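The list and tree walks above always fall back to equals() once hashes match. A sketch with a deliberately colliding key type (the BadKey class is mine, for illustration only): both keys land in the same bucket, yet lookup still distinguishes them.

```java
import java.util.HashMap;
import java.util.Map;

public class CollisionDemo {
    // Every BadKey has the same hash, so all entries share one bucket
    static final class BadKey {
        final String name;
        BadKey(String name) { this.name = name; }
        @Override public int hashCode() { return 42; }
        @Override public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).name.equals(name);
        }
    }

    public static void main(String[] args) {
        Map<BadKey, Integer> m = new HashMap<>();
        m.put(new BadKey("a"), 1);
        m.put(new BadKey("b"), 2);
        // Both keys collide, but equals() still tells them apart inside the bucket
        System.out.println(m.get(new BadKey("a"))); // 1
        System.out.println(m.get(new BadKey("b"))); // 2
        System.out.println(m.size());               // 2
    }
}
```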
Inserting a node
final V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) {
    Node<K, V>[] tab;
    Node<K, V> p;
    int n, i;
    /*
     * 1. If the table is empty, create it via resize(); the table is therefore
     *    initialized lazily, on the first call to put.
     */
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    /*
     * 2. If the target bucket is empty, create a new node there. Otherwise there is a
     *    hash conflict: a node already occupies that bucket, and we must check whether
     *    the bucket holds a linked list or a red-black tree.
     */
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
    else {
        Node<K, V> e; // The existing node with the same key, if any
        K k;
        // a. The first node in the bucket has the same key: remember it so its value can be replaced
        if (p.hash == hash && ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        // b. The bucket is a red-black tree: insert or update within the tree
        else if (p instanceof TreeNode)
            e = ((TreeNode<K, V>) p).putTreeVal(this, tab, hash, key, value);
        // c. The bucket is a linked list: walk it looking for the key; if the key is absent,
        //    append a new node at the tail, then treeify the bucket if it grew past the threshold
        else {
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) {
                    // Append to the tail of the list
                    p.next = newNode(hash, key, value, null);
                    // Treeify if the list now holds TREEIFY_THRESHOLD nodes
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash);
                    break;
                }
                // Found a node with the same key: stop searching
                if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                // Move to the next node in the list
                p = e;
            }
        }
        // Replace the old value of an existing node with the new value
        if (e != null) { // existing mapping for key
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            // Empty by default; a hook for LinkedHashMap to maintain its ordering
            afterNodeAccess(e);
            return oldValue;
        }
    }
    // One more structural modification, used for fail-fast iteration
    ++modCount;
    // Resize if the number of nodes exceeds the threshold
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}
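The return value and the onlyIfAbsent flag of putVal are visible through the public API (class name below is my own): put returns the value it replaced, and putIfAbsent only writes when the key is absent or mapped to null.

```java
import java.util.HashMap;
import java.util.Map;

public class PutDemo {
    public static void main(String[] args) {
        Map<String, String> m = new HashMap<>();
        // First insertion: no previous mapping, so put returns null
        System.out.println(m.put("k", "v1"));         // null
        // Same key again: the old value is replaced and returned
        System.out.println(m.put("k", "v2"));         // v1
        // putIfAbsent (onlyIfAbsent == true) does not overwrite a non-null value
        System.out.println(m.putIfAbsent("k", "v3")); // v2
        System.out.println(m.get("k"));               // v2
    }
}
```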
final Node<K, V>[] resize() {
    // The table before resizing
    Node<K, V>[] oldTab = table;
    // The old table length
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    // The resize threshold before resizing
    int oldThr = threshold;
    int newCap, newThr = 0;
    if (oldCap > 0) {
        // If the old capacity has already reached the maximum, give up and return the old table
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return oldTab;
        }
        // Otherwise double the capacity; if the old capacity is at least the default,
        // the new threshold is simply double the old threshold
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY && oldCap >= DEFAULT_INITIAL_CAPACITY)
            newThr = oldThr << 1; // double threshold
    } else if (oldThr > 0)
        // The table is not created yet, but the constructor stored an initial capacity
        // in threshold: use it as the new capacity
        newCap = oldThr;
    else { // zero initial threshold signifies using defaults
        // Default initialization
        newCap = DEFAULT_INITIAL_CAPACITY;
        // The new threshold is the default capacity * the default load factor
        newThr = (int) (DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    // If the new threshold has not been set yet, compute it as newCap * loadFactor,
    // capped at Integer.MAX_VALUE once the capacity reaches the maximum
    if (newThr == 0) {
        float ft = (float) newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float) MAXIMUM_CAPACITY ? (int) ft : Integer.MAX_VALUE);
    }
    // Install the new threshold
    threshold = newThr;
    // Create the new table and install it
    @SuppressWarnings({"rawtypes", "unchecked"})
    Node<K, V>[] newTab = (Node<K, V>[]) new Node[newCap];
    table = newTab;
    if (oldTab != null) {
        // Move every bucket into the new table
        for (int j = 0; j < oldCap; ++j) {
            Node<K, V> e;
            if ((e = oldTab[j]) != null) {
                // Clear the old slot so it can be garbage collected
                oldTab[j] = null;
                // A single node: just recompute its index in the new table
                if (e.next == null)
                    newTab[e.hash & (newCap - 1)] = e;
                // A tree bucket: split it, untreeifying the halves if they shrink enough
                else if (e instanceof TreeNode)
                    ((TreeNode<K, V>) e).split(this, newTab, j, oldCap);
                else {
                    // A linked list: split it into a "low" list that stays at index j
                    // and a "high" list that moves to index j + oldCap
                    Node<K, V> loHead = null, loTail = null;
                    Node<K, V> hiHead = null, hiTail = null;
                    Node<K, V> next;
                    do {
                        next = e.next;
                        // (e.hash & oldCap) == 0: the node keeps its original index
                        if ((e.hash & oldCap) == 0) {
                            if (loTail == null)
                                loHead = e;
                            else
                                // Tail insertion preserves the original order
                                loTail.next = e;
                            // Advance the tail pointer
                            loTail = e;
                        } else {
                            // The node moves to index j + oldCap
                            if (hiTail == null)
                                hiHead = e;
                            else
                                hiTail.next = e;
                            // Advance the tail pointer
                            hiTail = e;
                        }
                    } while ((e = next) != null);
                    // Place the low list at the original index
                    if (loTail != null) {
                        loTail.next = null;
                        newTab[j] = loHead;
                    }
                    // Place the high list at index j + oldCap
                    if (hiTail != null) {
                        hiTail.next = null;
                        newTab[j + oldCap] = hiHead;
                    }
                }
            }
        }
    }
    return newTab;
}
Six, comparison with earlier versions (JDK 1.7 vs JDK 1.8)
1. Hash value calculation method
// JDK 1.7 (simplified; the real method also mixes in hashSeed)
final int hash(Object k) {
    int h = 0;
    h ^= k.hashCode();
    h ^= (h >>> 20) ^ (h >>> 12);
    return h ^ (h >>> 7) ^ (h >>> 4);
}
// JDK 1.8
static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
As shown above, 1.7 performs 5 XORs and 4 unsigned right shifts in total, while 1.8 performs only 1 right shift and 1 XOR on top of hashCode(), so 1.8 computes the hash more cheaply
2. Different data structures are stored
1.7 resolves hash conflicts with array + linked list, and inserts new nodes at the head of the list
1.8 resolves hash conflicts with array + linked list + red-black tree, and inserts new nodes at the tail of the list
The 1.7 implementation can also create a circular linked list
When the problem occurs:
When multiple threads put at the same time and concurrently trigger resize(), the rehash can produce a circular linked list, and any later get() on that bucket then spins in an infinite loop
Why the problem occurs:
The 1.7 rehash moves nodes with head insertion, which reverses each list. If one thread is suspended in the middle of the transfer while another thread completes the resize, the suspended thread resumes with stale next pointers and can link an already-moved node back to its predecessor, forming a cycle.
3. Capacity expansion mechanisms are different
1.7 recalculates the index of every node from scratch during a resize
1.8 exploits the power-of-two capacity: a node either stays at its current index (when (hash & oldCap) == 0) or moves to current index + oldCap, so no full recalculation is needed
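The stay-or-move rule can be checked numerically (class name is my own): when the capacity doubles, the new index mask gains exactly one bit, so (hash & (newCap - 1)) equals either the old index or the old index + oldCap, decided by the single bit (hash & oldCap).

```java
public class SplitDemo {
    public static void main(String[] args) {
        int oldCap = 16, newCap = 32;
        for (int hash : new int[]{5, 21, 37, 100}) {
            int oldIdx = hash & (oldCap - 1);
            int newIdx = hash & (newCap - 1);
            // The new index depends only on one extra bit of the hash
            int predicted = ((hash & oldCap) == 0) ? oldIdx : oldIdx + oldCap;
            // newIdx always equals predicted: e.g. 21 -> 5 + 16 = 21, 37 -> stays at 5
            System.out.println(hash + ": " + oldIdx + " -> " + newIdx + " (predicted " + predicted + ")");
        }
    }
}
```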