preface
HashMap is a common interview question. In this article, we will read the source code to understand its design principles and the following questions
- Implementation principle of HashMap
- Why is the initial capacity a multiple of 2
- How to
resize
- Thread safe or not
Commonly used parameters
// Maximum capacity is 2 to the 30th
static final int MAXIMUM_CAPACITY = 1 << 30;
// This parameter is triggered only when the initial capacity is 16
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16
// Default load factor
static final float DEFAULT_LOAD_FACTOR = 0.75 f;
// hash table, store linked list. The length is 2 to the N, or 0 when initialized.
transient Node<K,V>[] table;
// Load factor, used to calculate the threshold of the number of hash table elements. Threshold = hash table. Length * loadFactor;
final float loadFactor;
// Threshold for the number of elements in the hash table. When the number of elements in the hash table exceeds the threshold, resize() occurs.
int threshold;
Copy the code
Table is called a hash table and is used to store linked list nodes
The Node list
static class Node<K.V> implements Map.Entry<K.V> {
final int hash;
final K key;
V value;
Node<K,V> next;
Node(int hash, K key, V value, Node<K,V> next) {
this.hash = hash;
this.key = key;
this.value = value;
this.next = next;
}
public final K getKey(a) { return key; }
public final V getValue(a) { return value; }
public final String toString(a) { return key + "=" + value; }
public final int hashCode(a) {
return Objects.hashCode(key) ^ Objects.hashCode(value);
}
public final V setValue(V newValue) {
V oldValue = value;
value = newValue;
return oldValue;
}
public final boolean equals(Object o) {
if (o == this)
return true;
if (o instanceofMap.Entry) { Map.Entry<? ,? > e = (Map.Entry<? ,? >)o;if (Objects.equals(key, e.getKey()) &&
Objects.equals(value, e.getValue()))
return true;
}
return false; }}Copy the code
It can be seen that the data structure of Node is Node nested Node, so it is called linked list. A Node in the entire list is called a Node.
A constructor
public HashMap(a) {
this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
}
Copy the code
public HashMap(int initialCapacity) {
// Specify a constructor that initializes the capacity
this(initialCapacity, DEFAULT_LOAD_FACTOR);
}
Copy the code
public HashMap(int initialCapacity, float loadFactor) {
if (initialCapacity < 0)
throw new IllegalArgumentException("Illegal initial capacity: " +
initialCapacity);
if (initialCapacity > MAXIMUM_CAPACITY)
initialCapacity = MAXIMUM_CAPACITY;
if (loadFactor <= 0 || Float.isNaN(loadFactor))
throw new IllegalArgumentException("Illegal load factor: " +
loadFactor);
this.loadFactor = loadFactor;
// Set the threshold to initialize the capacity to the 2 n power
this.threshold = tableSizeFor(initialCapacity);
}
Copy the code
public HashMap(Map<? extends K, ? extends V> m) {
this.loadFactor = DEFAULT_LOAD_FACTOR;
// Add the elements in m to the new hash table
putMapEntries(m, false);
}
Copy the code
capacityresize
final Node<K,V>[] resize() {
Node<K,V>[] oldTab = table;
int oldCap = (oldTab == null)?0 : oldTab.length;
int oldThr = threshold;
int newCap, newThr = 0;
if (oldCap > 0) {
// The capacity cannot be expanded
if (oldCap >= MAXIMUM_CAPACITY) {
threshold = Integer.MAX_VALUE;
return oldTab;
}
else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
oldCap >= DEFAULT_INITIAL_CAPACITY)
// If the old capacity is greater than or equal to the default initial capacity 16
// The new capacity will be twice the old capacity
newThr = oldThr << 1; // double threshold
}
else if (oldThr > 0) // initial capacity was placed in threshold
newCap = oldThr;
else { // zero initial threshold signifies using defaults
// If the old hash table is 0, the new hash table is the default
newCap = DEFAULT_INITIAL_CAPACITY;
newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
}
if (newThr == 0) {
// The new threshold is 0 recalculated
float ft = (float)newCap * loadFactor;
newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
(int)ft : Integer.MAX_VALUE);
}
threshold = newThr;
@SuppressWarnings({"rawtypes"."unchecked"})
Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
table = newTab;
if(oldTab ! =null) {
// Iterate over the old hash table
for (int j = 0; j < oldCap; ++j) {
Node<K,V> e;
if((e = oldTab[j]) ! =null) {
oldTab[j] = null;
if (e.next == null)
// Place just one element in the new hash table at e.hash & (newcap-1)
newTab[e.hash & (newCap - 1)] = e;
else if (e instanceof TreeNode)
((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
else { // preserve order
// Since capacity expansion doubles capacity, each linked list in the original hash table may now be stored in the original index, i.e., low bit,
// Or the index after expansion, that is, high bit. High bit = Low bit + Original hash bucket capacity
Node<K,V> loHead = null, loTail = null;
Node<K,V> hiHead = null, hiTail = null;
Node<K,V> next;
do {
next = e.next;
if ((e.hash & oldCap) == 0) {
if (loTail == null)
loHead = e;
else
loTail.next = e;
loTail = e;
}
else {
if (hiTail == null)
hiHead = e;
elsehiTail.next = e; hiTail = e; }}while((e = next) ! =null);
if(loTail ! =null) {
loTail.next = null;
newTab[j] = loHead;
}
if(hiTail ! =null) {
hiTail.next = null;
newTab[j + oldCap] = hiHead;
}
}
}
}
}
return newTab;
}
Copy the code
Simply put, if the hash table is empty, it is allocated to the original capacity, otherwise it is doubled, and the elements of the original hash table may be in the same position as the new hash table, or they may be in index+oldCap position
Increase and changeput(K key, V value)
public V put(K key, V value) {
return putVal(hash(key), key, value, false.true);
}
Copy the code
- 1. Calculate the hash value
static final int hash(Object key) {
int h;
return (key == null)?0 : (h = key.hashCode()) ^ (h >>> 16);
}
Copy the code
Hash algorithms, which are not the focus of this article, will be skipped here
- 2. Insert or overwrite values if
table
The first(n - 1) & hash
Is empty, insert a new linked list at that position; If the value is not null, the value is overwritten directly if the key value is equal.If it is a red-black tree, insert; Otherwise, append a new node to the end of the list.
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
boolean evict) {
Node<K,V>[] tab; Node<K,V> p; int n, i;
// If the current table is null, the table is initialized
if ((tab = table) == null || (n = tab.length) == 0)
n = (tab = resize()).length;
// If the I ((n-1) &hash) is null, insert the new node directly
if ((p = tab[i = (n - 1) & hash]) == null)
tab[i] = newNode(hash, key, value, null);
else {
Node<K,V> e; K k;
// If the hash values are equal and the key is equal, the node to be overridden is returned
if(p.hash == hash && ((k = p.key) == key || (key ! =null && key.equals(k))))
e = p;
else if (p instanceof TreeNode)
/ / the red-black tree
e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
else {
for (int binCount = 0; ; ++binCount) {
if ((e = p.next) == null) {
// traverse to the tail, append a new node to the tail
p.next = newNode(hash, key, value, null);
// If the number of linked lists is greater than or equal to 8 after the addition of nodes, it is converted to a red-black tree
if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
treeifyBin(tab, hash);
break;
}
// If the node to overwrite is found
if(e.hash == hash && ((k = e.key) == key || (key ! =null && key.equals(k))))
break; p = e; }}// There are nodes that need to be overwritten
if(e ! =null) { // existing mapping for key
V oldValue = e.value;
if(! onlyIfAbsent || oldValue ==null)
e.value = value;
/ / empty
afterNodeAccess(e);
// Return the old value
return oldValue;
}
}
++modCount;
// Update the size and determine whether expansion is required. Size is the size of the map
if (++size > threshold)
resize();
/ / empty
afterNodeInsertion(evict);
return null;
}
Copy the code
The flow chart is as follows
checkget(Object key)
public V get(Object key) {
Node<K,V> e;
return (e = getNode(hash(key), key)) == null ? null : e.value;
}
Copy the code
-
1. Calculate the hash value
-
2. Check the (n-1) and hash positions of the table. For example, if keys are equal, return this object. If it is a red-black tree, the object is returned. Otherwise, loop back through the node until an equal key or empty node is found.
final Node<K,V> getNode(int hash, Object key) {
Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
if((tab = table) ! =null && (n = tab.length) > 0 &&
(first = tab[(n - 1) & hash]) ! =null) {
if (first.hash == hash && // always check first node((k = first.key) == key || (key ! =null && key.equals(k))))
// If the key is equal, this object is returned
return first;
if((e = first.next) ! =null) {
/ / the red-black tree
if (first instanceof TreeNode)
return ((TreeNode<K,V>)first).getTreeNode(hash, key);
// Loop through this node until an equal key or empty node is found
do {
if(e.hash == hash && ((k = e.key) == key || (key ! =null && key.equals(k))))
return e;
} while((e = e.next) ! =null); }}return null;
}
Copy the code
deleteremove(Object key)
- 1. Calculate the hash value
final Node<K,V> removeNode(int hash, Object key, Object value, boolean matchValue, boolean movable)
- 2. The judge
table
The first(n - 1) & hash
The case of the location node
If matchValue is true, then the movable parameter must have both key and value equal. When deleting nodes, do not move any other nodes
final Node<K,V> removeNode(int hash, Object key, Object value,
boolean matchValue, boolean movable) {
Node<K,V>[] tab; Node<K,V> p; int n, index;
if((tab = table) ! =null && (n = tab.length) > 0 &&
(p = tab[index = (n - 1) & hash]) ! =null) {
Node<K,V> node = null, e; K k; V v;
if(p.hash == hash && ((k = p.key) == key || (key ! =null && key.equals(k))))
// If the key is equal, this object is returned
node = p;
else if((e = p.next) ! =null) {
if (p instanceof TreeNode)
/ / the red-black tree
node = ((TreeNode<K,V>)p).getTreeNode(hash, key);
else {
// Loop through this node until an equal key or empty node is found
do {
if(e.hash == hash && ((k = e.key) == key || (key ! =null && key.equals(k)))) {
node = e;
break;
}
p = e;
} while((e = e.next) ! =null); }}// Nodes to be deleted and matchValue is false or equal
if(node ! =null&& (! matchValue || (v = node.value) == value || (value ! =null && value.equals(v)))) {
if (node instanceof TreeNode)
((TreeNode<K,V>)node).removeTreeNode(this, tab, movable);
else if (node == p)
// The list header is the list to be deleted
tab[index] = node.next;
else
// The node to be deleted is in the middle of the table
p.next = node.next;
++modCount;
/ / correct size
--size;
afterNodeRemoval(node);
returnnode; }}return null;
}
Copy the code
If the key is equal, this object is returned. If it is a red-black tree, the linked list is returned; Otherwise, loop back through the node until the node is found, then delete it.
conclusion
Inside a HashMap is an array of nodes, namely, Node
[] table, which is called a hash table. The nodes stored in the table are linked lists, and each Node in the linked list is an element we put into the HashMap.
- Thread safe or not
The thread is not safe and the access process does not have any locks.
- Why expansion
Because its data structure is an array, it involves capacity expansion.
- How to increase
When the capacity of HashMap exceeds threshold(Length * capacity expansion factor), capacity expansion is triggered. If the linked list is empty, it will be allocated to the initial capacity; otherwise, the capacity will be doubled. The nodes of the original linked list may be in the same position of the new linked list, or in the index+oldCap position. Before and after the expansion, the length of the hash table must be a multiple of 2.
- Why hash tables (
table
) The capacity is a multiple of 2
When HashMap is accessed, index (length-1) & hash is computed, using the & operator (which is more efficient than %). If length is a multiple of 2, the index is evenly divided to the maximum extent possible
- Why do you need it? Do you need it?
&
Bit operations (mod%
)
HashCode () is an int with a range of over 4 billion values. As long as the hash function mapping is fairly uniform and loose, collisions are minimal. However, since the length of a HashMap hash table is much smaller than the hash value range (the default value is 16), when the hash value is mod to the length of the table to find the subscript that stores the key, the high hash value is ignored because the mod is done by the ampersand operation (&). Therefore, only the lower part of hashCode() will participate in the operation, and the probability of different hash values will occur, but the same index will be obtained is greatly increased, which is called hash collision.
- How do I resolve hash collisions
The hash function (Object key) is designed to resolve hash collisions. It combines the features of the high and low hash values and puts them in the low one. Therefore, when it is used with the high and low bits, it is equivalent to participating in the operation together to reduce the probability of hash collisions.
static final int hash(Object key) {
int h;
return (key == null)?0 : (h = key.hashCode()) ^ (h >>> 16);
}
Copy the code
HashMap
To optimize the
1. Use an appropriate initial capacity to reduce the loss caused by resize.
2. Use appropriate keys, such as String and Integer, to reduce hash collisions and maximize storage and retrieval efficiency.
Give it a thumbs up if you think it helps. Thank you
HashMap (JDK8)