Abstract
The principle of HashMap is also a common question in the interview of large factories, and also a Java container commonly used in work. This article mainly analyzes and explains the following questions to help you understand the principle of HashMap.
1. What is the process for adding a key-value pair to a HashMap?
2. Why is HashMap not thread safe?
3. Why rewrite the hashCode() and equal() methods together?
What is the process for adding a key-value pair to a HashMap?
This is a flow chart found on the Internet. You can look at this flow chart with steps to understand the process of adding key-value pairs.
1. Initialize the table
Check whether the table is empty or null, otherwise execute the resize() method (resize is usually called during capacity expansion, but can also be called to initialize the table).
2. Calculate the hash value
Computes the hash value based on the key value. (Since hashCode is a variable of type int, which is 4 bytes and 32 bits, an xOR operation is performed on the lower 16 bits of hashCode and the higher 16 bits of hashCode to preserve the higher features, so that the resulting hash values are more evenly distributed.)
static final int hash(Object key) {
int h;
return (key == null)?0 : (h = key.hashCode()) ^ (h >>> 16);
}
Copy the code
3. Insert or update the node
According to (n-1) & hash calculation to get the inserted array subscript I, and then judge
table[i]==null
If there are no hash conflict elements in the current array subscript, create a new node to add them.
table[i].hash == hash &&(table[i]== key || (key ! = null && key.equals(table[i].key)))
Check whether the first element of table[I] is the same as the key.
table[i] instanceof TreeNode
Check whether table[I] is a treeNode, that is, whether table[I] is a red-black tree. If table[I] is a red-black tree, insert key/value pairs directly into the tree.
Other situations
If the above criteria are not met, it indicates that table[I] stores a linked list. Then the list is traversed to determine whether the key of the existing element is equal to the key of the inserted key-value pair. If so, the value is updated; if not, a new node is inserted at the end of the linked list. After insertion, determine whether the length of the list is greater than 8. If so, convert the list to a red-black tree.
Expansion and 4.
After the insert is successful, check whether the actual number of key-value pairs size exceeds the maximum capacity threshold(generally, array length x load factor 0.75). If so, expand the capacity.
Source code is as follows:
2. Why is HashMap not thread safe?
In addition to adding key-value pairs to a HashMap, we can see that there is no lock in the entire method, so if multiple lines are accessed concurrently, it can cause data inconsistency problems.
Such as:
If there are two key/value pair thread execution is added to the the if (= = null (TAB = table) | | (n = TAB. Length) = = 0) this line statements, are carried out on the table variable array initialization, can cause array has been initialized good table are covered, The previously initialized thread then adds the key-value pair to the previously initialized array, causing the key-value pair to be lost.
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
boolean evict) {
Node<K,V>[] tab; Node<K,V> p; int n, i;
// if TAB is empty, it will be created
if ((tab = table) == null || (n = tab.length) == 0) n = (tab = resize()).length; . The following code is omitted}Copy the code
3. Why rewrite the hashCode() and equal() methods together?
Whenever our object is used as a key in a HashMap, or as an element in a HashSet, we must override both the hashCode() and equal() methods
First look at the default implementations of the hashCode() and equal() methods
As you can see from the source code in the Obejct class below, you can see that the default implementation of equals() is to determine whether the memory address of two objects is the same to return the result.
public native int hashCode();
public boolean equals(Object obj) {
return (this == obj);
}
Copy the code
The default implementation of hashCode is to return a memory address. In the case of the OpenJDK, there are five methods for calculating hashCode by default. One method returns a random number, the other returns a memory address.
Interested friends can look at this blog blog.csdn.net/xusiwei1236…
Then look at the use of the hashCode() method, the equal() method, in a HashMap
static final int hash(Object key) {
int h;
// Since hashCode is a variable of type int, it is 4 bytes and 32 bits, an xOR operation will be performed on the lower 16 bits of hashCode and the higher 16 bits to preserve the higher features, so that the resulting hash values will be more evenly distributed
return (key == null)?0 : (h = key.hashCode()) ^ (h >>> 16);
}
public V put(K key, V value) {
return putVal(hash(key), key, value, false.true);
}
Copy the code
To store a set of key-value pairs evenly in an array, a HashMap computs the hashCode of the key to obtain a hash value, modulates the length of the array with the hash to obtain the array subscript, and stores the key-value pairs in the list corresponding to the array subscript (assuming that the list length is less than 8 and does not reach the threshold for converting to a red-black tree).
Here is the putVal() method to add key-value pairs, which executes when the array subscript corresponds to a linked list
// Iterate over the list
for (int binCount = 0; ; ++binCount) {
if ((e = p.next) == null) {// The key has been traversed to the end of the list
p.next = newNode(hash, key, value, null);// Add the key-value pair at the end
if (binCount >= TREEIFY_THRESHOLD - 1) // The list length exceeds the red black tree threshold (also fast list length =8)
treeifyBin(tab, hash);
break;
}
if(e.hash == hash && ((k = e.key) == key || (key ! =null && key.equals(k))))
break;
p = e;
}
Copy the code
Can clearly see the judgment to add the key and the chain is the key has been in existence in the table the same way mainly e.h ash = = hash && ((k = e.k ey) = = key | | (key! = null && key.equals(k)), i.e. : 1. Check whether the hash values are equal first. If the hash values are not equal, the check ends directly, because the keys must be different. 2. Check whether the memory addresses of the two key objects are the same. 3. If the key is not null, call the equal() method on the key to determine whether it is equal. It is possible that two keys may have different addresses in memory, but they are equal. like
String a = new String("test");
String b = new String("test");
System.out.println("a==b is "+a==b); / / for printingfalse
System.out.println("a.equals(b) is "+a.equals(b)); / / for printingtrue
Copy the code
background
Suppose we have a KeyObject class, and suppose we consider the property A of two KeyObjects to be equal, then the KeyObject is equal. We use the KeyObject as the key of a HashMap. You cannot repeatedly add values of equal keyObjects and unequal values to a HashMap
public static class KeyObject { Integer a; public KeyObject(Integer a) { this.a = a; }}Copy the code
Assuming that both hashCode() and equals() methods are not overridden (result: HashMap does not guarantee de-weighting)
Execute the following code:
public static void main(String[] args) {
KeyObject key1 = new KeyObject(1);
KeyObject key2 = new KeyObject(1);
System.out.println("The hashCode of key1 is+ key1.hashCode());
System.out.println("The hashCode for key2 is + key2.hashCode());
System.out.println("Key1.equals (key2) equals"+(key1.equals(key2)));
HashMap<KeyObject,String> hashMap = new HashMap<KeyObject,String>();
hashMap.put(key1,"value1");
hashMap.put(key2,"value2"); / / printhashMap
for(KeyObject key :hashMap.keySet()){
System.out.println("KeyObject.a="+key.a+":"+hashMap.get(key)); }}Copy the code
If the hashCode() and equals() methods of the KeyObject are not overridden, then the hashcodes of key1 and key2 are not the same, and the calls to equals() of key1 and key2 are not equal, even if the property a of the KeyObject is 1. This allows both key1 and key2 to exist in the hashMap.
Print result:
Key1 hashCode is728890494Key2 hashCode is1558600329Key1.equals (key2) results infalse
KeyObject.a=1 : value1
KeyObject.a=1 : value2
Copy the code
If you just overwrite the hashCode() method (result: cannot correctly evaluate the equality of the linked list elements and therefore cannot guarantee de-duplication)
Execute the following code:
public static class KeyObject {
Integer a;
public KeyObject(Integer a) {
this.a = a;
}
@Override
public int hashCode(a) {
return a;
}
public static void main(String[] args) {
KeyObject key1 = new KeyObject(1);
KeyObject key2 = new KeyObject(1);
System.out.println("The hashCode of key1 is+ key1.hashCode());
System.out.println("The hashCode for key2 is + key2.hashCode());
System.out.println("Key1.equals (key2) equals"+(key1.equals(key2)));
HashMap<KeyObject,String> hashMap = new HashMap<KeyObject,String>();
hashMap.put(key1,"value1");
hashMap.put(key2,"value2");
for(KeyObject key :hashMap.keySet()){
System.out.println("TestObject.a="+key.a+":"+hashMap.get(key)); }}}Copy the code
Equal () returns true only when the memory addresses of two objects are equal. Although key1 and key2 have the same a property, they are different objects in memory, so key1==key2 will result in false. The equals() method of KeyObject is implemented by default to determine the memory addresses of two objects, so key1.equals(key2) will also be false, so both key-value pairs can be repeatedly added to the hashMap.
Output result:
Key1 hashCode is1Key2 hashCode is1Key1.equals (key2) results infalse
TestObject.a=1 : value1
TestObject.a=1 : value2
Copy the code
If you just override equals() (result: map to different array subscripts in HashMap, no guarantee of de-weighting)
public static class KeyObject {
Integer a;
public KeyObject(Integer a) {
this.a = a;
}
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null|| getClass() ! = o.getClass())return false;
KeyObject keyObject = (KeyObject) o;
return Objects.equals(a, keyObject.a);
}
public static void main(String[] args) {
KeyObject key1 = new KeyObject(1);
KeyObject key2 = new KeyObject(1);
System.out.println("The hashCode of key1 is+ key1.hashCode());
System.out.println("The hashCode for key2 is + key2.hashCode());
System.out.println("Key1.equals (key2) equals"+(key1.equals(key2)));
HashMap<KeyObject,String> hashMap = new HashMap<KeyObject,String>();
hashMap.put(key1,"value1");
hashMap.put(key2,"value2");
for(KeyObject key :hashMap.keySet()){
System.out.println("TestObject.a="+key.a+":"+hashMap.get(key)); }}}Copy the code
Assuming only equals(), the hashCode method will be the default implementation. The exact calculation method depends on the JVM. Will be stored in linked lists under different array subscripts in the hashMap, and will also result in duplicate elements in the hashMap.
The following output is displayed:
Key1 hashCode is1289479439Key2 hashCode is6738746Key1.equals (key2) results intrue
TestObject.a=1 : value1
TestObject.a=1 : value2
Copy the code
conclusion
So when our object is used as a key in a HashMap, or as an element in a HashSet, we must override both the hashCode() and equal() methods, because hashCode affects the array subscript that the key stores and the initial determination of the list elements. Equal () is used as the final criterion for determining whether a key is equal to a key in a linked list.
- So overwriting the hashCode() method alone would result in an incorrect equality judgment with the linked list elements, and thus no guarantee of de-weighting.
- Overriding only equals() results in key-value pairs being mapped to different array subscripts in the HashMap, with no guarantee of de-duplication.