I. Map collection of containers
In the source code of collection system, the design of HashMap in Map is the most classic, involving data structure, programming ideas, hash calculation and so on. It is necessary to refer to some ideas of source code in daily development.
- Basic: add and delete elements, container information;
- Advanced: storage structure, capacity, hash;
The API system
In the whole Map and Set API architecture, the most important is the implementation principle of HashMap:
- HashMap: Manages elements based on hash tables;
- LinkedHashMap: Based on HashMap and bidirectional linked lists;
- HashSet: The underlying maintenance of the HashMap structure;
- LinkedHashSet: inherited from HashSet, bidirectional linked list;
So the Map and Set families, with the exception of special apis, rely on HashMap for basic principles, but apply different elements management characteristics to their specific implementations.
Second, data structure
Before looking at HashMap, let’s understand one data structure: the array + linked list structure.
Based on the location of array management elements, the storage of elements forms a linked list structure. Since it is a linked list, it can be a one-way and two-way structure, which needs to be analyzed for specific API. Through this structure, several key information can be obtained:
- Expansion: based on the array, it faces the expansion problem;
- 1. The mechanism for forming a linked list structure;
- Hash: hash value calculation and conflict handling;
3. Explain HashMap
1. Structure encapsulation
Array + linked list (array + linked list)
transient Node<K,V>[] table;
Copy the code
The variable in the array structure in the HashMap is named table and is based on Node
:
static class Node<K.V> implements Map.Entry<K.V> {
final int hash;
final K key;
V value;
Node<K,V> next;
}
Copy the code
Implement the Map.Entry interface, and define the structure variables of the node, and related methods of the node itself.
2. Construction method
Once you know the infrastructure in a HashMap, you can look at its associated constructors and initialize which variables:
No arguments structure
public HashMap(a) {
this.loadFactor = DEFAULT_LOAD_FACTOR;
}
Copy the code
- Float DEFAULT_LOAD_FACTOR = 0.75 f;
- this.loadFactor = DEFAULT_LOAD_FACTOR;
There’s actually one core parameter to focus on:
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; / / 16Copy the code
That is, the default array initial capacity DEFAULT_INITIAL_CAPACITY is 16, and the capacity expansion threshold loadFactor is 0.75, which means that the array will be expanded when the number of elements in the array reaches 12.
Have the cords structure
Of course, you can also set two parameters by using the parameter constructor: capacity and capacity expansion threshold:
public HashMap(int initialCapacity, float loadFactor) ;
Copy the code
The source code for the two constructors shows that when a new HashMap is created directly, the hash array is not initialized immediately, but the key variables can be customized.
3. Load elements
Follow the way HashMap is used to add elements:
public V put(K key, V value) {
return putVal(hash(key), key, value, false.true);
}
Copy the code
Instead of doing too much direct action when putting, two key methods are called:
- Hash () : Computes the hash value of the key.
- PutVal () : element addition process;
Here we must look at a key method, the calculation of hash values:
static final int hash(Object key) {
int h;
return (key == null)?0 : (h = key.hashCode()) ^ (h >>> 16);
}
Copy the code
Instead of directly obtaining the return value of hashCode in Object, calculate the hashCode value corresponding to the key and the value of the hashCode value moved 16 bits to the right, and perform xOR operation on the two results to reduce the probability of hash conflict.
Looking at the putVal() method, this is pretty neat:
Summary of core steps:
- Performs the first judgment and initializes the underlying array;
- Add elements based on the result of hash calculation;
- According to the capacity of added elements to determine whether to expand;
There is one more thing to be said here:
HashMap is based on red-black tree to deal with hash conflicts. If hash conflicts are too many, the impact on O(n) query performance will be very large. When the number of conflicting elements in the linked list of conflicting nodes reaches 8 and the length of array reaches 64, red-black tree structure will be used to replace the linked list to deal with hash conflicts. The related article on tree structure can be moved before.
4. Automatic capacity expansion
The container can constantly add elements within a certain boundary, and its core mechanism is expansion. The expansion of HashMap follows the principle of minimum availability. Of course, when the capacity reaches the threshold, the automatic expansion mechanism will be triggered.
Threshold: Threshold =capacity x loadFactor. The default value is 16 x 0.75=12.
Core method: resize;
Summary of core steps:
- Determine the boundary parameters of capacity expansion: threshold;
- Core parameter calculation: capacity and threshold;
- Create a new empty array based on the new parameters;
- If the original array is null, the process can be understood as initialization.
- If the original array is not NULL, expand capacity and migrate data.
Obviously, array expansion will affect efficiency. Therefore, in daily development, the size of HashMap can be estimated in advance when using HashMap to ensure that the threshold value is greater than the number of stored elements, and multiple expansion operations can be avoided as much as possible.
5. Query elements
GetNode (); hash (); hash (); hash (); hash ();
6. Delete elements
RemoveNode delete method: first, calculate the hash value to find the node to be deleted, and then determine whether the index position is a red-black tree or a linked list structure, and execute their respective deletion processes respectively:
7. Supplementary notes
Here’s a quick note on two methods: hashCode() and equals(). It’s common to override hashCode when overriding equals.
Both of these methods can be used to compare two objects for equality, but there are hash values that conflict. In this case, two objects have hash values that conflict. Equals equals equals equals equals equals equals equals equals equals equals equals equals equals equals equals equals equals equals
In the structure of a HashMap, if the hash value on the linked list is the same, the equals method is used to check whether the specific value is the same, so that the corresponding object can be found.
Source code address
Making address GitEE, https://github.com/cicadasmile/java-base-parent, https://gitee.com/cicadasmile/java-base-parentCopy the code