HashMap is one of the most commonly used data structures in daily work. Its implementation principle is simple: internally it is an array of hash buckets, each of which holds a linked list (or, since Java 8, a red-black tree once a chain grows long), used to store, look up, and delete entries.
If I asked you: what is the time complexity of storing and retrieving data in a HashMap?
I'm sure many of you would blurt out, "O(1), hashing is O(1)."
Is that the right answer? Not quite, or only half true. Why? Because O(1) happens only in the best case, when there are no hash conflicts at all. In that case each hash bucket holds at most one value, so once the hash function is computed, the target element is located instantly.
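To see what a conflict looks like in practice, here is a contrived sketch of my own (the `BadKey` class is made up purely for illustration): a key type whose hashCode is constant forces every entry into one bucket, so lookups have to walk a chain instead of hitting a bucket directly.

```java
import java.util.HashMap;
import java.util.Map;

public class CollisionDemo {
    // A deliberately bad key: every instance hashes to the same value
    static final class BadKey {
        final int id;
        BadKey(int id) { this.id = id; }
        @Override public int hashCode() { return 42; } // constant -> every key collides
        @Override public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id == id;
        }
    }

    public static void main(String[] args) {
        Map<BadKey, Long> map = new HashMap<>();
        for (int i = 0; i < 10_000; i++) {
            map.put(new BadKey(i), (long) i); // all 10,000 entries pile into one bucket
        }
        // The lookup now traverses one long chain/tree rather than jumping straight to the value
        System.out.println(map.get(new BadKey(9_999))); // prints 9999
    }
}
```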
Unfortunately, O(1) is only a theoretical luxury; hash conflicts are the norm.
Hash conflicts have long been a sore point for HashMap and a major obstacle to high performance. In short, the performance of HashMap suffers for the following reasons:
- Hash conflicts put too many elements into a single hash bucket. The time complexity of operating on those elements can degenerate to O(N), and even after the red-black tree improvement in Java 8 it is still O(logN) rather than O(1).
- Expansion. Why does HashMap expand at all? Again, to reduce hash conflicts! But each expansion rebuilds the internal array and rehashes every element, which is itself costly.
Now that the problem is identified, improving the performance of HashMap mainly comes down to the following two aspects:
1. Reduce hash conflicts and make elements more evenly distributed in the hash bucket.
So how do you reduce hash collisions?
The hash function inside a HashMap looks like this:
```java
// n is the length of the internal array; hash is the hashCode of the key
// being added (after HashMap.hash() has spread its high bits)
i = (n - 1) & hash;
```
This bit operation is essentially a modulo operation. Because n is always a power of two in HashMap, it is equivalent to:
```java
i = hash % n;
```
The more sequentially the hashCodes of the added keys increment, the more evenly the remainders, and therefore the elements, are distributed. For example, if the internal array length of a HashMap is 8 and we add elements whose keys have hashCodes 0 through 15, the distribution is perfectly uniform: exactly two elements per bucket.
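A quick sketch to verify this (the printout is illustrative; a real HashMap additionally spreads the high bits via `hash()`, which changes nothing for small values like these):

```java
public class BucketDistribution {
    public static void main(String[] args) {
        int n = 8; // internal array length
        for (int hash = 0; hash <= 15; hash++) {
            // the same index computation HashMap uses
            System.out.println("hash " + hash + " -> bucket " + ((n - 1) & hash));
        }
        // Prints buckets 0..7 twice over: a perfectly even distribution.
    }
}
```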
What is the implication here? In daily development, prefer keys whose hashCodes increment, such as incrementing int values, to minimize hash collisions.
The problem I encountered in my work was something like this:
```java
public enum AttributeType {
    TYPE1(1),
    TYPE2(2),
    TYPE3(3),
    TYPE4(4),
    ;

    final int value;

    AttributeType(int value) {
        this.value = value;
    }
}
```
Such attribute-type enumerations are everywhere in games and can number in the hundreds. We often use attribute types as keys to store computed data, which raises a common choice: which of the following two storage methods is better?
```java
// (1) use the enum's int value as the key
Map<Integer, Long> attributeMap = new HashMap<>();

// (2) use the enum itself as the key
Map<AttributeType, Long> attributeMap = new HashMap<>();
```
According to the theory above, the first is clearly better! Its keys' hashCodes increment (an Integer's hashCode is simply the int value itself), whereas an enum's hashCode is the default identity hash, which is not necessarily incremental, so the distribution is certainly not as uniform as the first. My tests confirmed this guess.
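For reference, a rough micro-benchmark along these lines (this is my own illustrative sketch, not the original test; timing loops with `System.nanoTime()` are crude, and a proper comparison would use a harness such as JMH):

```java
import java.util.HashMap;
import java.util.Map;

public class KeyTypeBenchmark {
    public static void main(String[] args) {
        final int ROUNDS = 1_000_000;

        long start = System.nanoTime();
        for (int r = 0; r < ROUNDS; r++) {
            Map<Integer, Long> byInt = new HashMap<>(16);
            for (AttributeType type : AttributeType.values()) {
                byInt.put(type.value, 1000L); // option (1): incremental int keys
            }
        }
        System.out.println("Integer keys: " + (System.nanoTime() - start) / 1_000_000 + " ms");

        start = System.nanoTime();
        for (int r = 0; r < ROUNDS; r++) {
            Map<AttributeType, Long> byEnum = new HashMap<>(16);
            for (AttributeType type : AttributeType.values()) {
                byEnum.put(type, 1000L); // option (2): enum keys with identity hashCodes
            }
        }
        System.out.println("Enum keys:    " + (System.nanoTime() - start) / 1_000_000 + " ms");
    }
}
```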
Now, some of you might wonder: is it necessary to be this picky? It is, because Map operations are extremely frequent when the game computes battles, and a little optimization here pays off handsomely.
2. When creating a Map, specify the initial capacity to prevent capacity expansion
At work, we often have code like this:
```java
// No initial capacity specified: the map starts at the default size of 16
// and may expand several times once the number of attribute types grows
Map<Integer, Long> attributeMap = new HashMap<>();
for (AttributeType type : AttributeType.values()) {
    long attributeValue = 1000;
    attributeMap.put(type.value, attributeValue);
}
```
If the initial capacity is not specified, multiple expansions may occur, each of which rebuilds the bucket array and rehashes every element, hurting performance. So I recommend specifying the initial capacity whenever you know the number of elements, as follows:
```java
// 0.75 is HashMap's default load factor
int initialCapacity = (int) (size / 0.75f) + 1;
```
This reduces performance costs associated with capacity expansion.
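Putting it together, here is a sketch of the earlier loop with a precomputed capacity (sizing this way guarantees that `size` insertions never trigger a resize, since HashMap expands once the element count exceeds capacity × load factor):

```java
import java.util.HashMap;
import java.util.Map;

public class PresizedMapDemo {
    public static void main(String[] args) {
        int size = AttributeType.values().length;
        // size / loadFactor + 1 keeps the map below its resize threshold
        int initialCapacity = (int) (size / 0.75f) + 1;

        Map<Integer, Long> attributeMap = new HashMap<>(initialCapacity);
        for (AttributeType type : AttributeType.values()) {
            attributeMap.put(type.value, 1000L);
        }
        System.out.println(attributeMap.size() + " entries inserted, no resize needed");
    }
}
```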
The above content is original; please indicate the source when quoting. Criticism and corrections of any inaccuracies are welcome!