1. Proposed several problems of HashMap
Read JDK8 source code small torch certainly in the first read after there will be a doubt, the main question in the following several problems:
2. Why hashCode()?
Hash (Object key); hash(Object key);
static final int hash(Object key) {
int h;
return (key == null)?0 : (h = key.hashCode()) ^ (h >>> 16);
}
Copy the code
2.1 Time of deposit
Once you get the hash, you can directly locate the index in the array with a simple bit operation. O(1) is the fastest way to store the index of an array (if there is a hash conflict, then call equals()).
if ((p = tab[i = (n - 1) & hash]) == null)
tab[i] = newNode(hash, key, value, null);
Copy the code
2.2 Fetching time
(if there is a hash conflict, this is also the fastest way to get an element from equals().)
(first = tab[(n - 1) & hash]) ! =null) {
Copy the code
2.3 hashCode ()
Using hashCode() is the fastest way to access both sides of the plane. No one, if you use the way of traversing the number group to find, the day lily is cold, O(n) brother.
3. Why is the initial capacity 16? Or why, given an initial capacity size in the constructor, does it still modify it to be greater than or equal to the lowest power of two to the NTH?
- Where is the capacity used? This line of code is used when fetching:
tab[i = (n - 1) & hash]
Copy the code
As mentioned above, a simple bit operation is done after getting the hash. Let’s analyze this as an operation:
i = (n - 1) & hash
Copy the code
First give the conclusion:
- In order to keep the subscript from crossing the boundary
- In order to distribute the hash evenly
3.1 In order not to let the subscript out of bounds
When capacity = n to the power of 2, this is the case for n-1 binary. Let’s look at binary: n= 1000… // start with 1 followed by n zeros. The n – 1 = 0111… // Start with 0 and end with 1
The decimal system | N binary | N – 1 binary |
---|---|---|
2 | 10 | 01 |
4 | 100 | 011 |
8 | 1000 | 0111 |
16 | 10000 | 01111 |
32 | 100000 | 011111 |
64 | 1000000 | 0111111 |
128 | 10000000 | 01111111 |
256 | 100000000 | 011111111 |
… | … | … |
(n-1) & hash :& means and. If both values are 1, the result of the ampersand operation can never be greater than the value of (n-1), i.e., never greater than or equal to capacity. The maximum index is Capacity-1.
3.2 To distribute hash evenly
& : The result is 1 only when the corresponding is 1.
I = (n-1) hash = I hash = I hash = I hash = I hash = I hash = I hash = I hash
Counter example: if the value of n is not the form of the n power of 2, n-1 will appear binary end is 0, that is: n-1 = ***0 (front n can be 0 or 1), this will cause a life phenomenon?
The index will never fall on an index ending in 1 below the binary. Half of the indexes are idle, doubling the probability of hash collisions on indexes ending in 0 (binary). There is a loss of efficiency in inserts and queries.
4. TableSizeFor () method parsing
Source:
/** * Returns a power of two size for the given target capacity. */
static final int tableSizeFor(int cap) {
int n = cap - 1;
n |= n >>> 1;
n |= n >>> 2;
n |= n >>> 4;
n |= n >>> 8;
n |= n >>> 16;
return (n < 0)?1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
}
Copy the code
Is greater than or equal to cap to the NTH power of 2:
cap | The return value |
---|---|
1 | 2 |
5 | 8 |
10 | 16 |
22 | 32 |
50 | 64 |
- First, why cap-1:
int n = cap - 1;
Copy the code
The purpose of reassigning cap-1 to n is to find a target value greater than or equal to the original value. For example, the value is 1000 in binary and 8 in decimal. If you don’t subtract 1 from it, you get the answer 10000, which is 16. Obviously not the result. If you subtract 1 from the binary to 111, you get the original value of 1000, or 8. To reanalyze the highlight, here are the lines:
n |= n >>> 1; Said: n = (n | n > > > 1) | or: corresponding for as long as there are several 1Copy the code
- Analysis begins: hypothesis
- The binary value of n is 01XXXXXXXXxxxxxx…
- Move right one bit: 001XXXXXXXXXXXXXX…
- And n or the value of n: 011XXXXXXXXXXxxxx…
- By the same token, the n | = n > > > 2. After the row of n = 01111 XXXXXXXXXXXXXXXXXXXXXXXXXXXX
- By the same token, the n | = n > > > 4. After the row XXXXXXXXXXXXXXXXXXXXXXXX n = 011111111
- By the same token, the n | = n > > > 8; After the row of n = 01111111111111111 XXXXXXXXXXXXXXXX
- By the same token, the n | = n > > > 16; After the row of n = 011111111111111111111111111111111
- If you’re familiar with this, the return is incremented by 1, and you’re back to 100000… , followed by all zeros, that is, this method returns 2 to the NTH power.
4 times 31
In order to make hash more evenly distributed, it is prime.
public int hashCode(a) {
int h = hash;
if (h == 0 && value.length > 0) {
char val[] = value;
for (int i = 0; i < value.length; i++) {
h = 31 * h + val[i];
}
hash = h;
}
return h;
}
Copy the code
Why choose 31 in particular? Please refer to: [JVM] science: why String hashCode methods select number 31 as multiplier https://blog.csdn.net/happydecai/article/details/80493237
Share to this end, there are incorrect please correct.