This article is participating in the Java Theme Month – Java Debug Notes Event, see the event link for details

Data Structure

First of all, we need to understand the structure of a HashMap

This is the data structure for the HashMap, which uses arrays plus linked lists

Two key values are generated

How is key generated? If you know a little bit about it, but you haven’t seen the source code, you probably think it has something to do with the hashCode method, but what does the source code do

static final int hash(Object key) { int h; return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16); }Copy the code

So what is this function? This thing is technically called a perturbation function, and again, why does it do that? We saw that in this function we did a hashCode, but before we were done, we did xor and shift 16 bits to the right, so the question is, why do we do that?

The key range generated by hashCode is so large [-2147483648, 2147483647] that it is nearly 4 billion in length. Subsequent processing is to reduce this length and make the data more hashed.

The initial size of the HashMap

What is the initial size of a HashMap? I believe some of you will say 16, and many of you don’t know why? Because they haven’t been exposed to expansion. So that gives us another scientific name: initialize capacity, which must be a multiple of two, and will be a multiple of two after expansion, because only a multiple of two -1 will appear in binary as 0001. The following provides a calculation after the expansion of the function source, the scientific name of the calculation threshold size method

static final int tableSizeFor(int cap) { int n = cap - 1; n |= n >>> 1; n |= n >>> 2; n |= n >>> 4; n |= n >>> 8; n |= n >>> 16; return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1; }Copy the code

Capacity expansion also refers to a scientific name load factor. The default load factor is 0.75, which means that the array is filled to 3/4 of the capacity.