The article is as long as its title, and it took a long time to write. Unless otherwise specified, the source code discussed is JDK 1.7 (it will be compared with 1.8 later).

1: Introduction to HashMap (structure: array + linked list, as shown below)

The storage structure of a HashMap

In the figure, the purple part represents the hash table, also known as the hash array (the default array size is 16; each key-value pair is actually stored in the map’s internal class Entry). Each element of the array is the head node of a singly linked list, and the green linked lists that follow are used to resolve conflicts: if different keys map to the same position in the array, they are placed in that position’s singly linked list using head insertion.

2: How HashMap works (the put and get principles)

2.1 The put principle

1. Obtain the hash value from the key: int hash = hash(key.hashCode())

2. Determine the array index with int i = indexFor(hash, table.length), which is equivalent to i = hash % table length (when the length is a power of 2). If different keys map to the same array position, they are placed in that position’s singly linked list, with the new entry inserted at the head. (A simplified sketch is shown below.)
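To make the two steps concrete, here is a minimal, illustrative sketch of a JDK 1.7-style put path (the class name SimpleHashMap, the simplified hash perturbation, and the fixed capacity are assumptions made for illustration, not the real JDK source):

```java
// Illustrative sketch of a JDK 1.7-style put path; ignores resizing and null keys.
public class SimpleHashMap<K, V> {

    static class Entry<K, V> {
        final K key;
        V value;
        Entry<K, V> next;

        Entry(K key, V value, Entry<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    @SuppressWarnings("unchecked")
    private final Entry<K, V>[] table = new Entry[16];   // default capacity 16

    // Step 1: derive a hash from the key's hashCode()
    static int hash(Object key) {
        int h = key.hashCode();
        return h ^ (h >>> 16);   // spread the high bits (simplified perturbation, not the exact JDK 1.7 one)
    }

    // Step 2: map the hash to an index; equals hash % length when length is a power of 2
    static int indexFor(int hash, int length) {
        return hash & (length - 1);
    }

    public V put(K key, V value) {
        int hash = hash(key);
        int i = indexFor(hash, table.length);
        // Overwrite the value if the key already exists in this bucket's linked list
        for (Entry<K, V> e = table[i]; e != null; e = e.next) {
            if (e.key.equals(key)) {
                V old = e.value;
                e.value = value;
                return old;
            }
        }
        // Head insertion: the new entry becomes the head of the bucket's list
        table[i] = new Entry<>(key, value, table[i]);
        return null;
    }
}
```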

2.2 The get principle

1. Compute the hash to locate the array position, then traverse the linked list at that position, comparing keys with key.equals().
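Continuing the illustrative SimpleHashMap sketch above (again a simplification, not the JDK source), a minimal get that belongs inside that class might look like this:

```java
// Belongs inside the SimpleHashMap sketch above: locate the bucket, then walk its list
public V get(K key) {
    int i = indexFor(hash(key), table.length);
    for (Entry<K, V> e = table[i]; e != null; e = e.next) {
        if (e.key.equals(key)) {            // compare keys with equals()
            return e.value;
        }
    }
    return null;                            // key not present
}
```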

3.1: What happens when hash codes collide (a conflict)?

Head insertion: the new entry is placed at the head of the corresponding linked list.

3.2: Why is it head insertion (why is it designed this way)?

The designers of HashMap assumed that the most recently inserted Entry is the most likely to be looked up, so it is placed at the head (because get() traverses the list from the head).

4.1: What is the default array length for a HashMap?

The default is 16

4.2: Why?

16 was chosen to serve the hash algorithm that maps from key to index (see below).

5.1: What happens when a HashMap reaches the default load factor (0.75)?

The capacity automatically doubles; after the expansion, the position of each key-value pair is recalculated. The length is always a power of 2 (16, 32, 64, ...).

5.2: Why must the length be a power of 2?

If the length is not a power of 2, the result of the bitwise operation is not evenly distributed, which violates the hash algorithm’s requirement of a uniform distribution.

With a length of 16, or any other power of two, length - 1 is a value whose binary bits are all 1. In that case the index is simply the last few bits of the hash code, so as long as the input hash codes are evenly distributed, the resulting indexes are evenly distributed too.
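A small demo of why the power-of-2 length matters (the class name IndexDemo and the sample hash value are made up for illustration):

```java
public class IndexDemo {
    public static void main(String[] args) {
        int hash = 0b1011_0110_1101;            // an arbitrary sample hash value

        // length = 16 -> length - 1 = 0b1111: the index is just the low 4 bits of the hash
        System.out.println(hash & (16 - 1));    // prints 13 (0b1101)

        // length = 15 -> length - 1 = 0b1110: the lowest mask bit is 0,
        // so odd indexes can never be produced and half the buckets go unused
        System.out.println(hash & (15 - 1));    // prints 12 (0b1100)
    }
}
```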

6.1: Is HashMap thread-safe?

It isn’t.

6.2: Why?

Because none of its operations use locks or any other synchronization.

6.3: What problems does that cause in concurrency?

When a HashMap is close to its resize threshold and two or more threads perform put at the same time, both may trigger a resize and rehash. Under concurrent rehashing (with head insertion), the linked list can form a cycle; a subsequent get then falls into an infinite loop and drives CPU usage to 100%.

Note: this infinite-loop issue was fixed in JDK 8. However, HashMap is still not thread-safe; use ConcurrentHashMap in concurrent scenarios.

6.4: How do you detect whether a linked list contains a cycle?

Optimal approach: create two pointers A and B (in Java, two object references) that both point to the first node of the list. Then begin a loop in which pointer A moves forward one node at a time and pointer B moves forward two nodes at a time; after each step, check whether the two pointers point to the same node. If they do, the list has a cycle; if not, continue the loop. (A code sketch follows the analogy below.)

On a circular track, two runners start from the same spot, one faster and one slower. After running for a while, the faster runner will inevitably catch up with and lap the slower one, simply because the track is circular.
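Here is a minimal Java sketch of that fast/slow-pointer check (CycleCheck and its Node class are hypothetical helpers for illustration; they are not HashMap’s internal Entry):

```java
public class CycleCheck {

    static class Node {
        Node next;
    }

    static boolean hasCycle(Node head) {
        Node slow = head;                       // pointer A: one step per iteration
        Node fast = head;                       // pointer B: two steps per iteration
        while (fast != null && fast.next != null) {
            slow = slow.next;
            fast = fast.next.next;
            if (slow == fast) {                 // the fast pointer can only meet the slow one on a "circular track"
                return true;
            }
        }
        return false;                           // fast reached the end, so the list has no cycle
    }

    public static void main(String[] args) {
        Node a = new Node();
        Node b = new Node();
        Node c = new Node();
        a.next = b;
        b.next = c;
        c.next = a;                             // build a 3-node cycle
        System.out.println(hasCycle(a));        // prints true
    }
}
```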

7: What is the difference between HashMap and Hashtable?

Thread safety: HashMap is unsafe; Hashtable is safe.
Efficiency: HashMap is higher; Hashtable is slightly lower.
Default array size: HashMap is 16; Hashtable is 11.
Null keys/values: HashMap allows both; Hashtable allows neither (throws an exception).
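A quick demo of the null-handling difference (the class name NullDemo is made up for illustration):

```java
import java.util.HashMap;
import java.util.Hashtable;

public class NullDemo {
    public static void main(String[] args) {
        HashMap<String, String> map = new HashMap<>();
        map.put(null, null);                    // allowed: one null key, null values
        System.out.println(map.get(null));      // prints null

        Hashtable<String, String> table = new Hashtable<>();
        try {
            table.put(null, "x");               // Hashtable rejects null keys (and null values)
        } catch (NullPointerException e) {
            System.out.println("Hashtable throws NullPointerException");
        }
    }
}
```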

8.0: What should you use if HashMap is not thread-safe and Hashtable performs poorly?

ConcurrentHashMap provides both thread safety and good performance.

8.1: So what is ConcurrentHashMap?

The structure of the ConcurrentHashMap is as follows:

ConcurrentHashMap is an array of Segments. And what is a Segment? A Segment is itself essentially a HashMap.

Like a HashMap, a Segment contains an array of HashEntry objects, each of which is both a key-value pair and the head node of a linked list.

The single Segment structure looks like this:

How many such Segment objects are there in a ConcurrentHashMap? There are 2^N of them (16 by default), kept in an array called segments.

ConcurrentHashMap is, so to speak, a two-level hash table: under one overall hash table there are several sub hash tables. (By analogy, multiple HashMaps make up one ConcurrentHashMap.)

8.2: What about its put and get methods?

The Put method:

1. Hash the input Key to obtain the Hash value.

2. Locate the Segment object using the hash value

3. Obtain the reentrant lock

4. Locate the array position in the Segment using the hash value.

5. Insert or overwrite the HashEntry object.

6. Release the lock.

The Get method:

1. Hash the input Key to obtain the Hash value.

2. Locate the Segment object using the hash value

3. Locate the array position in the Segment using the hash value.

Therefore, compared with HashMap, ConcurrentHashMap requires two-step positioning for both reads and writes: locate the Segment first, then the specific array index within that Segment. (A simplified sketch is shown below.)
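For illustration, here is a highly simplified sketch of the JDK 1.7-style segmented put described by steps 1 to 6 above (MiniSegmentMap is a made-up toy, not the real ConcurrentHashMap source; the real class also uses volatile/Unsafe reads, resizing, and a more careful segment-selection function):

```java
import java.util.concurrent.locks.ReentrantLock;

// Toy sketch of JDK 1.7-style segment locking; not the real ConcurrentHashMap source.
public class MiniSegmentMap<K, V> {

    static class Node<K, V> {
        final K key;
        V value;
        Node<K, V> next;

        Node(K key, V value, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    // Each Segment extends ReentrantLock and owns its own small hash table
    static class Segment<K, V> extends ReentrantLock {
        @SuppressWarnings("unchecked")
        Node<K, V>[] table = new Node[4];

        V put(int hash, K key, V value) {
            lock();                                          // 3. acquire this segment's reentrant lock
            try {
                int i = hash & (table.length - 1);           // 4. locate the slot inside the segment
                for (Node<K, V> e = table[i]; e != null; e = e.next) {
                    if (e.key.equals(key)) {                 // 5a. overwrite an existing entry
                        V old = e.value;
                        e.value = value;
                        return old;
                    }
                }
                table[i] = new Node<>(key, value, table[i]); // 5b. or insert a new entry at the head
                return null;
            } finally {
                unlock();                                    // 6. release the lock
            }
        }
    }

    @SuppressWarnings("unchecked")
    private final Segment<K, V>[] segments = new Segment[16]; // 2^4 segments (default concurrency level)

    public MiniSegmentMap() {
        for (int i = 0; i < segments.length; i++) {
            segments[i] = new Segment<>();
        }
    }

    public V put(K key, V value) {
        int hash = key.hashCode();                           // 1. hash the key
        Segment<K, V> s = segments[(hash >>> 28) & (segments.length - 1)]; // 2. locate the segment by high bits
        return s.put(hash, key, value);
    }
}
```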

9: What is the difference between HashMap and ConcurrentHashMap?

Thread safety: HashMap is unsafe; ConcurrentHashMap is safe.

10.1: ConcurrentHashMap and Hashtable are both thread-safe, so why does the former perform better?

The former uses segment locking: it locks only the Segment object selected by the hash value. If the hash values map to different Segments, inserts can proceed in parallel, which is more efficient, whereas Hashtable locks the entire map.

How to understand parallel insertion: when ConcurrentHashMap puts an element, it does not lock the entire map. Instead, it uses the hash code to determine which Segment the element belongs to and locks only that Segment. Therefore, when multiple threads put at the same time, as long as they hit different Segments, the inserts are truly parallel.

However, to obtain global information about a ConcurrentHashMap (such as its size), all segment locks may need to be acquired before the statistics can be gathered, which is less efficient.

10.2: What problem does the segmented-lock design solve?

Segmented locking is designed to refine the granularity of the lock. When the operation does not need to update the entire array, only a portion of the array is locked.

11: What are the differences between HashMap in JDK 1.7 and JDK 1.8?

1. To speed up queries, the JDK 8 HashMap introduces the red-black tree structure. When the array length reaches the default threshold of 64 and a bucket’s list grows beyond 8 elements, that list is transformed into a red-black tree to improve query efficiency. (The questions then are: what is a red-black tree? A B+ tree? A B tree? A binary search tree? These data structures will be covered in the data structures topic and are not discussed here; the relevant JDK 8 threshold constants are sketched after this list.)

Here is a brief introduction to red-black trees:

A red-black tree is a self-balancing binary search tree with excellent query and insert/delete performance, widely used in associative arrays. In contrast to AVL trees, which require that the absolute height difference between each node’s left and right subtrees (the balance factor) be at most 1, red-black trees relax this condition: they only guarantee that the longest path from the root to a leaf is no more than twice the length of the shortest such path, so the tree stays roughly balanced. This reduces the rebalancing work needed for inserts and deletes; queries on a red-black tree are slightly slower than on an AVL tree, but in most cases the trade-off is clearly worth it given the time saved on inserts and deletes. Okay, I know this may still be confusing, so see my data structures project for more.

2. The resize method is optimized: during expansion the relative order of nodes in the original linked list is preserved, which avoids the infinite-loop problem.
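For reference, the JDK 8 treeify thresholds mentioned in point 1 above are defined in java.util.HashMap with the following values (shown here in a small illustrative holder class):

```java
// Small illustrative holder for the JDK 8 thresholds (values as defined in java.util.HashMap)
class TreeifyThresholds {
    static final int TREEIFY_THRESHOLD = 8;       // a bucket's list is turned into a tree once it grows past this length
    static final int UNTREEIFY_THRESHOLD = 6;     // a tree shrinks back to a list at or below this size during resize
    static final int MIN_TREEIFY_CAPACITY = 64;   // the table must have at least this many buckets before treeifying
}
```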

12: What is the difference between ConcurrentHashMap in JDK 1.7 and JDK 1.8?

The JDK 1.8 implementation abandons the Segment locking mechanism and instead uses a Node array + CAS + synchronized to ensure safe concurrent updates; the underlying storage structure is array + linked list + red-black tree. (A rough sketch of the idea is shown below.)
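As a very rough, illustrative sketch of that idea, empty buckets are claimed with CAS while non-empty buckets are updated under synchronized on the head node (MiniCasMap is a made-up toy that ignores resizing, treeification, and size counting; it is not the real JDK 8 source):

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

// Toy sketch of the JDK 1.8 idea; not the real ConcurrentHashMap source.
public class MiniCasMap<K, V> {

    static class Node<K, V> {
        final K key;
        V value;
        Node<K, V> next;

        Node(K key, V value, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private final AtomicReferenceArray<Node<K, V>> table = new AtomicReferenceArray<>(16);

    public void put(K key, V value) {
        int i = key.hashCode() & (table.length() - 1);
        while (true) {
            Node<K, V> head = table.get(i);
            if (head == null) {
                // Empty bucket: claim it with a CAS; retry the loop if another thread wins the race
                if (table.compareAndSet(i, null, new Node<>(key, value, null))) {
                    return;
                }
            } else {
                // Non-empty bucket: lock only this bucket's head node, not the whole map
                synchronized (head) {
                    if (table.get(i) != head) {
                        continue;                  // the head changed while we waited; retry
                    }
                    for (Node<K, V> e = head; ; e = e.next) {
                        if (e.key.equals(key)) {
                            e.value = value;       // overwrite an existing key
                            return;
                        }
                        if (e.next == null) {
                            e.next = new Node<>(key, value, null);  // append a new node
                            return;
                        }
                    }
                }
            }
        }
    }
}
```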

Java gives us ConcurrentHashMap, and its implementation relies on the Java memory model, so before digging into ConcurrentHashMap we need to understand some of the underlying topics:

Java memory model

Unsafe in Java

CAS in Java

Java synchronizer AQS

ReentrantLock

So instead of diving straight into ConcurrentHashMap, I’ll go through the basics of concurrency step by step: the Java memory model, synchronized, volatile, Unsafe, CAS, AQS, the various locks, and finally the JUC (java.util.concurrent) package.

Let’s start with a mind map of the Java memory model; there is a lot to say about even that one point.

13: So the question is, what is CAS?

CAS also involves the ABA problem, optimistic vs. pessimistic locks, lock-free programming, AQS, and so on; the related content will be updated in [Concurrent programming topics] and is not expanded here.
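Still, as a tiny taste before that topic, CAS is exposed in Java through classes such as AtomicInteger (the class name CasDemo is made up for illustration):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CasDemo {
    public static void main(String[] args) {
        AtomicInteger counter = new AtomicInteger(0);

        boolean swapped = counter.compareAndSet(0, 1);      // succeeds only if the value is still 0
        System.out.println(swapped + " " + counter.get());  // prints "true 1"

        swapped = counter.compareAndSet(0, 2);               // fails: the value is now 1, not the expected 0
        System.out.println(swapped + " " + counter.get());  // prints "false 1"
    }
}
```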

14: What about JDK 1.9?

At a quick glance, there seems to be no significant difference from 1.8, so it is not expanded here... (don’t hold me to that).