In the previous article “HashMap source code In detail”, we have combined the source code of HashMap in Java1.8 to comb out data structure, data access, data write, capacity expansion and other operations in detail.

HashMap is the basis of three data structures: HashSet, HashTable, and ConcurrentHashMap. In today’s article, we will introduce the source code of HashSet, HashTable, ConcurrentHashMap based on the “HashMap source in Detail”, and compare them with HashMap.

1 HashTable

HashTable has the closest relationship to HashMap and can be considered a thread-safe version of HashMap.

We still take Java1.8 as an example. After analyzing HashTable, we find that its operations such as read, write and expansion are basically the same as those of HashMap, but all methods are modified with synchronized keyword and changed into synchronous method.

1.1 Source Code Analysis

Instead of analyzing the source code of all methods, we will take the PUT method as an example. Its put methods for external public calls are as follows:

Public synchronized V put(K key, V value) {if (value == null) {throw new NullPointerException(); } // The element to be inserted does not already exist in the dynamic array Entry<? ,? > tab[] = table; Int hash = key.hashcode (); int index = (hash & 0x7FFFFFFF) % tab.length; @SuppressWarnings("unchecked") Entry<K,V> entry = (Entry<K,V>)tab[index]; for(; entry ! = null ; entry = entry.next) { if ((entry.hash == hash) && entry.key.equals(key)) { V old = entry.value; entry.value = value; return old; } // The element to be inserted does not already exist in the dynamic array. return null; }Copy the code

Comparing the HashMap source code we analyzed in the “HashMap source Detail”, we found two main differences.

The first is that the element to be inserted does not exist in the HashTable before the actual insert. In the HashMap, the corresponding operation is actually performed, but it is put into the subfunction of the insert operation, so the difference is negligible.

Second, when calculating the index of the inserted element key, the related hash value and position calculation are not divided into a submethod. This is mainly:

  • Because if you draw a one-step method, the frequency of operation of the submethod is very high, and the operation will often block here, affecting performance.
  • If you split a method out of step, it can cause concurrency problems when different methods are called.

Therefore, the best way to do this is to write it in each method, which is a way to trade space for performance.

Ok, let’s move on to the actual insertion method:

Private void addEntry(int hash, K key, V value, int index) {// similar to modCount++; Entry<? ,? > tab[] = table; if (count >= threshold) { // Rehash the table if the threshold is exceeded rehash(); tab = table; hash = key.hashCode(); index = (hash & 0x7FFFFFFF) % tab.length; } // Creates the new entry. @SuppressWarnings("unchecked") Entry<K,V> e = (Entry<K,V>) tab[index]; tab[index] = new Entry<>(hash, key, value, e); count++; }Copy the code

Where modCount is similar to the version number, which is used for pessimistic locks. If modCount changes during traversal, an error will be reported.

Count is the total number of stored elements. Used as a basis for operations such as capacity expansion.

The rest is the same as the HashMap operation.

1.2 contrast

The main differences between a HashTable and a HashMap are:

  • HashMap is thread-safe, HashTable is thread-safe. HashTable is thread-safe by placing synchronization locks on methods, which makes performance worse.
  • HashMap allows null values to be inserted, whereas HashTable does not. When null is inserted, HashTable throws a NullPointerException.
  • The default array size for HashMap is 16, and the default array size for HashTable is 11. The expanded capacity of HashMap is 2N, and the expanded capacity of HashTable is 2n+1, so that elements are more evenly distributed.

2 ConcurrentHashMap

Because HashTable is thread-safe, based on a synchronous approach, it is inefficient and therefore rarely used. HashMap does not support concurrent operations. So what do people use for concurrency? It’s called ConcurrentHashMap in this video.

The synchronization method in HashTable actually locks the entire HashTable object, and any operation locks the entire object. As a result, performance will be poor when operations become larger, or when the HashTable becomes larger.

ConcurrentHashMap takes a different approach and segments the entire array. Then each segment is protected synchronously, and only a small segment of data is locked at a time. As long as multiple operations are distributed on different segments, they can be safely performed concurrently. Thus improved performance.

2.1 Source Code Analysis

The source code for ConcurrentHashMap is more complex, but similar to the idea of HashMap, it adds segments. The default number of segments is 16. This means that up to 16 concurrent operations can be performed on different segments without causing collisions or blocking. Furthermore, the number of segments is not allowed to be modified after initial use.

Within each segment, it’s more like a HashMap. Initialization and expansion operations are performed for each segment. Arrays in each segment are expanded independently and may be of different sizes. Therefore, ConcurrentHashMap is a group of hashMaps aggregated together.

When inserting and reading, the corresponding segment is found first, and then the operation is carried out in the segment. The operation within a section is similar to a HashMap. For details, see HashMap source Code.

2.2 the characteristics of

We summarize the features of ConcurrentHashMap:

  • Thread-safe. In addition, the internal partition locking strategy is adopted, which is more efficient than HashTable.
  • As with HashTable, null values are not allowed to be stored.

3 HashSet

Why does analyzing hashMaps and related maps say HashSet? Because HashSet is based on HashMap!

First, we see that a HashMap is introduced in the HashMap:

private transient HashMap<E,Object> map;
Copy the code

The value we store in the HashSet is actually stored in the key of the HashMap, and the value is filled with the following objects:

private static final Object PRESENT = new Object();
Copy the code

Take a look at the initialization of a HashSet:

public HashSet(int initialCapacity, float loadFactor) {
    map = new HashMap<>(initialCapacity, loadFactor);
}
Copy the code

Create a HashMap. So everything else is based on HashMap. We won’t go into details.

4 summarizes

In this article, we analyze the source code of HashTable, ConcurrentHashMap and HashSet on the basis of the previous HashMap, and summarize their respective characteristics:

  • HashTable is thread-safe. The principle is to place synchronization locks on methods, so performance is worse.
  • ConcurrentHashMap is thread-safe. In addition, the internal partition locking strategy is adopted, which is more efficient than HashTable.
  • HashSet is implemented based on HashMap.