Let’s start by looking at the concurrent containers.

With so many containers available, what should we do? We can’t use them all, but we can survey them and choose the ones that suit our needs.

Concurrent Map

We all know that HashMap cannot be used in concurrent scenarios: prior to JDK8, using HashMap concurrently could trigger an infinite loop that drove CPU usage sky-high, and capacity expansion (rehashing) was the root cause of that loop. Although JDK8 fixed the infinite-loop problem in HashMap expansion, data loss and inconsistent results can still occur under high concurrency.

See my previous article for an analysis of the HashMap expansion infinite loop.

The thread-safe Map implementations include Hashtable, ConcurrentHashMap, and ConcurrentSkipListMap.

Hashtable and ConcurrentHashMap are both hash-table based, like HashMap, and have the advantage when accessing small amounts of data. ConcurrentSkipListMap is based on a skip list and is characterized by an average time complexity of O(log n), which makes it suitable for scenarios that access large amounts of data.

Hashtable VS ConcurrentHashMap

Hashtable adds synchronized to every method that reads or modifies data. Therefore, in high-concurrency scenarios, read and write operations contend heavily for the same lock, which imposes a performance cost on the system.

ConcurrentHashMap provides better concurrent performance while remaining thread-safe. In JDK7, ConcurrentHashMap used segment locking to reduce lock granularity and thereby optimize concurrent operations. In JDK8, ConcurrentHashMap was reworked to drop the Segment concept altogether. Since synchronized performance has improved greatly since Java 6, JDK8 went back to synchronized, locking at the granularity of a single bucket’s head node. This change makes the data structure simpler and the operations clearer.

Take the put method in JDK8 as an example: when adding an element, CAS is used if there is no hash conflict; if there is a conflict, the bucket’s list is locked with synchronized and the remaining steps proceed under that lock.
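
To make the idea concrete, here is a minimal toy sketch of that pattern. It is not the JDK source; MiniConcurrentMap and all of its members are made-up names, and resizing, treeification, and size tracking are omitted. It CASes a new node into an empty bucket and falls back to synchronizing on the bucket’s head node when a conflict occurs:

import java.util.concurrent.atomic.AtomicReferenceArray;

// Toy illustration of "CAS on an empty bucket, synchronized on the head
// node when there is a conflict" (hypothetical class, not the JDK source).
class MiniConcurrentMap<K, V> {
    private static final int CAPACITY = 16;

    private static final class Node<K, V> {
        final K key;
        volatile V value;
        volatile Node<K, V> next;
        Node(K key, V value) { this.key = key; this.value = value; }
    }

    private final AtomicReferenceArray<Node<K, V>> table =
            new AtomicReferenceArray<>(CAPACITY);

    public void put(K key, V value) {
        int i = (key.hashCode() & 0x7fffffff) % CAPACITY;
        for (;;) {
            Node<K, V> head = table.get(i);
            if (head == null) {
                // Empty bucket: try to install the node with CAS, no lock needed
                if (table.compareAndSet(i, null, new Node<>(key, value)))
                    return;
                // Lost the CAS race to another thread; retry
            } else {
                // Hash conflict: lock only this bucket's head node
                synchronized (head) {
                    if (table.get(i) != head)
                        continue; // head changed before we locked; retry
                    for (Node<K, V> n = head; ; n = n.next) {
                        if (n.key.equals(key)) { n.value = value; return; }
                        if (n.next == null) { n.next = new Node<>(key, value); return; }
                    }
                }
            }
        }
    }
}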

Although the overall performance of ConcurrentHashMap is better than that of Hashtable, ConcurrentHashMap cannot replace Hashtable in every scenario. For example, ConcurrentHashMap is not suitable where strong consistency is required: methods such as get and size do not take locks, so the data they return may not be exact. ConcurrentHashMap is weakly consistent, so a read may not immediately see data that was just written.

ConcurrentHashMap VS ConcurrentSkipListMap

We all know that when a bucket’s linked list in ConcurrentHashMap grows long enough, it is converted to a red-black tree. Under concurrency, deleting from or inserting into a red-black tree triggers rebalancing, which touches a large number of nodes, so the cost of competing for the lock is relatively high.

ConcurrentSkipListMap, which is implemented on top of a skip list, needs to lock fewer nodes and performs better in high-concurrency scenarios.
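
As a quick usage sketch (the class and variable names below are made up for illustration), note that besides thread safety, the skip-list structure keeps keys ordered, so navigation calls such as firstKey and headMap come for free:

import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

public class SkipListDemo {
    public static void main(String[] args) {
        ConcurrentSkipListMap<Integer, String> map = new ConcurrentSkipListMap<>();
        map.put(3, "three");
        map.put(1, "one");
        map.put(7, "seven");
        System.out.println(map.firstKey());        // 1 (smallest key)
        // All entries with keys strictly less than 7
        ConcurrentNavigableMap<Integer, String> lessThan7 = map.headMap(7);
        System.out.println(lessThan7.keySet());    // [1, 3]
    }
}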

A skip list is a special structure built by extending a linked list. Somewhat like a tree, a skip list maintains not only the horizontal linked list but also hierarchical indexes in the vertical direction. A skip list consists of several layers of linked lists, each of which is an ordered index over the layer below. Only the lowest layer contains all the data; each index node links down to the node with the same value in the layer beneath it. The amount of data shrinks layer by layer going up, so only a small subset of the data remains at the top.

The skip list trades space for time to improve query efficiency. A query always starts at the top layer and narrows the search range by comparing element values as it descends.

When inserting data into the skip list, the Java implementation computes a random level value probabilistically and then builds the index layers accordingly. The code here is quite interesting:


// Generate a random number
int rnd = ThreadLocalRandom.nextSecondarySeed();

// 0x80000001 in binary is 10000000000000000000000000000001
// Only when the random number is positive and even do we consider adding index levels
if ((rnd & 0x80000001) == 0) {
  int level = 1, max;
  // Starting from the second-lowest bit of the random number, each consecutive 1 bit increments the level
  while (((rnd >>>= 1) & 1) != 0)
      ++level;

  Index<K,V> idx = null;
  HeadIndex<K,V> h = head;
  // The number of layers must be less than or equal to the maximum level of the current hop table
  if (level <= (max = h.level)) {

      // Build the index nodes from level 1 upward (z is the newly inserted data node)
      for (int i = 1; i <= level; ++i)
          idx = new Index<K,V>(z, idx, null);
  }
  else {
    // Add one more layer on top of the existing table
    level = max + 1;
    // ... the code that rebuilds the index layers is omitted
  }
}

When a new node with key 7 is added, the node is first inserted into the bottom linked list, the level value is computed probabilistically, the index layers are created according to that level, and finally the new index nodes are linked in. Both inserting the new node and linking the indexes are implemented with CAS operations.

When a key is deleted, the node to be deleted is first located and its value is set to null via CAS. A marker node is then appended immediately after the node to be deleted, in order to reduce concurrent conflicts. Next, the predecessor of the node to be deleted is pointed directly at its successor, skipping over it; the unlinked nodes are eventually reclaimed by the JVM’s garbage collector. Finally, the skip list checks whether an index layer has become empty after the deletion; if so, the skip list’s level is reduced.

From the above analysis we can conclude: use ConcurrentHashMap when we do not need exact, strongly consistent answers from the map, and Hashtable when we do. If the amount of data is very large and there are a lot of insert, delete, update, and query operations, ConcurrentSkipListMap is a good choice.

Note that none of these three thread-safe Map classes allows null keys or values, whereas HashMap allows null for both key and value.
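
A quick demo of this difference (NullKeyDemo is a made-up name for illustration):

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class NullKeyDemo {
    public static void main(String[] args) {
        Map<String, String> hashMap = new HashMap<>();
        hashMap.put(null, null);   // fine: HashMap allows a null key and null values

        Map<String, String> chm = new ConcurrentHashMap<>();
        try {
            chm.put(null, "x");    // throws NullPointerException
        } catch (NullPointerException e) {
            System.out.println("ConcurrentHashMap rejects null keys");
        }
    }
}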

Concurrent List

Vector, as is well known, uses the synchronized keyword on all of its public methods to ensure thread safety, so in scenarios where reads far outnumber writes, Vector suffers heavy lock contention, which imposes a performance cost on the system.

CopyOnWriteArrayList performs reads without locking, while writes operate on a fresh copy of the underlying array: a read/write-separation concurrency strategy. Let’s see how the source code implements it:


private transient volatile Object[] array;

public E get(int index) {
  return get(getArray(), index);
}
private E get(Object[] a, int index) {
    return (E) a[index];
}

As you can see, the read operation is unremarkable: it simply reads an element out of the array.

Now look at write operations

public E set(int index, E element) {
  final ReentrantLock lock = this.lock;
  // exclusive lock lock
  lock.lock();
  try {
    // Get the original array of stored data
    Object[] elements = getArray();
    E oldValue = get(elements, index);

    // Set new data
    if (oldValue != element) {
      int len = elements.length;

      // Copy the old data to a new array
      Object[] newElements = Arrays.copyOf(elements, len);

      // Inserts the data into the new array, and sets the new array to store the data
      newElements[index] = element;
      setArray(newElements);
    } else {
      // Reset data to ensure volatile write semantics
      setArray(elements);
    }
    return oldValue;
  } finally {
    lock.unlock();
  }
}

So when CopyOnWriteArrayList needs to write data, it copies the underlying array, writes into the new copy, and then swaps the new array in as the backing array.

Therefore, CopyOnWriteArrayList is best suited to scenarios where reads vastly outnumber writes and the business does not need to see written data in real time, only eventually.
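
Here is a small sketch of the snapshot semantics this produces (CowDemo is a made-up name): an iterator created before a write keeps seeing the old array, so readers are never blocked by writers:

import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class CowDemo {
    public static void main(String[] args) {
        List<String> list = new CopyOnWriteArrayList<>(new String[] {"a", "b"});
        Iterator<String> it = list.iterator(); // snapshot of the current array
        list.add("c");                         // writes go to a fresh copy
        while (it.hasNext())
            System.out.print(it.next() + " "); // prints "a b", not "a b c"
    }
}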

Concurrent Set

CopyOnWriteArraySet is implemented on top of CopyOnWriteArrayList, and ConcurrentSkipListSet on top of ConcurrentSkipListMap, so the trade-offs between the two follow from their respective backing implementations. I won’t repeat the explanation here.

Concurrent Queue

The most complicated part of the Queue family in Java’s concurrency package is not that the implementations are complex, but that there are many similar classes that are hard to keep apart. They can, however, be classified along two dimensions.

One dimension is blocking versus non-blocking. Blocking means that an enqueue operation blocks when the queue is full, and a dequeue operation blocks when the queue is empty.

The other dimension is single-ended versus double-ended. Single-ended means elements can only be enqueued at the tail and dequeued at the head, while double-ended means elements can be enqueued and dequeued at both ends.

In Java’s concurrency package, blocking queues are identified by the keyword Blocking in the class name, single-ended queues by Queue, and double-ended queues by Deque.

Combining the two dimensions, queues can be subdivided into four categories.

Single-ended blocking queue

A single-ended blocking queue typically holds an internal queue, which can be an array (ArrayBlockingQueue) or a linked list (LinkedBlockingQueue). It may even hold no queue at all (SynchronousQueue), in which case a producer’s enqueue must wait for a consumer’s dequeue.

Note that the queue blocks only when put/take are called; add and offer do not block (add throws an exception and offer returns a failure result when the operation cannot proceed).

ArrayBlockingQueue implements the queue with an array and uses ReentrantLock and Condition to implement blocking.
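
A minimal producer/consumer sketch with a bounded ArrayBlockingQueue (the class and variable names below are made up for illustration): put blocks once the queue is full, and take blocks when it is empty:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BoundedQueueDemo {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(2);

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < 5; i++) {
                    queue.put(i); // blocks once 2 elements are already waiting
                    System.out.println("produced " + i);
                }
            } catch (InterruptedException ignored) { }
        });

        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < 5; i++) {
                    System.out.println("consumed " + queue.take()); // blocks when empty
                }
            } catch (InterruptedException ignored) { }
        });

        producer.start();
        consumer.start();
        producer.join();
        consumer.join();
    }
}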

LinkedBlockingQueue implements the queue with a linked list; blocking is likewise implemented with ReentrantLock and Condition. It differs from ArrayBlockingQueue in that LinkedBlockingQueue uses separate ReentrantLocks for put and take, which is possible because put appends elements at the tail while take removes elements from the head.

SynchronousQueue also appears in thread pools as a blocking queue, used to hand elements directly from one thread to another: one thread must produce the data before the other can consume it. Although it has no internal buffer, if too many threads arrive in the same mode (all producing or all consuming), they are queued up and blocked, so it works best when production and consumption rates are roughly equal.
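
A tiny hand-off sketch (HandOffDemo is a made-up name): put returns only once another thread has taken the element:

import java.util.concurrent.SynchronousQueue;

public class HandOffDemo {
    public static void main(String[] args) throws InterruptedException {
        SynchronousQueue<String> queue = new SynchronousQueue<>();
        Thread consumer = new Thread(() -> {
            try {
                System.out.println("got " + queue.take());
            } catch (InterruptedException ignored) { }
        });
        consumer.start();
        queue.put("hello"); // blocks until the consumer has taken it
        consumer.join();
    }
}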

PriorityBlockingQueue supports dequeuing elements by priority; its underlying implementation is PriorityQueue + ReentrantLock. For PriorityQueue itself, you can refer to my PriorityQueue source-code analysis.

DelayQueue is a blocking queue that supports delayed retrieval of elements, implemented with PriorityQueue + ReentrantLock; its elements must implement the Delayed interface. When creating an element you can specify how long until it may be taken from the queue, and it can only be retrieved once the delay expires. We can use it to implement timed scheduling and caching systems.
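
A minimal sketch of a delayed task (Task, fireAt, and the other names here are made up for illustration): elements implement Delayed, and take only returns an element once its delay has expired:

import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

public class DelayDemo {
    static class Task implements Delayed {
        final String name;
        final long fireAt; // absolute time in milliseconds

        Task(String name, long delayMs) {
            this.name = name;
            this.fireAt = System.currentTimeMillis() + delayMs;
        }

        @Override
        public long getDelay(TimeUnit unit) {
            return unit.convert(fireAt - System.currentTimeMillis(), TimeUnit.MILLISECONDS);
        }

        @Override
        public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.MILLISECONDS),
                                other.getDelay(TimeUnit.MILLISECONDS));
        }
    }

    public static void main(String[] args) throws InterruptedException {
        DelayQueue<Task> queue = new DelayQueue<>();
        queue.add(new Task("after-1s", 1000));
        queue.add(new Task("after-100ms", 100));
        System.out.println(queue.take().name); // "after-100ms", after ~100ms
        System.out.println(queue.take().name); // "after-1s", once its delay expires
    }
}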

LinkedTransferQueue is a combination of LinkedBlockingQueue and SynchronousQueue and performs better than LinkedBlockingQueue. Its underlying data structure is a dual queue. When a consumer thread tries to take an element and the queue is empty, it enqueues a node whose element is null and then parks. When a producer thread arrives later and finds a node with a null element at the head, it does not enqueue; instead, it fills that node’s element and unparks the consumer thread waiting on it, and the awakened consumer retrieves the data from the node. Conversely, when a producer inserts an element into an empty queue, it enqueues a data node and returns immediately without blocking.
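
The transfer method exposes this hand-off directly; a small sketch (TransferDemo is a made-up name):

import java.util.concurrent.LinkedTransferQueue;

public class TransferDemo {
    public static void main(String[] args) throws InterruptedException {
        LinkedTransferQueue<String> queue = new LinkedTransferQueue<>();
        Thread consumer = new Thread(() -> {
            try {
                System.out.println("got " + queue.take());
            } catch (InterruptedException ignored) { }
        });
        consumer.start();
        queue.transfer("hello"); // returns only after a consumer receives the element
        consumer.join();         // note: put/offer would enqueue without waiting
    }
}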

Double-ended blocking queue

The double-ended blocking queue implementation is LinkedBlockingDeque, which is similar to LinkedBlockingQueue, except that it is double-ended.

Single-ended non-blocking queue

The implementation is ConcurrentLinkedQueue, which inserts data at the tail of the queue and fetches data from the head. CAS is used to update head and tail, so threads never block; however, under heavy contention, CPU cycles can be wasted spinning on failed CAS attempts.
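
A short sketch of the non-blocking behavior (NonBlockingDemo is a made-up name): poll never blocks, it simply returns null when the queue is empty:

import java.util.concurrent.ConcurrentLinkedQueue;

public class NonBlockingDemo {
    public static void main(String[] args) {
        ConcurrentLinkedQueue<String> queue = new ConcurrentLinkedQueue<>();
        queue.offer("a");                  // CAS-based insert at the tail
        System.out.println(queue.poll());  // "a", taken from the head
        System.out.println(queue.poll());  // null: empty, but no blocking
    }
}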

Double-ended non-blocking queue

The implementation is ConcurrentLinkedDeque. It is similar to ConcurrentLinkedQueue, except that it is a double-ended queue.

From the above analysis, we can also see that the blocking queues use locks and block threads, while the non-blocking queues use CAS, which does not block threads but can burn excessive CPU under high contention.

When using queues, pay special attention to whether the queue is bounded (that is, whether the internal queue has a capacity limit). In practice, unbounded queues are generally not recommended, because large data volumes can easily lead to an OOM.

Of the queues mentioned above, only ArrayBlockingQueue and LinkedBlockingQueue (and the latter’s deque counterpart, LinkedBlockingDeque) accept a capacity bound, so when using the other, unbounded queues, be sure to consider whether there is a risk of OOM.
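
A capacity sketch (BoundedConstructionDemo is a made-up name): pass an explicit bound wherever the constructor allows it:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class BoundedConstructionDemo {
    public static void main(String[] args) {
        BlockingQueue<String> array = new ArrayBlockingQueue<>(1024);   // always bounded
        BlockingQueue<String> linked = new LinkedBlockingQueue<>(1024); // bounded here
        // new LinkedBlockingQueue<>() would default to Integer.MAX_VALUE,
        // which is effectively unbounded and can risk OOM under load.
        System.out.println(array.remainingCapacity() + " " + linked.remainingCapacity());
    }
}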

Final thoughts

After analyzing the features and implementations of these containers, I’m sure you can make your own choice.