Use of concurrent containers: Identifying the optimal containers for different scenarios Map containers for concurrent scenarios Suppose we now design a simple function for an e-commerce system to count the TOP 10 items sold. Normally, we use a hash table to store key-value pairs of items and sales, and then use a sort to get the top 10 items sold. Here, the hash table is the key to this functionality. So think, if you were to design this feature, which container would you use?

Never use HashMap in concurrent scenarios. Prior to JDK1.7, the use of HashMap in concurrent scenarios resulted in an infinite loop that led to high CPU utilization, and capacity expansion was the main cause of the infinite loop. While Java has fixed the dead-loop problem caused by HashMap expansion in JDK1.8, data loss and inaccuracies can still occur in high concurrency scenarios.

To ensure thread-safe containers, Java implements Hashtable, ConcurrentHashMap, and ConcurrentSkipListMap.

Hashtable and ConcurrentHashMap are implemented based on HashMap and have advantages for accessing small amounts of data.

ConcurrentSkipListMap is based on TreeMap design principles, but the former is based on the hop table implementation, the latter is based on red black tree implementation, ConcurrentSkipListMap features O (log (n)) average access time complexity, It is applicable to scenarios where large amounts of data are accessed. The most common cache is based on skip tables.

Going back to the original example, if the total quantity of goods in the e-commerce system is not very large, we can use Hashtable or ConcurrentHashMap to implement the function of Hashtable.

Hashtable 🆚 ConcurrentHashMap If data is constantly written and deleted and there is no data accumulation or sorting, you can use Hashtable or ConcurrentHashMap.

Hashtable uses Synchronized locks to modify methods such as PUT, get, and remove. Therefore, in high concurrency scenarios, read and write operations have a large number of lock contention, resulting in system performance overhead.

Compared to Hashtable, ConcurrentHashMap provides better concurrency performance while maintaining thread safety. In JDK1.7, ConcurrentHashMap uses segmental lock segments to reduce lock granularity and ultimately optimize concurrent lock operations.

In JDK1.8, ConcurrentHashMap made a number of changes to get rid of the concept of Segment. Since Synchronized locks have greatly improved performance since Java6, Java has reenabled Synchronized synchronization in JDK1.8, implementing HashEntry as a lock granularity through Synchronized. This change makes the data structure simpler and the operation clearer and smoother.

Like the PUT method in JDK1.7, CAS is used to add elements in JDK1.8 without hash conflicts. If there is a conflict, the list is locked with Synchronized and the next steps are performed.

To sum up, we preferred ConcurrentHashMap when designing the TOP10 selling features.

However, it is important to note that ConcurrentHashMap does not replace Hashtable in some scenarios, although it performs better overall than Hashtable. For example, ConcurrentHashMap is not applicable in strongly consistent scenarios, because methods such as get and size in ConcurrentHashMap do not use locks, and ConcurrentHashMap is weakly consistent. Therefore, a read may fail to obtain the written data immediately.

ConcurrentHashMap 🆚 ConcurrentSkipListMap ConcurrentHashMap The linked list is converted to a red-black tree when there is a large amount of data. In the case of red-black tree concurrency, there is a balanced process between deletion and insertion, which involves a large number of nodes, so the cost of competing for lock resources is relatively high.

Skip lists operate locally and have fewer nodes to lock, so performance is better in concurrent scenarios. In a non-thread-safe Map, there is no SkipListMap based on a skip list implementation, because in a non-thread-safe Map, TreeMap based on a red-black tree implementation does not perform worse in a single thread than a skip list.

TreeMap container is used to access big data in a non-thread-safe Map container. The SkipListMap container is used to access large data in a thread-safe Map container.

What is a skip list? A skip list is a special type of list based on an extended implementation of a linked list. Similar to a tree implementation, a skip list implements not only a horizontal list but also a hierarchical index in a vertical direction.

A skip list is composed of several layers of linked lists, and each layer implements an ordered linked list index. Only the lowest layer contains all the data, and each layer points to elements with the same value on the upper layer through a pointer from bottom to top. The data on each layer is reduced successively, and only part of the data will be retained when it reaches the top layer.

The structure of skip list improves the query efficiency by using the method of space for time. The program always starts at the top of the query access, by determining the element value to narrow the query.

If data is strongly consistent, use Hashtable. Use ConcurrentHashMap when most scenarios are generally weakly consistent. ConcurrentSkipListMap can be used if the data volume is in the tens of millions and there are a lot of add, delete, or modify operations.

ArrayList is a non-thread-safe container, and its use in concurrent scenarios is likely to cause thread-safety problems. At this point, consider using the thread-safe arrays that Java provides for concurrent programming, including Vector and CopyOnWriteArrayList.

Vector is also thread-safe based on Synchronized. The Synchronized keyword modifiers almost all exposed methods. Therefore, in operation scenarios where reading is much larger than writing, a large number of lock competitions will occur in Vector, which will bring performance overhead to the system.

In contrast, CopyOnWriteArrayList is a method provided by the java.util.Concurrent package that implements lockless reads and writes by operating on a new copy of the underlying array, a concurrent strategy of read-write separation.

The blacklist is an operation that reads far more than writes. We can update the list at a certain idle time.

This scenario does not require real-time access to write data, so we just need to ensure that the user ID is eventually retrieved from the write array, and CopyOnWriteArrayList, a concurrent array container, is best suited for this scenario.

conclusion