These questions are almost mandatory in an interview

This article is the third in a series of most Common Java interview questions. Main Contents:

Arraylist is similar to LinkedList

ArrayList is different from Vector

The underlying implementation of HashMap

Difference between HashMap and Hashtable

Why is the length of a HashMap a power of 2

HashSet is different from HashMap

ConcurrentHashMap and Hashtable

ConcurrentHashMap Thread-safe implementation/low-level implementation

Summary of the underlying data structure of the collection framework

This article will be updated synchronically in my open source Java learning Guide repository Java-Guide (a cover most Java programmers need to master the core knowledge, is gradually improving step by step, looking forward to your participation), address: github.com/Snailclimb/… , welcome star, issue and PR.

Arraylist is similar to LinkedList

1. Thread safety: ArrayList and LinkedList are not synchronized, i.e. not thread safe;
2. The underlying data structure: Arraylist uses the Object array at the bottom; The underlying LinkedList uses a two-way circular LinkedList data structure;
3. Whether insert and delete are affected by element position: ① Arraylists are arrays, so the time complexity of inserting and deleting elements depends on their location.For example: executeadd(E e)Method, the ArrayList defaults to append the specified element to the end of the list, which is O(1) time. But if you want to insert and delete elements at position I (add(int index, E element)The time is O(n-i). Because all the (n-i) elements after the ith and ith elements in the set are moved backwards/forwards by one bit. 2.LinkedList is stored in a LinkedList, so the time complexity of inserting and deleting elements is not affected by the location of elements, and is approximately O (1) while the array is approximately O (n).
4. Support fast random access:LinkedList does not support efficient random element access, whereas ArrayList implements the RandmoAccess interface, so it has random access. Fast random access is the quick retrieval of an element object by its ordinal number (corresponding toget(int index)Methods).
5. Memory footprint: The space waste of ArrayList is mainly reflected in the amount of space reserved at the end of the List, while the space cost of LinkedList is reflected in the amount of space consumed by each element of ArrayList (due to the storage of direct successors and direct precursors and data).

Added: Bidirectional linked lists for data structure basics

A bidirectional linked list, also known as a doubly linked list, is a type of linked list in which each data node has two Pointers pointing to the direct successor and direct precursor respectively. So, starting from any node in a bidirectional linked list, you can easily access its predecessors and successors. In general, we construct a bidirectional circular LinkedList, as shown in the figure below, and the underlying LinkedList data structure is bidirectional circular LinkedList.

ArrayList is different from Vector

All methods of the Vector class are synchronized. It is possible for two threads to access a Vector safely, but for one thread to access a Vector, the code takes a lot of time to synchronize operations.

Arraylist is not synchronous, so it is recommended when thread safety is not required.

The underlying implementation of HashMap

JDK1.8 before

Before JDK1.8, a HashMap consists of arrays and linked lists. Arrays are the body of a HashMap, while linked lists are used to resolve hash collisions. If the location of the array does not contain a linked list (the next of the current entry points to NULL), the search, add, and other operations are quick, requiring only one address. If the array to be located contains a linked list, the time for the add operation is still O(1) because the latest Entry is inserted into the head of the list, requiring a simple change of the reference chain. For the lookup operation, it is necessary to traverse the list and then compare the search through the equals method of the key object.

The so-called “zipper method” is a combination of linked lists and arrays. That is, create an array of linked lists, where each cell is a linked list. If a hash conflict is encountered, add the value of the conflict to the linked list.

After JDK1.8

Compared to previous versions, JDK1.8 has made a major change in resolving hash conflicts by converting lists to red-black trees when their length is greater than the threshold (8 by default) to reduce search time.

TreeMap, TreeSet, and HashMap after JDK1.8 all use red-black trees at their underlying level. Red-black trees are designed to solve the problem of binary search trees, which degenerate into a linear structure in some cases.

Recommended reading:

“8 series of Java HashMap rethink” : zhuanlan.zhihu.com/p/21673805

Difference between HashMap and Hashtable

Thread safety:HashMap is thread-safe, HashTable is thread-safe; The internal methods of HashTable basically pass throughsynchronizedModification. (Use ConcurrentHashMap if you want to be thread-safe!) ;
Efficiency: Because of thread-safety issues, HashMap is a little more efficient than HashTable. Also, HashTable is largely obsolete, so don’t use it in your code;
Support for Null keys and Null values: In a HashMap, Null can be used as a key. There is only one such key, and there can be one or more keys corresponding to the value Null. A NullPointerException is thrown whenever a key value in a HashTable contains a NULL.
(1) If you do not specify an initial capacity when creating a Hashtable, the default initial capacity of the Hashtable is 11. After each expansion, the original capacity is 2N +1. The default initialization size of a HashMap is 16. After each expansion, the capacity is doubled. (2) If you create a Hashtable with an initial capacity, then the Hashtable will use the given size, and the HashMap will expand it to a power of 2. That is, a HashMap always uses a power of 2 as its hash table size, and I’ll explain why later.
Underlying data structure: HashMap since JDK1.8 has a major change in resolving hash collisions, converting lists to red-black trees to reduce search time when the list length is greater than the threshold (8 by default). Hashtable has no such mechanism.

Why is the length of a HashMap a power of 2

In order for HashMap access to be efficient, there should be as few collisions as possible, that is, data should be distributed as evenly as possible, with each list/red-black tree roughly the same length. The implementation is the algorithm that stores the data in the linked list/red-black tree.

How should this algorithm be designed?

The first thing we might think of is doing % mod. Hash %length== Hash &(leng-1) hash%length==hash&(leng-1) if length is 2 to the n; .” And the binary operation & is more efficient than %, which explains why the length of a HashMap is a power of 2.

HashSet is different from HashMap

ConcurrentHashMap and Hashtable

The main difference between ConcurrentHashMap and Hashtable is in the way thread-safe is implemented.

ConcurrentHashMap in JDK1.7 uses a partitionalized array + linked list. The data structure used in JDK1.8 is the same as that in HashMap1.8: array + linked list/red-black binary tree. The underlying data structure of Hashtable and HashMap before JDK1.8 is similar to that of array + linked list. Array is the body of HashMap, while linked list is mainly used to solve hash conflicts.
Thread-safe approach (important) : (1) In JDK1.7, ConcurrentHashMap segments the entire bucket array. Each lock locks only one part of the container data. When multithreading accesses different data segments in the container, there will be no lock competition and improve the concurrent access rate. (The default allocation of 16 segments is 16 times more efficient than Hashtable.) In JDK1.8, the concept of Segment has been abandoned, but directly using Node array + linked list + red-black tree data structure to achieve concurrency control using synchronized and CAS to operate. The whole thing looks like an optimized, thread-safe HashMap. Although you can still see the Segment data structure in JDK1.8, the attributes have been simplified to make it compatible with older versions. ② Hashtable: Synchronized is an inefficient way to ensure thread safety. When one thread accesses the synchronous method, other threads also access the synchronous method, and may enter the blocking or polling state. For example, if you use PUT to add elements, another thread cannot use PUT to add elements, nor can it use GET. The competition becomes increasingly fierce and the efficiency becomes lower.

A comparison of the two:

Photo: www.cnblogs.com/chengxiao/p…

HashTable:

JDK1.7 ConcurrentHashMap:

ConcurrentHashMap Thread-safe implementation/low-level implementation

JDK1.7 (schematic above)

First, the data is divided into sections for storage, and then each section of data is assigned a lock. When a thread uses the lock to access one section of data, the data of other sections can also be accessed by other threads.

ConcurrentHashMap consists of the Segment array structure and the HahEntry array structure.

Segment implements ReentrantLock, so Segment is a ReentrantLock that acts as a lock. HashEntry is used to store key-value pair data.

static class Segment<K.V> extends ReentrantLock implements Serializable {}Copy the code

A ConcurrentHashMap contains an array of segments. The structure of a Segment is similar to that of a HashMap. It is an array and a linked list structure. Each Segment contains a HashEntry array. When modifying the data in a HashEntry array, you must first acquire the lock on the corresponding Segment.

JDK1.8

ConcurrentHashMap eliminates Segment Segment locking and uses CAS and synchronized to ensure concurrency security. The data structure is similar to the structure of HashMap1.8, array + linked list/red-black binary tree.

Synchronized only locks the first node of the current linked list or red-black binary tree. In this way, as long as hash does not conflict, concurrency will not occur and efficiency will be improved by N times.

Summary of the underlying data structure of the collection framework

Collection

1. List

Arraylist: Object array
Vector: an Object array
LinkedList: a two-way circular list

2. Set

HashSet (unordered, unique) : Implemented based on HashMap, the underlying use of HashMap to store elements
LinkedHashSet: LinkedHashSet inherits from HashSet and is internally implemented through LinkedHashMap. It’s a bit like LinkedHashMap, which is internally implemented based on Hashmap, but with a slight difference.
TreeSet (ordered, unique) : red-black tree (self-balanced sorted binary tree)

Map

A HashMap: In JDK1.8, a HashMap consists of an array, which is the body of a HashMap, and a list, which is primarily used to resolve hash collisions. Transform linked lists into red-black trees to reduce search time
LinkedHashMap: LinkedHashMap inherits from HashMap, so the underlying LinkedHashMap is still based on a pulled hash structure consisting of arrays and linked lists or red-black trees. In addition, LinkedHashMap adds a two-way linked list to the above structure, allowing the above structure to maintain the insertion order of key-value pairs. At the same time, the related logic of access order is realized by the operation of linked list. LinkedHashMap (JDK1.8)
HashTable: An array, which is the body of a HashMap, and a list, which is used to resolve hash conflicts
TreeMap: red-black tree (self-balanced sorted binary tree)