What is the underlying data structure of HashMap?
In 1.7, the bottom layer of HashMap is hash array and one-way linked list. After jdk1.8, the data structure of array + linked list + red-black tree is adopted.
2. How HashMap works
Put and GET store and get objects. When we pass the key and value to the put() method, we first do a hashCode() calculation of the key to get its position in the bucket array to store the Entry object. When an object is fetched, the bucket location is retrieved by get, the correct key-value pair is found through the equals() method of the key object, and the value object is returned.
3. When using HashMap, what if two objects have the same hashCode?
Since HashCode is the same, not necessarily equal (equals method comparison), the array subscripts of the two objects are the same, and a “collision” occurs. Since HashMap uses a linked list to store objects, this Node will be stored in the list.
4. How to design the hash function of HashMap
The hash function takes the hashCode that passes the key, which is a 32-bit int, and xor the high and low 16 bits of the hashCode. Two benefits: 1. Make sure to minimize hash collisions and scatter them as much as possible; 2, the algorithm must be as efficient as possible, because this is high-frequency operation, so use bit operation;
5. How many HashMap traversal methods are there?
1. The most common way to use an Iterator is to get both a key and a value
2, use foreach (JDK1.8 only)
3. Iterate through the set set for key
6. How many ways can hash conflicts be resolved?
1. Hash again: If the index has already been hash, hash again until an empty index is found. This is the easiest way to think about it. However, there are two disadvantages:
It wastes space and consumes efficiency. The root cause is that the length of the array is fixed, so constantly hashing to find an empty index might overstep the bounds, creating a new array and migrating the data from the old array. As the array gets bigger and bigger, the cost is not trivial.
Can’t get, or the GET algorithm is complicated. In, in, out, not so easy.
2. Open address method: If the hash index already has a value, the algorithm searches for space in several positions before or after it, which is not very different from the rehash algorithm. 3. Create a public overflow zone: Put the conflicting hash values into another overflow zone. 4. Chained address method: Stores hash values that generate hash conflicts in the index position in a linked list. HashMap uses this method. The advantages are that no new space is needed, data is not lost, and addressing is relatively simple. But as the hash chain grows longer, addressing becomes more time-consuming. A good hash algorithm is to keep the chain as short as possible, ideally with only one value on an index. The idea is to keep the hash addresses as evenly distributed as possible and as simple as possible.
7. Describe the put method in HashMap
8. Why convert a list to a red-black tree when the list length is greater than or equal to 8?
Because the average search length of a red-black tree is log(n), the average search length of a red-black tree is 3 when the length is 8, and the average search length of a linked list is 8/2=4 if you continue using the list, it is necessary to convert the list to a red-black tree when the list length is greater than =8.
9. Talk about the process of resize expansion
How is get implemented in hashMap?
Hash the hashCode of the key, and calculate the index of the bucket. If the bucket can be found at the top of the bucket, then return the bucket. Otherwise, search the tree or the linked list.
11, zipper method caused by the linked list is too deep problem why not replace the binary search tree, and choose red black tree? Why not always use red black trees?
The red-black tree was chosen to solve the problem of binary lookup trees, which in special cases can become a linear structure (as with the original linked list structure, causing deep problems), and traversal lookup can be very slow. And red and black tree after inserting new data may need to be by a left-handed, right-handed, color changing these operations to maintain a balance, the introduction of a red-black tree is to find data quickly, solve the problems of the chain table query depth, we know that the red-black tree to balanced binary tree, but in order to maintain the “balance” is the need to pay the price, but the price is the depletion of the resources less than traverse linear list, So when the length is greater than 8, we use a red-black tree, but if the list length is very short, we don’t need to introduce a red-black tree at all, it will be slow.
Red black tree
R-b Tree, full name is red-black Tree, also known as “Red Black Tree”, it is a special binary search Tree. Red-black trees have memory bits on each node to indicate the color of the node, which can be Red or Black.
Red-black tree features: (1) Each node is either black or red. (2) The root node is black. (3) Each leaf node (NIL) is black. [Note: leaf nodes are only empty leaf nodes!] (4) If a node is red, its children must be black. (5) All paths from a node to its descendants contain the same number of black nodes.
Here is a simple red-black tree.
What changes have been made to HashMap in JDK8?
1. In Java 1.8, if the length of a list exceeds 8, the list is converted to a red-black tree. 2. When a hash collision occurs, Java 1.7 inserts 3 at the head of the list, while Java 1.8 inserts 3 at the end of the list. In Java 1.8, Entry is replaced by Node (with a vest).
Can we use any class as a key in HashMap?
If we want to use a custom class as the key of a HashMap, we need to note the following:
If a class overrides equals, it should also override hashCode.
All instances of the class need to follow rules related to equals and hashCode.
If a class doesn’t use UALS, you shouldn’t use it in hashCode.
Our best practice for customizing the key class is to make it immutable so that the hashCode value can be cached for better performance. Immutable classes also ensure that hashCode and equals will not change in the future, which will solve the mutable problem.
What is the difference between HashMap, LinkedHashMap, and TreeMap?
LinkedHashMap is derived from HashMap and is based on HashMap and bidirectional linked lists.
A HashMap disorder; The LinkedHashMap is ordered and can be divided into insert order and access order. In access order, both put and GET operations on existing entries move the Entry to the end of the bidirectionally linked list.
LinkedHashMap accesses data in the same way that HashMap uses Entry[], two-way lists only for order.
LinkedHashMap is thread unsafe.
TreeMap implements the SortMap interface, which sorts the records it saves by key. The default is an ascending sort of key values. You can also specify the sort comparator so that when Iterator traverses a TreeMap, the records are sorted.
16. What is fail-fast?
The fail-fast mechanism is a bug mechanism in Java collections. Fail-fast events can occur when multiple threads operate on the contents of the same collection. For example, when A thread A by the iterator to traverse A collection in the process, if the content of the collection was changed by another thread, then thread A access to the collection, will throw ConcurrentModificationException, produce fail – fast. The operations here mainly refer to add, remove and clear to modify the number of elements in the collection. The workaround suggests that “classes under the java.util package” be replaced with “classes under the java.util package”. It can be understood as follows: ExpectModCount = expectModCount = expectModCount; expectModCount = expectModCount; So throw ConcurrentModificationException.
17. Is HashMap thread safe?
For example, when thread A decides that the index position is empty, it just suspends. Thread B starts to write node data to the index position. Then thread A resumes the scene and performs assignment operation. The data of thread A is overwritten; There is also the ++size this place will also cause multiple threads at the same time expansion and other problems.
How to circumvent it: Use thread-safe ConcurrentHashMap.
18. What is the difference between HashMap and ConcurrentHashMap?
One thread is not safe, one thread is safe.
ConcurrentHashMap before JDK 1.8 uses Segment locking to implement Segment + HashEntry. The default size of Segment array is 16,2 ^ n. After JDK1.8, Node + CAS + Synchronized is adopted to ensure concurrency security, and the lock granularity is reduced.
19. Talk about the locking mechanism in ConcurrentHashMap
In JDK 1.7, the mechanism of Segment locking is adopted to realize concurrent update operations. The underlying storage structure of array + linked list is adopted, including two core static internal classes Segment and HashEntry. The Segment inherits a ReentrantLock that acts as a lock. Each Segment object guards buckets in each hash map. (2) HashEntry is used to encapsulate key-value pairs of the mapping table; ③ Each bucket is a linked list of HashEntry objects
In JDK 1.8, Node + CAS + Synchronized is used to ensure concurrency security. Delete class Segment, use table array to store key-value pairs. When the length of a HashEntry list exceeds TREEIFY_THRESHOLD, the list is converted to a red-black tree to improve performance. The bottom layer changes to array + linked list + red-black tree.