I. Overview of collection framework
1.1 the Collection
1.2 the Map
1.3 Summary of underlying data structure of collection framework
List
- ArrayList: Object[] array
- Vector: Object[] Array
- LinkedList: bidirectional LinkedList (previously circular LinkedList, removed loop in JDK1.7)
Set
- HashSet: Implemented based on HashMap, which is used at the bottom to store elements
- LinkedHashSet: LinkedHashSet is a subclass of HashSet and is internally implemented through LinkedHashMap. It’s a bit like LinkedHashMap, which is internally implemented based on HashMap, but with a slight difference
- TreeSet: red-black tree (self-balanced sorted binary tree)
Map
- HashMap: Before JDK1.8, a HashMap consisted of an array, the body of a HashMap, and a list, the main object of a HashMap, was to resolve hash collisions. JDK1.8 later in solving the hash conflict with the larger changes, when the chain length of the table is greater than the threshold value (the default is 8) (convert list into red and black tree before judgment, if the current is less than 64 the length of the array, so choose to array capacity, rather than into a red and black tree), the list into the red-black tree, in order to reduce the search time
- LinkedHashMap: LinkedHashMap inherits from HashMap, so the underlying LinkedHashMap is still based on a pulled hash structure consisting of arrays and linked lists or red-black trees. In addition, LinkedHashMap adds a two-way linked list to the above structure, allowing the above structure to maintain the insertion order of key-value pairs. At the same time, the related logic of access order is realized by the operation of linked list.
- Hashtable: An array, which is the body of a HashMap, and a list, which is used to resolve hash conflicts
- TreeMap: red-black tree (self-balanced sorted binary tree)
2. Common interview questions
2.1 Features of ArrayList, Vector, and LinkedList
- ArrayList and Vector are both arrays based on the fact that they create a contiguous chunk of memory for storage and therefore support fast random access. The time complexity for inserting and deleting elements is affected by the location of the element. When adding or removing elements at the end of the list, the time complexity is O(1); It’s O(n-i) if you want to insert and delete elements at position I.
- ArrayList and Vector both expand their lists dynamically. ArrayList expands its list by 1.5 times by default, and Vector expands its list by 2 times by default.
- The bottom layer of LinkedList is implemented based on bidirectional list, that is, LinkedList is used for storage. Therefore, the time complexity of element deletion is not affected by the location of the element and is approximately O(1). If it is to insert and delete elements at the specified location I, the time complexity is approximately O(n). Because you need to move to the specified position before you insert.
- ArrayList and LinkedList are asynchronous, that is, not thread-safe; Most methods in Vector are synchronized directly or indirectly and are therefore thread-safe.
2.2 Compare the similarities and differences of HashSet, LinkedHashSet and TreeSet
- HashSet is the main implementation class of the Set interface. The underlying class of HashSet is HashMap, which can store NULL values.
- LinkedHashSet is a subclass of HashSet and can be traversed in the order it was added;
- The underlying TreeSet uses a red-black tree to traverse the order in which elements are added, either naturally or by custom.
2.3 Differences between HashMap and Hashtable
- 1. Thread safety: HashMap is not thread safe, HashTable is thread safe.
- Efficiency: HashMap is a little more efficient than HashTable because of thread-safety issues.
- Support for Null keys and values: A HashMap can store Null keys and values, but there can only be one Null key and multiple Null values. A HashTable does not allow null keys or values; otherwise a NullPointerException will be thrown.
- 4. The difference between the initial capacity and each expansion capacity:
- (1) If you do not specify an initial capacity when creating a Hashtable, the default initial size of the Hashtable is 11. After each expansion, the original capacity is 2n+1. The default initialization size of a HashMap is 16. After each expansion, the capacity is doubled.
- (2) If you create a Hashtable with an initial capacity, then the Hashtable will use the given size, and the HashMap will expand it to a power of 2. That is, a HashMap always uses a power of 2 as the hash table size (to accommodate the implementation of the underlying data structure).
- 5. Underlying data structure: If the length of the list is greater than the threshold (default: 8), the list will be converted to a red-black tree. If the length of the current array is less than 64, the list will be expanded first, rather than converted to a red-black tree. To reduce search time. Hashtable has no such mechanism.
2.4 Differences between HashMap and TreeMap
- TreeMap and HashMap both inherit from AbstractMap, but note that TreeMap also implements NavigableMap and SortedMap interfaces.
- Implementing the NavigableMap interface gives TreeMap the ability to search for elements in a collection.
- Implementing the SortMap interface gives TreeMap the ability to sort elements in a collection by key. The default is to sort by key in ascending order, but we can also specify a comparator for sorting.
- To sum up, TreeMap’s ability to sort elements in a collection by key and search for elements in a collection is more important than HashMap’s.
2.5 How does a HashSet check for duplicates
When you add an object to a HashSet, the HashSet evaluates the object’s Hashcode value to determine where it was added. It also compares the object’s Hashcode value to other added objects. If there is no matching Hashcode, the HashSet assumes that the object is not repeated. But if objects with the same Hashcode value are found, the equals() method is called to check whether objects with hashCode equality are really the same. If they are the same, the HashSet will not let the join succeed.