General introduction
HashSet and HashMap are explained together because they have the same implementation in Java, and the former is just a wrapper around the latter, meaning that a HashSet has a HashMap (adapter pattern) inside it. Therefore, this article will focus on HashMap.
Unlike TreeMap, the container does not guarantee the order of the elements. The container may rehash the elements as needed, and the order of the elements may be reshuffled. Therefore, the order of iterating over the same HashMap at different times may be different. Hash tables can be implemented in two ways, depending on how conflicts are handled. One is Open Addressing, and the other is Separate chaining with linked lists. The Java HashMap uses a collision linked list approach.
It is easy to see from the figure above that the put() and get() methods can be completed in constant time if the appropriate hash function is chosen. But when iterating over a HashMap, you need to traverse the entire table and any conflicting linked lists that follow. Therefore, it is not advisable to set the initial size of HashMap too large for scenarios with frequent iterations.
There are two parameters that affect the performance of a HashMap: inital capacity and load factor. The initial capacity specifies the initial table size, and the load factor specifies the threshold for automatic expansion. When the number of entries exceeds capacity*load_factor, the container automatically expands and hashes again. For scenarios with a lot of inserts, setting the initial capacity to be large can reduce the number of rehashes. There are two methods of special concern when putting objects into a HashMap or HashSet: hashCode() and equals(). The hashCode() method determines which bucket objects will be placed in, and when multiple objects have conflicting hash values, equals() determines whether they are “the same object.” So, if you want to put custom objects into a HashMap or HashSet, you need the @override hashCode() and equals() methods.
Methods analyze theget()The get(Object key) method returns the corresponding value based on the specified key value. This method calls getEntry(Object key) to get the corresponding entry, and then returns entry.getValue(). So getEntry() is the heart of the algorithm. The idea is to first hash() the index of the corresponding bucket, and then iterate through the conflicting list to determine whether it is the desired entry using key.equals(k).Hash (k)&(table.leng-1) is equivalent to hash(k)%table.length because HashMap requires table.length to be an exponent of 2. Hash (k) will wipe out all of the high hash value, and what’s left is the remainder.
//getEntry() final Entry<K,V> getEntry(Object key) {...... int hash = (key == null) ? 0 : hash(key); for (Entry<K,V> e = table[hash&(table.length-1)]; // Get conflicting linked list e! = null; E = e.next) {// Iterate through each entry in the conflicting list; / / on the basis of the equals () method to determine whether is equal to the if (e.h ash = = hash && ((k = e.k ey) = = key | | (key! = null && key.equals(k)))) return e; } return null; }Copy the code
put()The put(K key, V value) method adds the specified key and value pair to the map. This method first does a lookup on the map to see if it contains the tuple and returns it if it does, similar to the getEntry() method. If no entry is found, the addEntry(int hash, K key, V value, int bucketIndex) method is used to insert a new entry. The insertion method is header insertion.
//addEntry() void addEntry(int hash, K key, V value, int bucketIndex) { if ((size >= threshold) && (null ! = table[bucketIndex])) { resize(2 * table.length); // Automatically expand and rehash hash = (null! = key) ? hash(key) : 0; bucketIndex = hash & (table.length-1); // Hash %table.length} // Insert a new entry into the header of the conflicting linked list entry <K,V> e = table[bucketIndex]; table[bucketIndex] = new Entry<>(hash, key, value, e); size++; }Copy the code
remove()The remove(Object key) function is to delete the entry corresponding to the key value. The logic of this method is implemented in removeEntryForKey(Object Key). The removeEntryForKey() method first finds the entry corresponding to the key value and then deletes it (modifying the corresponding pointer to the list). The search procedure is similar to the getEntry() procedure.
//removeEntryForKey() final Entry<K,V> removeEntryForKey(Object key) { ...... int hash = (key == null) ? 0 : hash(key); int i = indexFor(hash, table.length); //hash&(table.length-1) Entry<K,V> prev = table[i]; // Get conflicting linked list Entry<K,V> e = prev; while (e ! = null) {// Iterate through the conflicting list Entry<K,V> next = e.next; Object k; if (e.hash == hash && ((k = e.key) == key || (key ! = null && key.equals(k)))) {// find the entry modCount++ to delete; size--; if (prev == e) table[i] = next; Next = next; // Delete the first entry of the conflicting list. return e; } prev = e; e = next; } return e; }Copy the code
As mentioned earlier, a HashSet is a simple wrapper around a HashMap, and every function call to a HashSet is converted to an appropriate HashMap method, so the implementation of a HashSet is very simple, with less than 300 lines of code. I won’t repeat it here.
Public class HashSet<E> {...... private transient HashMap<E,Object> map; Dummy Value to associate with an Object in the backing Map private static final Object PRESENT = new Object(); public HashSet() { map = new HashMap<>(); }... Public Boolean add(E E) {return map.put(E, PRESENT)==null; }... }Copy the code
Source: www.cnblogs.com/CarpenterLe…