Personal Hematemesis Series – Java Collections Summary

My feeling is that once you master the commonly used collection classes and read their source code, you find most of them are quite similar; at heart it all comes down to adding, removing, and looking up elements.

For example, ArrayList, LinkedList, HashMap, and ConcurrentHashMap come up in interviews all the time and deserve the most preparation. This post is purely source-code analysis, nothing else.

ArrayList

Overview

  • ArrayList implements the List interface. It is a sequential container: elements are kept in the order they were inserted, null elements are allowed, and the underlying storage is an array.
  • This class is similar to a Vector except that it is not synchronized.
  • Each ArrayList has a capacity, which represents the actual size of the underlying array. The number of elements stored in the container cannot exceed the current capacity.
  • When you add elements to a container, if there is not enough capacity, the container automatically increases the size of the underlying array.
  • As mentioned earlier, Java generics are just syntactic sugar provided by the compiler, so the array here is an Object array in order to be able to hold any type of Object.

  • The size(), isEmpty(), get() and set() methods can all be completed in constant time. The time cost of add() method is related to the insertion position, and the time cost of addAll() method is proportional to the number of added elements. The rest are mostly linear time.
  • For efficiency, ArrayList is not synchronized. If multiple threads need to access it concurrently, either synchronize manually or use Vector instead.
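A quick sketch of these points (the class name is my own, for illustration): constant-time access, linear-time positional insert, and a synchronized wrapper for multithreaded use.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ArrayListBasics {
    public static void main(String[] args) {
        List<String> list = new ArrayList<>();
        list.add("a");        // amortized constant time at the tail
        list.add(0, "b");     // linear time: shifts existing elements right
        System.out.println(list.get(0)); // constant-time index access
        System.out.println(list.size());

        // ArrayList itself is not synchronized; wrap it for multithreaded use
        List<String> syncList = Collections.synchronizedList(list);
        synchronized (syncList) { // iteration still needs manual locking
            for (String s : syncList) {
                System.out.println(s);
            }
        }
    }
}
```

Note that even the synchronized wrapper requires manual locking around iteration, as shown above.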

implementation

Underlying data structure

transient Object[] elementData; // the Object array that stores the elements
private int size;               // number of elements

The constructor

    // Constructor taking an initial capacity
    public ArrayList(int initialCapacity) {
        if (initialCapacity > 0) {
            this.elementData = new Object[initialCapacity];
        } else if (initialCapacity == 0) {
            this.elementData = EMPTY_ELEMENTDATA; // shared empty array
        } else {
            throw new IllegalArgumentException("Illegal Capacity: " + initialCapacity);
        }
    }

    // No-argument constructor
    public ArrayList() {
        this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA; // default capacity allocated on first add
    }

    public ArrayList(Collection<? extends E> c) {
        elementData = c.toArray();
        if ((size = elementData.length) != 0) {
            // c.toArray might (incorrectly) not return Object[] (see 6260652)
            if (elementData.getClass() != Object[].class)
                elementData = Arrays.copyOf(elementData, size, Object[].class);
        } else {
            // replace with empty array.
            this.elementData = EMPTY_ELEMENTDATA;
        }
    }

Automatic expansion

    public void ensureCapacity(int minCapacity) {
        int minExpand = (elementData != DEFAULTCAPACITY_EMPTY_ELEMENTDATA)
            // any size if not default element table
            ? 0
            // larger than default for default empty table. It's already
            // supposed to be at default size.
            : DEFAULT_CAPACITY;

        if (minCapacity > minExpand) {
            ensureExplicitCapacity(minCapacity);
        }
    }

    private void ensureCapacityInternal(int minCapacity) {
        if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
            minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
        }

        ensureExplicitCapacity(minCapacity);
    }

    private void ensureExplicitCapacity(int minCapacity) {
        modCount++;

        // overflow-conscious code
        if (minCapacity - elementData.length > 0)
            grow(minCapacity);
    }

		private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    private void grow(int minCapacity) {
        // overflow-conscious code
        int oldCapacity = elementData.length;
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        if (newCapacity - minCapacity < 0)
            newCapacity = minCapacity;
        if (newCapacity - MAX_ARRAY_SIZE > 0)
            newCapacity = hugeCapacity(minCapacity);
        // minCapacity is usually close to size, so this is a win:
        elementData = Arrays.copyOf(elementData, newCapacity);
    }

    private static int hugeCapacity(int minCapacity) {
        if (minCapacity < 0) // overflow
            throw new OutOfMemoryError();
        return (minCapacity > MAX_ARRAY_SIZE) ?
            Integer.MAX_VALUE :
            MAX_ARRAY_SIZE;
    }
  • Whenever an element is added, the list checks whether the number of elements would exceed the current array capacity; if so, the array is expanded to accommodate the additional data.
  • Expansion is exposed through the public method ensureCapacity(int minCapacity). You can also call ensureCapacity manually to grow an ArrayList's capacity before adding a large number of elements, reducing the number of incremental reallocations.
  • When the array is expanded, the elements of the old array are copied into the new one. Each expansion grows the array to about 1.5 times its original capacity.
  • This operation is expensive, so in practice we should try to avoid triggering it.
  • When we know how many elements we will store, we can specify the capacity when constructing the ArrayList instance to avoid expansion entirely.
  • Alternatively, increase the capacity manually by calling ensureCapacity, depending on actual needs.
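The advice above can be sketched as follows (the element count and class name are illustrative assumptions): either pre-size in the constructor, or call ensureCapacity once up front so the array never has to grow in ~1.5x steps.

```java
import java.util.ArrayList;

public class PresizeSketch {
    public static void main(String[] args) {
        int expected = 10_000; // assumed element count, for illustration

        // Option 1: pass the capacity to the constructor, so grow() is never needed
        ArrayList<Integer> sized = new ArrayList<>(expected);

        // Option 2: grow an existing list once, up front, instead of repeated ~1.5x expansions
        ArrayList<Integer> lazy = new ArrayList<>();
        lazy.ensureCapacity(expected);

        for (int i = 0; i < expected; i++) {
            sized.add(i);
            lazy.add(i);
        }
        System.out.println(sized.size() == expected && lazy.size() == expected);
    }
}
```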

add()

Add new elements to the container, which may cause insufficient capacity. Therefore, check the remaining space before adding elements and expand the capacity if necessary. The expansion operation is ultimately done with the grow() method.

    public boolean add(E e) {
        ensureCapacityInternal(size + 1);  // Increments modCount!! // Multithreading can cause problems
        elementData[size++] = e; // Here too
        return true;
    }

    public void add(int index, E element) {
        rangeCheckForAdd(index);

        ensureCapacityInternal(size + 1);  // Increments modCount!!
        System.arraycopy(elementData, index, elementData, index + 1,
                         size - index);
        elementData[index] = element;
        size++;
    }

get()

The get() method is also simple, the only thing to note is that since the underlying array is Object[], you need to cast the element.

public E get(int index) {
    rangeCheck(index);
    return (E) elementData[index];// Note the type conversion
}

remove()

The remove() method also has two versions: remove(int index) removes the element at the given position, while remove(Object o) removes the first element for which o.equals(elementData[index]) holds. Deletion is the reverse of add(): the elements after the deletion point are shifted one position forward. Note that for the GC to reclaim the removed element, the last slot must be explicitly set to null.

public E remove(int index) {
    rangeCheck(index);
    modCount++;
    E oldValue = elementData(index);
    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, elementData, index, numMoved);
    elementData[--size] = null; // Clear references to that location and let GC work
    return oldValue;
}

indexOf()

Loop through using equals

    public int indexOf(Object o) {
        if (o == null) {
            for (int i = 0; i < size; i++)
                if (elementData[i]==null)
                    return i;
        } else {
            for (int i = 0; i < size; i++)
                if (o.equals(elementData[i]))
                    return i;
        }
        return -1;
    }

The fail-fast mechanism

ArrayList also uses a fail-fast mechanism, implemented by recording the modCount field. In the face of concurrent modification, the iterator fails quickly and cleanly rather than risking arbitrary, non-deterministic behavior at some unspecified time in the future.
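A minimal demonstration of the fail-fast behavior (class name is my own): a structural modification after obtaining the iterator bumps modCount, so the next iterator call throws.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

public class FailFastSketch {
    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>(List.of(1, 2, 3));
        Iterator<Integer> it = list.iterator();
        list.add(4); // structural modification increments modCount
        try {
            it.next(); // iterator detects the modCount mismatch and fails fast
        } catch (ConcurrentModificationException e) {
            System.out.println("fail-fast triggered");
        }
    }
}
```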

Multithreading problem

Please refer to -> Personal Hematemesis series – summary of Java multithreading

LinkedList

Overview

LinkedList implements both the List and Deque interfaces, meaning it can be viewed as a sequential container, a queue, or a stack. In that sense LinkedList is an all-rounder. Consider it when you need a stack or queue, partly because the Stack class is officially discouraged and, unfortunately, there is no class in Java named Queue (Queue is an interface). For stacks and queues, the current preferred choice is ArrayDeque, which performs better than LinkedList when used as a stack or queue.

The way LinkedList is implemented means that all index-based operations take linear time, while removing an element at the head or tail takes only constant time. In pursuit of efficiency, LinkedList is not synchronized; if multiple threads need concurrent access, wrap it first with Collections.synchronizedList().
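A small sketch of the preferred ArrayDeque usage as a stack and as a queue (class name is mine):

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class DequeSketch {
    public static void main(String[] args) {
        // preferred replacement for Stack: push/pop operate on the head
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(1);
        stack.push(2);
        System.out.println(stack.pop()); // 2, LIFO

        // preferred Queue implementation: offer at the tail, poll at the head
        Deque<Integer> queue = new ArrayDeque<>();
        queue.offer(1);
        queue.offer(2);
        System.out.println(queue.poll()); // 1, FIFO
    }
}
```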

implementation

Underlying data structure

    transient int size = 0;
    transient Node<E> first; // used frequently
    transient Node<E> last;  // used frequently

    // Node is a private inner class
    private static class Node<E> {
        E item;
        Node<E> next;
        Node<E> prev;

        Node(Node<E> prev, E element, Node<E> next) {
            this.item = element;
            this.next = next;
            this.prev = prev;
        }
    }

LinkedList is implemented on top of a doubly linked list. This section focuses on how that list is maintained when inserting and deleting elements, i.e. the List-related operations. Each node is represented by the inner class Node. LinkedList points to the first and last elements of the list through the first and last references respectively. Note that there is no dummy node: when the list is empty, first and last both point to null.

The constructor

    public LinkedList() {
    }

    public LinkedList(Collection<? extends E> c) {
        this();
        addAll(c);
    }

getFirst(), getLast()

The data structure itself maintains the first and last references, so these are pretty simple.

    public E getFirst() {
        final Node<E> f = first; // get the first element
        if (f == null)
            throw new NoSuchElementException();
        return f.item;
    }

    public E getLast() {
        final Node<E> l = last; // get the last element
        if (l == null)
            throw new NoSuchElementException();
        return l.item;
    }

removeFirst(), removeLast(), remove(e), remove(index)

remove(Object o) removes the first occurrence of the element, returning false if it is not present. It finds the node via equals() and then unlinks it. Because LinkedList can hold null elements, it can also remove the first null element.

    public boolean remove(Object o) {
        if (o == null) {
            for (Node<E> x = first; x != null; x = x.next) {
                if (x.item == null) {
                    unlink(x);
                    return true;
                }
            }
        } else {
            for (Node<E> x = first; x != null; x = x.next) {
                if (o.equals(x.item)) { // loop through using equals
                    unlink(x);
                    return true;
                }
            }
        }
        return false;
    }

    E unlink(Node<E> x) {
        // assert x != null;
        final E element = x.item;    // current element
        final Node<E> next = x.next; // the next node
        final Node<E> prev = x.prev; // the previous node

        if (prev == null) { // first element: its successor becomes the new first
            first = next;
        } else {
            prev.next = next; // otherwise link the previous node to the next one
            x.prev = null;
        }

        if (next == null) { // last element
            last = prev;
        } else {
            next.prev = prev;
            x.next = null;
        }

        x.item = null; // clear the reference so GC can work
        size--;
        modCount++;
        return element;
    }

remove(int index) works by index: it only needs to check that an element exists at that index, and if so, it unlinks the node directly.

    public E remove(int index) {
        checkElementIndex(index);
        return unlink(node(index));
    }

removeFirst() is actually pretty simple

    public E removeFirst() {
        final Node<E> f = first; // get first, then unlink it directly
        if (f == null)
            throw new NoSuchElementException();
        return unlinkFirst(f);
    }

    private E unlinkFirst(Node<E> f) {
        // assert f == first && f != null;
        final E element = f.item;    // first's element
        final Node<E> next = f.next; // first has no prev, only next
        f.item = null;
        f.next = null; // help GC
        first = next;  // make first point to next
        if (next == null) // if next is null, this was the last element, so last becomes null
            last = null;
        else
            next.prev = null; // otherwise clear next's prev pointer
        size--;
        modCount++;
        return element;
    }

removeLast() is quite simple too, similar to the above

    public E removeLast() {
        final Node<E> l = last;
        if (l == null)
            throw new NoSuchElementException();
        return unlinkLast(l);
    }

    private E unlinkLast(Node<E> l) {
        // assert l == last && l != null;
        final E element = l.item;
        final Node<E> prev = l.prev;
        l.item = null;
        l.prev = null; // help GC
        last = prev;
        if (prev == null)
            first = null;
        else
            prev.next = null;
        size--;
        modCount++;
        return element;
    }

add()

    public boolean add(E e) {
        linkLast(e); // Insert elements at the end of the list, so constant time
        return true;
    }

    void linkLast(E e) { // Modify the reference at the end
        final Node<E> l = last;
        final Node<E> newNode = new Node<>(l, e, null);
        last = newNode;
        if (l == null)
            first = newNode;
        else
            l.next = newNode;
        size++;
        modCount++;
    }

add(int index, E element): if index == size, it is equivalent to add(E e); otherwise there are two steps: 1. find the insertion position via the node(index) method; 2. modify the references to complete the insertion.

indexOf()

Loops through with equals() to find the corresponding index

    public int indexOf(Object o) {
        int index = 0; // maintain the index
        if (o == null) {
            for (Node<E> x = first; x != null; x = x.next) {
                if (x.item == null)
                    return index;
                index++;
            }
        } else {
            for (Node<E> x = first; x != null; x = x.next) {
                if (o.equals(x.item)) // use equals
                    return index;
                index++;
            }
        }
        return -1;
    }

HashMap (common interview question)

As is well known, the underlying structure of a HashMap consists of an array plus linked lists, although the details differ slightly between JDK 1.7 and JDK 1.8.

The 1.7 implementation

Member variables

I won't paste the full 1.7 source code here; the member variables are summarized below instead.

Introducing member variables:

  1. The default initial bucket (array) size, since the underlying structure is an array.
  2. The maximum bucket capacity.
  3. The default load factor (0.75).
  4. table, the array that actually holds the data.
  5. size, the number of entries stored in the map.
  6. The bucket capacity, which can be specified explicitly in the constructor.
  7. The load factor, which can be specified explicitly in the constructor.

The load factor

public HashMap() {
    this(DEFAULT_INITIAL_CAPACITY, DEFAULT_LOAD_FACTOR); // default buckets and load factor
}
public HashMap(int initialCapacity, float loadFactor) {
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal initial capacity: " +
                                           initialCapacity);
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY; // capped at the maximum
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal load factor: " +
                                           loadFactor);
    this.loadFactor = loadFactor;
    threshold = initialCapacity;
    init();
}
  • The default capacity is 16 with a load factor of 0.75.
  • As the map stores data, once the entry count reaches 16 * 0.75 = 12, the current capacity of 16 must be expanded. The expansion process involves rehashing, copying data, and so on, all of which is very performance-intensive.
  • Therefore, it is generally recommended to estimate the size of the HashMap in advance to minimize the extra performance cost of expansion.
  • This part is covered in detail in a later article.
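A sketch of pre-sizing a HashMap based on the 16 * 0.75 = 12 threshold above (the entry count and class name are my own illustrative choices):

```java
import java.util.HashMap;
import java.util.Map;

public class HashMapSizing {
    public static void main(String[] args) {
        int expected = 12; // assumed number of entries, for illustration

        // default: capacity 16, threshold 16 * 0.75 = 12, so entry #13 triggers a resize
        Map<String, Integer> byDefault = new HashMap<>();

        // pre-sized: pick capacity so that expected <= capacity * loadFactor,
        // i.e. capacity >= expected / 0.75; HashMap rounds up to a power of two
        Map<String, Integer> presized = new HashMap<>((int) (expected / 0.75f) + 1);

        for (int i = 0; i < expected; i++) {
            byDefault.put("k" + i, i);
            presized.put("k" + i, i);
        }
        System.out.println(presized.size());
    }
}
```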

Entry

Entry is an inner class in a Hashmap, as can be easily seen from its member variables:

  • Key is the key to write to
  • Value is value
  • As mentioned at the start, HashMap consists of an array plus linked lists, and this next reference is what implements the linked-list structure
  • Hash stores the hashcode of the current key

put (here's the key part)

public V put(K key, V value) {
    if (table == EMPTY_TABLE) {
        inflateTable(threshold); // Determine whether the array needs to be initialized
    }
    if (key == null)
        return putForNullKey(value); // Check whether key is empty
    int hash = hash(key); // compute the key's hashcode
    int i = indexFor(hash, table.length); // locate the bucket
    for (Entry<K,V> e = table[i]; e != null; e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) { // Check whether the key and hashcode in the list are equal
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }
    modCount++;
    addEntry(hash, key, value, i); // Add a new one
    return null;
}
  • Determines whether the current array needs to be initialized
  • If the key is null, putForNullKey stores the value under the null key
  • Calculates the Hashcode based on the key
  • Locate the bucket of index based on the calculated Hashcode
  • If the bucket is a linked list, you need to iterate to determine whether the hashcode and key in the bucket are equal to the passed key. If they are equal, you overwrite them and return the original value
  • If the bucket is empty, no data is stored in the current location. In this case, an Entry object is added to the current location.
void addEntry(int hash, K key, V value, int bucketIndex) {
    if ((size >= threshold) && (null != table[bucketIndex])) { // need to resize?
        resize(2 * table.length); // double the capacity and rehash
        hash = (null != key) ? hash(key) : 0;
        bucketIndex = indexFor(hash, table.length);
    }
    createEntry(hash, key, value, bucketIndex);
}
void createEntry(int hash, K key, V value, int bucketIndex) {
    Entry<K,V> e = table[bucketIndex];
    table[bucketIndex] = new Entry<>(hash, key, value, e);
    size++;
}
  • When calling addEntry to write Entry, you need to determine whether to expand the capacity
  • Double the size if necessary, and rehash the current key and position it.
  • createEntry places the new Entry at the current bucket position, pointing it at the old head; if the bucket already had a value, this forms a linked list at that location.

get

public V get(Object key) {
    if (key == null) // Check whether key is empty
        return getForNullKey(); // If it is empty, it returns a null value
    Entry<K,V> entry = getEntry(key); // get entry
    return null == entry ? null : entry.getValue();
}
final Entry<K,V> getEntry(Object key) {
    if (size == 0) {
        return null;
    }
    int hash = (key == null) ? 0 : hash(key); // compute the hash from the key
    for (Entry<K,V> e = table[indexFor(hash, table.length)]; // loop, comparing keys with equals to get the value
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k))))
            return e;
    }
    return null;
}
  • First compute the hashcode from the key, then locate the specific bucket
  • Determine whether that position holds a linked list
  • If it is not a linked list, return the value directly when the key and hashcode match
  • If it is a linked list, iterate until the key and hashcode match, then return the value
  • If nothing matches, return null

The 1.8 implementation

Do you see any points that need to be optimized?

One obvious place to look is linked lists

When hash conflicts are severe, the linked list formed in a bucket grows longer and longer, making queries less and less efficient; the time complexity is O(N).

Member variables

static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16
/**
 * The maximum capacity, used if a higher value is implicitly specified
 * by either of the constructors with arguments.
 * MUST be a power of two <= 1<<30.
 */
static final int MAXIMUM_CAPACITY = 1 << 30;
/** The load factor used when none specified in constructor. */
static final float DEFAULT_LOAD_FACTOR = 0.75f;
static final int TREEIFY_THRESHOLD = 8;
transient Node<K,V>[] table;
/** Holds cached entrySet(). Note that AbstractMap fields are used for keySet() and values(). */
transient Set<Map.Entry<K,V>> entrySet;
/** The number of key-value mappings contained in this map. */
transient int size;
  • TREEIFY_THRESHOLD is the threshold used to decide whether a linked list should be converted to a red-black tree.
  • HashEntry is renamed to Node.
  • Node's core fields are the same as HashEntry's in 1.7: key, value, hashcode, and next.

put

    /**
     * Implements Map.put and related methods
     *
     * @param hash hash for key
     * @param key the key
     * @param value the value to put
     * @param onlyIfAbsent if true, don't change existing value
     * @param evict if false, the table is in creation mode.
     * @return previous value, or null if none
     */
    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        // 1. If the current table is empty, initialize it
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
        // 2. Locate the bucket from the key's hashcode; if it is empty there is
        //    no hash conflict, so create a new node at this position
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        else {
            Node<K,V> e; K k;
            // 3. The bucket already has a value (hash conflict): if the stored
            //    key's hash and the key itself equal the written key, remember the node in e
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            // 4. If the current bucket is a red-black tree, write in red-black-tree fashion
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                // 5. Otherwise it is a linked list: append the key/value as a new node at the tail
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
                        // 6. If the list length reaches the threshold, convert it to a red-black tree
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    // 7. If the same key is found during the traversal, exit the loop directly
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            // 8. e != null means the same key already exists, so its value must be overwritten
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;
        // 9. Check whether a resize is needed
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }
  • If the bucket array is empty, initialize it first.
  • Locate the bucket from the current key's hashcode and check whether it is empty. If it is empty, there is no hash conflict, and a new node is created directly at that position.
  • If the current bucket has a value (hash conflict), compare the stored key with the written key; if the hashcode and the key are equal, assign the node to e. The actual overwrite and return happen uniformly at step 8.
  • If the current bucket is a red-black tree, write the data in red-black-tree fashion.
  • If it is a linked list, wrap the current key and value in a new node and append it to the tail of the bucket's list.
  • Then check whether the current list length exceeds the preset threshold, and if so convert it to a red-black tree.
  • If the same key is found during the traversal, exit the traversal directly.
  • If e != null, the same key already exists, so the value needs to be overwritten.
  • Finally, check whether a resize is needed.

get

public V get(Object key) {
    Node<K,V> e;
    return (e = getNode(hash(key), key)) == null ? null : e.value;
}
final Node<K,V> getNode(int hash, Object key) {
    Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (first = tab[(n - 1) & hash]) != null) {
        if (first.hash == hash && // always check first node
            ((k = first.key) == key || (key != null && key.equals(k))))
            return first;
        if ((e = first.next) != null) {
            if (first instanceof TreeNode)
                return ((TreeNode<K,V>)first).getTreeNode(hash, key);
            do {
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    return e;
            } while ((e = e.next) != null);
        }
    }
    return null;
}
  • The key is hashed first to locate the bucket
  • If the bucket is empty, null is returned
  • Otherwise it checks whether the first node in the bucket (list or tree) matches the query key, and if so returns its value directly
  • If the first node does not match, it checks whether what follows is a red-black tree or a linked list
  • For a red-black tree, the value is returned via a tree lookup
  • Otherwise the linked list is traversed until a match is returned

From these two core methods (get/put) we can see that 1.8 optimizes long linked lists: after conversion to a red-black tree, query efficiency improves to O(log n).

The problem

However, HashMap's original concurrency problems remain: for example, it tends to fall into an infinite loop when used in concurrent scenarios.

final HashMap<String, String> map = new HashMap<String, String>();
for (int i = 0; i < 1000; i++) {
    new Thread(new Runnable() {
        @Override
        public void run() {
            map.put(UUID.randomUUID().toString(), "");
        }
    }).start();
}
  • The resize() method is called when a HashMap expands, and concurrent resize operations can easily form a circular linked list on a bucket
  • So when you get() a key that does not exist but hashes to that bucket, the traversal of the circular list never terminates
  • 1.8 alleviates 1.7's head-insertion problem by changing the insertion order, but for real thread safety (e.g. memory visibility) ConcurrentHashMap is still needed
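A sketch of the safe alternative for the snippet above, swapping in ConcurrentHashMap (the thread count, latch, and class name are my own choices):

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

public class ConcurrentPutSketch {
    public static void main(String[] args) throws InterruptedException {
        // ConcurrentHashMap instead of HashMap: concurrent puts are safe,
        // with no lost updates or (in 1.7-era HashMap) resize loops
        Map<String, String> map = new ConcurrentHashMap<>();
        int threads = 100;
        CountDownLatch done = new CountDownLatch(threads);
        for (int i = 0; i < threads; i++) {
            new Thread(() -> {
                map.put(UUID.randomUUID().toString(), "");
                done.countDown();
            }).start();
        }
        done.await(); // wait for all writers to finish
        System.out.println(map.size()); // one entry per thread
    }
}
```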

Reference: HashMap infinite loop analysis

It is also worth noting that HashMap traversal is usually done in one of the following ways:

Iterator<Map.Entry<String, Integer>> entryIterator = map.entrySet().iterator();
while (entryIterator.hasNext()) {
    Map.Entry<String, Integer> next = entryIterator.next();
    System.out.println("key=" + next.getKey() + " value=" + next.getValue());
}

Iterator<String> iterator = map.keySet().iterator();
while (iterator.hasNext()) {
    String key = iterator.next();
    System.out.println("key=" + key + " value=" + map.get(key));
}
  • The first method is recommended: it fetches the key and value together, and it also supports removing entries while iterating.
  • The second method has to look the value up by key a second time, which is less efficient.
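A sketch of the recommended first approach, including removal through the iterator (the map contents and class name are illustrative):

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

public class EntryRemoveSketch {
    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);

        // entrySet iteration reads key and value in one pass, and
        // Iterator.remove() is the safe way to delete while iterating
        Iterator<Map.Entry<String, Integer>> it = map.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Integer> e = it.next();
            if (e.getValue() % 2 == 0) {
                it.remove(); // removes "b" without ConcurrentModificationException
            }
        }
        System.out.println(map.size()); // 2
    }
}
```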

ConcurrentHashMap

1.7

  • A Segment array
  • composed of HashEntry
  • like HashMap, an array plus linked lists

/** The Segment array */
final Segment<K,V>[] segments;
transient Set<K> keySet;
transient Set<Map.Entry<K,V>> entrySet;

The Segment is an internal class of ConcurrentHashMap.

static final class Segment<K,V> extends ReentrantLock implements Serializable {
       private static final long serialVersionUID = 2249069246763182397L;

       // Just like HashEntry in HashMap, it is a bucket that actually holds data
       transient volatile HashEntry<K,V>[] table;
       transient int count;
       transient int modCount;
       transient int threshold;
       final float loadFactor;

}

  • The key difference is that core data such as value and the linked-list pointers are volatile, guaranteeing visibility on reads.
  • ConcurrentHashMap uses segmented locking, where Segment inherits from ReentrantLock.
  • ConcurrentHashMap supports concurrencyLevel (the number of Segments) concurrent writers.
  • Whenever a thread holds the lock on one Segment, the other Segments are unaffected.

put

public V put(K key, V value) {
    Segment<K,V> s;
    if (value == null)
        throw new NullPointerException();
    int hash = hash(key);
    int j = (hash >>> segmentShift) & segmentMask;
    if ((s = (Segment<K,V>)UNSAFE.getObject          // nonvolatile; recheck
         (segments, (j << SSHIFT) + SBASE)) == null) // in ensureSegment
        s = ensureSegment(j);
    return s.put(key, hash, value, false);
}

Locate the Segment from the key, then perform the actual put inside that Segment.

final V put(K key, int hash, V value, boolean onlyIfAbsent) {
    HashEntry<K,V> node = tryLock() ? null :
        scanAndLockForPut(key, hash, value); // 1. Try to acquire the Segment lock
    V oldValue;
    try {
        HashEntry<K,V>[] tab = table;
        int index = (tab.length - 1) & hash;
        HashEntry<K,V> first = entryAt(tab, index);
        for (HashEntry<K,V> e = first;;) {
            if (e != null) {
                K k;
                if ((k = e.key) == key ||
                    (e.hash == hash && key.equals(k))) { // 2. Iterate the HashEntry chain; if the key already exists, overwrite the old value
                    oldValue = e.value;
                    if (!onlyIfAbsent) {
                        e.value = value;
                        ++modCount;
                    }
                    break;
                }
                e = e.next;
            }
            else { // 3. No matching key: create a HashEntry and link it into the Segment
                if (node != null)
                    node.setNext(first);
                else
                    node = new HashEntry<K,V>(hash, key, value, first);
                int c = count + 1;
                if (c > threshold && tab.length < MAXIMUM_CAPACITY)
                    rehash(node);
                else
                    setEntryAt(tab, index, node);
                ++modCount;
                count = c;
                oldValue = null;
                break;
            }
        }
    } finally {
        unlock(); // 4. Unlock
    }
    return oldValue;
}
  • Although the value in HashEntry is volatile, volatility alone does not make a compound update atomic, so the put operation still needs to be locked.

  • The first step tries tryLock(); if that fails, another thread holds the Segment's lock, and scanAndLockForPut() takes over:

    1. Spin, repeatedly attempting tryLock()

    2. After MAX_SCAN_RETRIES failed attempts, fall back to a blocking lock() to guarantee acquisition
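That spin-then-block pattern can be sketched with ReentrantLock.tryLock(). This is a simplified illustration: the real MAX_SCAN_RETRIES depends on the number of CPUs, and scanAndLockForPut also pre-builds the new node while spinning.

```java
import java.util.concurrent.locks.ReentrantLock;

// Sketch of the spin-then-block pattern used by scanAndLockForPut:
// spin on tryLock() a bounded number of times, then fall back to a blocking lock().
class SpinThenBlock {
    static final int MAX_SCAN_RETRIES = 64; // illustrative; the JDK derives it from CPU count

    static void acquire(ReentrantLock lock) {
        int retries = 0;
        while (!lock.tryLock()) {      // spin: cheap when the holder releases quickly
            if (++retries > MAX_SCAN_RETRIES) {
                lock.lock();           // give up spinning; block until the lock is free
                return;
            }
        }
    }
}
```

The method returns only once the current thread holds the lock, whether via a successful spin attempt or the blocking fallback.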

In summary:

  • Use the key's hashcode to locate a HashEntry in the current Segment's table
  • Iterate the HashEntry chain; if a node's key equals the passed-in key, overwrite its old value
  • If no matching key is found, create a new HashEntry and link it into the Segment, first checking whether the Segment needs to be expanded
  • Finally, release the Segment lock acquired in step 1
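One detail worth noticing in the put source above: the null-value check runs before any Segment is located or locked, which is why ConcurrentHashMap rejects null values outright. A quick demonstration:

```java
import java.util.concurrent.ConcurrentHashMap;

// Demonstrates that ConcurrentHashMap.put rejects a null value up front,
// before a Segment lock (1.7) or bucket lock (1.8) is ever taken.
class NullValueDemo {
    static boolean rejectsNullValue() {
        ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();
        try {
            map.put("key", null); // throws immediately; nothing is locked or stored
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }
}
```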

get

public V get(Object key) {
    Segment<K,V> s; // manually integrate access methods to reduce overhead
    HashEntry<K,V>[] tab;
    int h = hash(key);
    long u = (((h >>> segmentShift) & segmentMask) << SSHIFT) + SBASE;
    if ((s = (Segment<K,V>)UNSAFE.getObjectVolatile(segments, u)) != null &&
        (tab = s.table) != null) {
        for (HashEntry<K,V> e = (HashEntry<K,V>) UNSAFE.getObjectVolatile
                 (tab, ((long)(((tab.length - 1) & h)) << TSHIFT) + TBASE);
             e != null; e = e.next) {
            K k;
            if ((k = e.key) == key || (e.hash == h && key.equals(k)))
                return e.value;
        }
    }
    return null;
}
  • get() simply hashes the key once to locate the Segment, then again to locate the element within it.
  • Because the value field in HashEntry is volatile, memory visibility is guaranteed and each read returns the latest value.
  • get() is very efficient because the entire process requires no locking.

1.8

In 1.7, a query may still have to traverse a linked list, which is O(n) in the worst case and therefore inefficient.

1.8 abandons the original Segment lock entirely and instead uses CAS + synchronized to guarantee concurrency safety.
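The CAS half of that pair can be illustrated with AtomicReferenceArray standing in for the JDK's internal Unsafe-based casTabAt. This is an illustrative sketch, not the real implementation:

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

// Sketch of "CAS into an empty bin": the first writer claims the slot lock-free;
// a racing second writer's CAS fails and (in the real putVal) the outer loop retries.
class CasBinSketch {
    static boolean demoRace() {
        AtomicReferenceArray<String> tab = new AtomicReferenceArray<>(16);
        boolean first = tab.compareAndSet(3, null, "first");   // empty bin: CAS succeeds
        boolean second = tab.compareAndSet(3, null, "second"); // occupied bin: CAS fails
        return first && !second && "first".equals(tab.get(3));
    }
}
```

Only when the bin is already occupied does 1.8 fall back to synchronized on the bucket head, so uncontended inserts into empty bins take no lock at all.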

put

    final V putVal(K key, V value, boolean onlyIfAbsent) {
        if (key == null || value == null) throw new NullPointerException();
        int hash = spread(key.hashCode()); // 1. Calculate the hash from the key
        int binCount = 0;
        for (Node<K,V>[] tab = table;;) {
            Node<K,V> f; int n, i, fh;
            if (tab == null || (n = tab.length) == 0) // 2. Initialize the table if necessary
                tab = initTable();
            else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) { // 3. f is the bucket head for this key; if empty, CAS the new node in (the outer loop spins until it succeeds)
                if (casTabAt(tab, i, null, new Node<K,V>(hash, key, value, null)))
                    break;                   // no lock when adding to empty bin
            }
            else if ((fh = f.hash) == MOVED) // 4. hash == MOVED (-1) means a resize is in progress; help transfer
                tab = helpTransfer(tab, f);
            else {
                V oldVal = null;
                synchronized (f) { // 5. Otherwise, lock the bucket head and write under synchronized
                    if (tabAt(tab, i) == f) {
                        if (fh >= 0) {
                            binCount = 1;
                            for (Node<K,V> e = f;; ++binCount) {
                                K ek;
                                if (e.hash == hash &&
                                    ((ek = e.key) == key ||
                                     (ek != null && key.equals(ek)))) {
                                    oldVal = e.val;
                                    if (!onlyIfAbsent)
                                        e.val = value;
                                    break;
                                }
                                Node<K,V> pred = e;
                                if ((e = e.next) == null) {
                                    pred.next = new Node<K,V>(hash, key,
                                                              value, null);
                                    break;
                                }
                            }
                        }
                        else if (f instanceof TreeBin) {
                            Node<K,V> p;
                            binCount = 2;
                            if ((p = ((TreeBin<K,V>)f).putTreeVal(hash, key,
                                                                  value)) != null) {
                                oldVal = p.val;
                                if (!onlyIfAbsent)
                                    p.val = value;
                            }
                        }
                    }
                }
                if (binCount != 0) {
                    if (binCount >= TREEIFY_THRESHOLD) // 6. Chain longer than TREEIFY_THRESHOLD: convert it to a red-black tree
                        treeifyBin(tab, i);
                    if (oldVal != null)
                        return oldVal;
                    break;
                }
            }
        }
        addCount(1L, binCount);
        return null;
    }
  • Calculate the hash from the key's hashCode (via spread())
  • Check whether the table needs to be initialized
  • f is the bucket head located by the key. If it is empty, write the new node there with a CAS attempt; the enclosing loop spins until the write succeeds
  • If the current bucket's hash == MOVED == -1, a resize is in progress, so help with the data transfer
  • If none of the above applies, lock the bucket head with synchronized and write the data
  • If the chain grows beyond TREEIFY_THRESHOLD, convert it to a red-black tree
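The net effect of the steps above is that no external locking is ever needed around put, and no updates are lost under contention. A small usage check (thread counts and key names here are arbitrary):

```java
import java.util.concurrent.ConcurrentHashMap;

// Several threads put distinct keys concurrently; if no update were lost,
// the final size equals threads * perThread.
class ConcurrentPutDemo {
    static int sizeAfterConcurrentPuts(int threads, int perThread) {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        Thread[] ts = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            ts[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++)
                    map.put(id + "-" + i, i); // keys are distinct per thread
            });
            ts[t].start();
        }
        for (Thread th : ts) {
            try {
                th.join(); // wait for all writers to finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return map.size();
    }
}
```

Running the same experiment against a plain HashMap can lose entries or even corrupt the table, which is exactly the thread-safety problem ConcurrentHashMap exists to solve.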

get

    public V get(Object key) {
        Node<K,V>[] tab; Node<K,V> e, p; int n, eh; K ek;
        int h = spread(key.hashCode());
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (e = tabAt(tab, (n - 1) & h)) != null) {
            if ((eh = e.hash) == h) {
                if ((ek = e.key) == key || (ek != null && key.equals(ek)))
                    return e.val;
            }
            else if (eh < 0) // negative hash: tree bin or forwarding node; delegate to find()
                return (p = e.find(h, key)) != null ? p.val : null;
            while ((e = e.next) != null) {
                if (e.hash == h &&
                    ((ek = e.key) == key || (ek != null && key.equals(ek))))
                    return e.val;
            }
        }
        return null;
    }
  • Address the bucket using the computed hash; if the key sits at the head of the bucket, return its value directly.

  • If the bucket holds a red-black tree, look the value up through the tree.

  • Otherwise, iterate through the linked list to find the value.

    1.8 made major changes to the 1.7 data structure: it adopts red-black trees to keep query efficiency at O(log n), and it even drops ReentrantLock in favor of synchronized, which shows how well synchronized has been optimized in newer JDK versions.
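A minimal check of the read path described above (illustrative; the writer thread and key name are arbitrary):

```java
import java.util.concurrent.ConcurrentHashMap;

// After a writer thread finishes, a plain get observes its put without the
// reader taking any lock: the read path relies on volatile access to the table.
class GetVisibilityDemo {
    static Integer readAfterWrite() {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        Thread writer = new Thread(() -> map.put("answer", 42));
        writer.start();
        try {
            writer.join(); // wait for the write to complete
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return map.get("answer"); // lock-free read
    }
}
```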

conclusion

Format:

  • Talk about your understanding of HashMap, and walk through the get/put process.
  • What optimizations were made in 1.8?
  • Is it thread safe?
  • What problems can the lack of thread safety cause?
  • How do you solve them? Is there a thread-safe concurrent container?
  • How is ConcurrentHashMap implemented? How do the 1.7 and 1.8 implementations differ? Why were those choices made?

These articles run long: not suited to reading on a phone, over a meal, or on the toilet. They are meant for spot-checking yourself from time to time, collecting the series, and rereading the knowledge points again and again until something new clicks. Perhaps that is what reviewing really means.

reference

  • GitHub address
  • Tc. Dreamcat. Ink/archives / 23…