CopyOnWriteList implementation details

I summarized the source details of ArrayList in Java collections earlier. I also mentioned that ArrayList is thread unsafe. So what is the thread safe List that the JDK provides for us? It mainly uses the copy-on-write idea, this idea is roughly read and write separation: read share, write copy (the original array) update (and exclusive lock), and we below analysis of the source code specific implementation is also the embodiment of this idea

An overview of the

CopyOnWriteList implements Serializable, Cloneable and RandomAccess interfaces. It has the characteristics of RandomAccess. It implements List interface and has the characteristics of List.

Let’s take a look at the main properties of CopyOnWriteList separately and see which of the methods we’ll analyze next. As can be seen from the figure:

Each CopyOnWriteList object has an array array in it to hold the specific elements
Use ReentrantLock exclusive to ensure that only the writer thread updates the array copy. For ReentrantLock, please refer to my other AQS application ReentrantLock.
CopyOnWriteArrayList in traverse using won’t throw ConcurrentModificationException, and you don’t have to traverse the extra lock

Again, let’s focus on the implementation of CopyOnWriteList

Member attribute

/ / the time of this update is to ensure that the array is only one thread to acquire the lock, and then update
final transient ReentrantLock lock = new ReentrantLock();
// Use volatile arrays to ensure that the updated array is visible to other threads after the writer thread updates it.
// However, there is no guarantee of timeliness: after adding an element to a copy of an array, other readers see the old array until the array is updated to point to the new address
private transient volatile Object[] array;
// Get an array, non-private, final
final Object[] getArray() {
    return array;
}
// Set the array
final void setArray(Object[] a) {
    array = a;
}
Copy the code

A constructor

(1) No arguments. By default, an array of length 0 is created

// This is the constructor that creates a new array of objects with length 0
// Then call the setArray method to set it to the member variable array of CopyOnWriteList
public CopyOnWriteArrayList(a) {
    setArray(new Object[0]);
}
Copy the code

(2) The constructor of Collection

// Create a list of the elements of the passed collection by iterating through the iterators of the collection in the order returned
// If null is passed, an exception is thrown
public CopyOnWriteArrayList(Collection<? extends E> c) {
    Object[] elements; // An array of elements
    // This is to determine whether a CopyOnWriteArrayList collection is passed
    if (c.getClass() == CopyOnWriteArrayList.class)
        // If so, call the getArray method directly, get the array passed to the collection and assign it to elementselements = ((CopyOnWriteArrayList<? >)c).getArray();else {
        // First convert the passed collection into an array
        elements = c.toArray();
        // c.array () might not correctly return an Object[] array, so use the arrays.copyof () method
        if(elements.getClass() ! = Object[].class) elements = Arrays.copyOf(elements, elements.length, Object[].class); }// Call the setArray method directly to set the array properties
    setArray(elements);
}
Copy the code

(3) Create a list containing a copy of the given array

public CopyOnWriteArrayList(E[] toCopyIn) {
    setArray(Arrays.copyOf(toCopyIn, toCopyIn.length, Object[].class));
}
Copy the code

The above is the introduction of CopyOnWriteList initialization, the three construction methods are relatively easy to understand, the next is to look at the implementation of a few main methods

Add elements

Below is an implementation of the Add (E E) method, with detailed comments

public boolean add(E e) {
    // Obtain an exclusive lock
    final ReentrantLock lock = this.lock;
    / / lock
    lock.lock();
    try {
        // Get the array underlying the list
        Object[] elements = getArray();
        // Get the array length
        int len = elements.length;
        // Copy the new array to len+1
        Object[] newElements = Arrays.copyOf(elements, len + 1);
        // Assign a value to the end of the new array
        newElements[len] = e;
        // Replace the original array with the new array
        setArray(newElements);
        return true; 
    } finally {
        lock.unlock();/ / releases the lock}}Copy the code

Summarize the execution flow of the ADD method

The thread that calls the add method first acquires the lock and then calls the lock method to lock the list. (Those who know ReentrantLock know that this is an exclusive lock, so only one thread in multiple threads will acquire the lock.)
Only one thread will acquire the lock, so only one thread will update the array, while all other threads calling the add method are blocked and waiting
The thread that acquired the lock continues execution
- First get the original array and its length, then copy the elements into a newArray (newArray length is the original length +1)
- Given an array with index len+1
- Replace the old array with the new array
Finally release the lock

So in summary, only one thread can acquire the lock, add the elements by copying the original array, replace the original array with the new array, and release the lock (another add thread executes it).

Finally, the length of the array is not fixed, it will be +1 after each write. So CopyOnWriteList does not have a length or size property, but it does provide a size() method to get the actual size of the set. Size () is as follows

public int size(a) {
    return getArray().length;
}
Copy the code

Access to elements

Use get(I) to retrieve the element at position I, but of course an out-of-bounds exception will be thrown if the element does not exist.

public E get(int index) {
    return get(getArray(), index);
}
final Object[] getArray() {
    return array;
}
private E get(Object[] a, int index) {
    return (E) a[index];
}
Copy the code

Of course, the get method here also reflects the weak conformance problem of copy-on-write-list. Let’s briefly illustrate this with the diagram below. The assumption given in the figure is that threadA accesses an element at index=1

① Obtain an array array
② Access the element with the passed parameter subscript

Because we see that the GET process is unlocked (assuming there are three elements in the array, as shown in the figure). If threadA performs the remove(1) operation before threadB performs the remove(1) operation, threadB or obtains the exclusive lock, and then performs the copy on write operation, that is, copies a newArray, neArray, and then performs the remove(1) operation in newArray to update the array. When threadB finishes executing, the array element with index=1 is already item3.

ThreadA then continues, but since threadA is operating on an element of the original array, its index=1 is still item2. So the end result is that threadB removes the element at position 1, but threadA still accesses the element of the original array. This is the problem of consistency

Modify the element

A change is also a write, so you need to get the lock. Here is the implementation of the set method

public E set(int index, E element) {
    / / acquiring a lock
    final ReentrantLock lock = this.lock;
    // Perform the lock
    lock.lock();
    try {
        // Get array
        Object[] elements = getArray();
        // Get the element at the index position
        E oldValue = get(elements, index);
        // The value to be modified is not equal to the original value
        if(oldValue ! = element) {// Get the length of the old array
            int len = elements.length;
            // Copy to a new array
            Object[] newElements = Arrays.copyOf(elements, len);
            // Set the element values in the new array
            newElements[index] = element;
            // Replace the array with the new array
            setArray(newElements);
        } else {
            // Not quite a no-op; ensures volatile write semantics
            // To ensure volatile semantics, replace the array with the new one, even if no changes have been made
            setArray(elements);
        }
        return oldValue; // Return the old value
    } finally {
        lock.unlock();/ / releases the lock}}Copy the code

When you look at the set method, it’s actually similar to the add method.

Acquire an exclusive lock to ensure that only one thread can modify the array at a time
Gets the current array, calling the get method to get the array elements at the specified location
Determine the value obtained by GET and the parameters passed in
- In order to guarantee volatile semantics, we still need to re-create the array
- If not, copy the original array elements into the new array, and then change the index of the new array, and replace the original array with the new array
Release the lock

Remove elements

The following is the implementation of the remove method

Acquire an exclusive lock to ensure that only one thread can delete the element
Count the number of array elements to move
- If we delete the last element and the result is 0, we replace the array with len-1
- If you do not delete the last element, you can divide the index into two parts, copy them into the new array, and replace them
Release the lock

public E remove(int index) {
    / / acquiring a lock
    final ReentrantLock lock = this.lock;
    / / lock
    lock.lock();
    try {
        // Get the original array
        Object[] elements = getArray();
        // Get the original array length
        int len = elements.length;
        // Get the value of the original array index
        E oldValue = get(elements, index);
        // Since removing elements from an array requires moving, this is how many elements to move
        int numMoved = len - index - 1;
        // Calculate numMoved=0 to delete the last element.
        // Copy len-1 from the original array to the new array, replace the old array
        if (numMoved == 0)
            setArray(Arrays.copyOf(elements, len - 1));
        // It is not the last element to delete
        else {
            // Create an array of length len-1
            Object[] newElements = new Object[len - 1];
            // Copy the elements before index into the new array
            System.arraycopy(elements, 0, newElements, 0, index);
            // Copy the elements after index into the new array
            System.arraycopy(elements, index + 1, newElements, index,
                             numMoved);
            // Replace the array with the new array
            setArray(newElements);
        }
        return oldValue;// Return the old value
    } finally {
        lock.unlock();/ / releases the lock}}Copy the code

The iterator

The basic use of iterators is as follows. The hashNext() method determines if there are any more elements, and the next method returns the specific elements.

CopyOnWriteArrayList list = newCopyOnWriteArrayList(); Iterator<? > itr = list.iterator();while(itr.hashNext()) {
    //do sth
    itr.next();
}
Copy the code

The iterator in CopyOnWriteArrayList is weakly consistent (it gets the iterator first, but it’s not visible to the iterator if another thread changes the list after it gets the iterator). Here’s the implementation in CopyOnWriteArrayList

//Iterator
       itr = list.iterator();
public Iterator<E> iterator(a) {
    GetArray () = oldArray ()
    // We then call the COWIterator constructor to create an iterator object with oldArray as an argument
    // As you can see from the COWIterator class below, one of its members stores a copy of oldArray
    return new COWIterator<E>(getArray(), 0);
}
static final class COWIterator<E> implements ListIterator<E> {
    // Array snapshot version
    private final Object[] snapshot;
    // The element index (array subscript) returned by a subsequent call to next
    private int cursor;
    / / the constructor
    private COWIterator(Object[] elements, int initialCursor) {
        cursor = initialCursor;
        snapshot = elements;
    }
    // Whether the variable ends: the subscript is less than the array length
    public boolean hasNext(a) {
        return cursor < snapshot.length;
    }
    // Whether there are precursors
    public boolean hasPrevious(a) {
        return cursor > 0;
    }
    // Get the element
    //hasNext() returns true and gets the value directly from the subscript of the cursor record
    //hasNext() returns false and raises an exception
    public E next(a) {
        if (! hasNext())
            throw new NoSuchElementException();
        return (E) snapshot[cursor++];
    }
    //other method...
}
Copy the code

In the above code we can see that the list iterator() method actually returns a COWIterator whose snapshot member holds the contents of the current array in the list. But snapshot is a snapshot of the array. Why

We are passing the current array, but it is possible that another thread has changed the array and replaced it, so that the array in the list and the snapshot will not refer to the same array and will exist as a snapshot of the original array. Then the iterator is not accessing the updated array. This is weak consistency

Let’s look at the following example

public class TestCOW {

    private static CopyOnWriteArrayList<String> list = new CopyOnWriteArrayList();

    public static void main(String[] args) throws InterruptedException {
        list.add("item1");
        list.add("item2");
        list.add("item3");

        Thread thread = new Thread() {
            @Override
            public void run(a) {
                list.set(1."modify-item1");
                list.remove("item2"); }};// The main line gets the iterator first
        Iterator<String> itr = list.iterator();
        thread.start();// Start the thread
        thread.join();// Let the main thread wait for the thread to finish executing, and then iterate to see if the output is modified
        while (itr.hasNext()) {
            System.out.println(Thread.currentThread().getName() + "Elements of list in thread :"+ itr.next()); }}}Copy the code

The result is as follows. In fact, we added a few elements to the list, then modified the list in thread, and let the main thread get the list iterator first, wait for thread to finish executing, and then print the elements in the list. We found that the main thread didn’t notice the array change in the list. The output is still the same list, which is a sign of weak consistency.

List element: Item1. List element: Item2. List element: Item3

conclusion

How is CopyOnWriteArrayList guaranteedwriteThread-safe: Use ReentrantLock to exclusive the lock, ensuring that only one thread is working on the collection at a timewriteoperation
The data is stored in an array in CopyOnWriteArrayList, and the array length changes dynamically (writeThis will update the array.)
When you modify an array, instead of directly manipulating the array, you copy a new array, and then replace the old array with the new array
When using the iterator for traversing don’t lock, not throw ConcurrentModificationException, because use iterators iterate through the operation is a copy of the array (of course, this is in the other thread to modify a list)

Set method details

Notice a piece of code in the set method that looks like this

else { //oldValue = element (element is the parameter passed)
    // Not quite a no-op; ensures volatile write semantics
    // To ensure volatile semantics, replace the array with the new one, even if no changes have been made
    setArray(elements);
}
Copy the code

If you want to specify the position, the value you want to change is the same as the value in the array, but you still need to call the set method to update the array. Why? Take a look at the passage first

The methods of all classes in java.util.concurrent and their subpackages extend these guarantees for higher levels of synchronization. In particular: the action before a thread puts an object in any concurrent collection happens -before ** subsequent action ** to access or remove the element from a collection in another thread.

Think of this as ensuring that the series of operations that precede the set operation happen-before are followed by other threads accessing the array (without locking). See the example below

// This is the initial case for two threads
int nonVolatileField = 0; // A variable not modified by volatile
/ / pseudo code
CopyOnWriteArrayList<String> list = {"x"."y"."z"}

// Thread 1
// (1) Here nonVolatileField is updated
nonVolatileField = 1;
// (2) This is the set() modification (write) operation. Note that this is the write operation on a volatile array
list.set(0."x");

// Thread 2
// (3) This is the access (read) operation
String s = list.get(0);
// (4) Use nonVolatileField
if (s == "x") {
    int localVar = nonVolatileField;
}
Copy the code

Assuming the above scenario exists, if it is guaranteed that only such trajectories as (1)->(2)->(3)->(4) can exist. According to the convention in the Java API documentation above

(2) occur-before and (3), the operations in the thread are (1) occur-before and (2), (3) occur-before and (4), Reading and writing nonVolatileField variables based on the transitivity of occur-before can have (1) occur-before and (4)

Therefore, the write of Thread 1 to nonVolatileField is visible to the read of Thread a in Thread 2. If setArray(Elements) is not written to volatile variables in the set else of CopyOnWriteArrayList, (2)happen-before and (3) will no longer be available, and the above visibility will not be guaranteed.

So just to make sure that the series of operations that precede the set operation happen before other threads access the array’s (unlocked) subsequent operations,