CopyOnWriteArrayList is a concurrency utility class introduced by Doug Lea in JDK1.5, CopyOnWriteArrayList is actually a thread-safe ArrayList, but it’s a little different and it’s a little bit like the relationship between HashMap and ConcurrentHashMap. All modification operations (add/set, etc.) copy the underlying dependent array and modify it on top of it. However, we know that copying an array is a time-consuming operation, so it is usually used in scenarios with more reads and less writes, such as random access and traversal.
The working principle of
First of all, CopyOnWriteArrayList has a reentrant lock to keep it thread-safe when modifying (add/set, etc.), and an array to store the actual data and use volatile to keep it visible.
final transient ReentrantLock lock = new ReentrantLock();
private transient volatile Object[] array;
final Object[] getArray() {
return array;
}
final void setArray(Object[] a) {
array = a;
}
Copy the code
How ADD() works
If you look at the code for ArrayList, CopyOnWriteArrayList is a lot easier.
/** * Creates an empty list. */ public CopyOnWriteArrayList() { setArray(new Object[0]); } public boolean add(E e) { final ReentrantLock lock = this.lock; lock.lock(); try { Object[] elements = getArray(); int len = elements.length; Object[] newElements = Arrays.copyOf(elements, len + 1); newElements[len] = e; setArray(newElements); return true; } finally { lock.unlock(); }}Copy the code
We’ll notice that CopyOnWriteArrayList initializes an empty array by default, and that add() doesn’t measure the current array’s capacity like ArrayList does and expand it (like ensureCapacity). The basic steps for adding elements to the array:
-
Will try to lock first
-
The getArray() method is called to get a reference to the current array and save it to a local variable. Using this method, you can remove a GETFIELD call and save a snapshot of the current reference. This way, even if other threads modify the reference concurrently, at least the consistency of the method execution is guaranteed. Of course, the direct lock guarantees that there will be no concurrent modification, so there is no problem.
-
Copy the contents of the current array to a new array whose size is the length of the old array +1, so each new operation causes the length of CopyOnWriteArrayList to increase.
-
Add the elements to the new array when the copy is complete.
-
Replace the current array with a new one, and use the volatile modifier to ensure subsequent visibility to other threads
All other modification methods are the same, using the same locking mechanism:
public E remove(int index) { final ReentrantLock lock = this.lock; lock.lock(); try { Object[] elements = getArray(); int len = elements.length; E oldValue = get(elements, index); int numMoved = len - index - 1; if (numMoved == 0) setArray(Arrays.copyOf(elements, len - 1)); else { Object[] newElements = new Object[len - 1]; System.arraycopy(elements, 0, newElements, 0, index); System.arraycopy(elements, index + 1, newElements, index, numMoved); setArray(newElements); } return oldValue; } finally { lock.unlock(); }}Copy the code
read
It’s a lot easier to read, but there’s no lock, so there may be a delay in reading the latest data. The getArray() method is a snapshot, and before the modification is complete, We read the contents of the snapshot array, and the same is true for traversal, where the snapshot array is used to construct a new constructor. Therefore, traversal does not need to be locked. However, subsequent add/remove/set operations do not affect the iterator, and iterators do not support remove operations. There would be no throw ConcurrentModificationException.
public E get(int index) {
return get(getArray(), index);
}
public Iterator<E> iterator() {
return new COWIterator<E>(getArray(), 0);
}Copy the code
conclusion
CopyOnArrayList uses scenarios with more reading and less writing, and it is better not to store too many objects. Adding CopyOnArrayList stores more data, then every modification will cause a big object copy, resulting in YGC or even FULL GC, so you must consider the scenario before using. In addition, the latest data cannot be read due to the delay caused by snapshot reads.