preface
List interface is very common in Java, common List interface implementation classes such as ArrayList, LinkedList, they have a wide range of applications. However, both lists are thread-unsafe. This article will introduce a thread-safe ArrayList: CopyOnWriteArrayList, which will be analyzed and explained in depth. Initialize, expand, add, delete, change, and check the basic operation and principle of ArrayList.
Basic structure and initialization
The basic structure
public class CopyOnWriteArrayList<E>
implements List<E>, RandomAccess.Cloneable.java.io.Serializable {
private static final long serialVersionUID = 8673264195747942595L;
/** * The lock protecting all mutators. (We have a mild preference * for builtin monitors over ReentrantLock when either will do.) */
// Writers' idiosyncrasies
final transient Object lock = new Object(a);// An array of data
private transient volatile Object[] array;
}
Copy the code
First look at the basic structure of CopyOnWriteArrayList, as shown above. An Object lock is used as a Synchronized lock, and an Object array is used to store data, named array. It is volatile and ensures visibility to different threads.
As noted in the comments, the authors of the source code prefer Synchronized locks when both Java’s built-in Synchronized locks and ReentrantLocks are available.
Initialize the
// No arguments
public CopyOnWriteArrayList() {
setArray(new Object[0]);
}
// Use set to construct
public CopyOnWriteArrayList(Collection<? extends E> c) {
Object[] es;
if(c.getClass() == CopyOnWriteArrayList.class) es = ((CopyOnWriteArrayList<? >)c).getArray();else {
es = c.toArray();
if(c.getClass() ! = java.util.ArrayList.class) es = Arrays.copyOf(es, es.length,Object[].class);
}
setArray(es);
}
// Use an array
public CopyOnWriteArrayList(E[] toCopyIn) {
setArray(Arrays.copyOf(toCopyIn, toCopyIn.length, Object[].class));
}
/ / setArray method
final void setArray(Object[] a) {
array = a;
}
/ / getArray method
final Object[] getArray() {
return array;
}
Copy the code
Three initialization methods:
- The one without parameters is the simplest, needless to say;
- Construct with collections: If the collection is also CopyOnWriteArrayList, then simply make es point to the array that the other array points to, otherwise, run ArrayList if it is
es = c.toArray();
If not ArrayList, executees = Arrays.copyOf(es, es.length, Object[].class);
Somebody asked, this array is es, and then callArrays.copyOf(es, es.length, Object[].class)
What’s the point? This three-parameter version of Array. copyOf just gets a new array of es elements with es. Length, but Object type, so it still makes sense. For some collections, the toArray method may return an array of type other than Object. I also talked about this in the ArrayList blog. - The three-parameter arrays. copyOf method is already covered and not explained.
The write operation
So let’s look at write operations first, so in CopyOnWriteArrayList, if you make any changes to the original array, it’s a write operation, it’s a lock, even if it’s just a set method. (Recall that ArrayList has a modCount property that indicates if the array has changed structurally, but ArrayList’s set method does not change modCount, so CopyOnWriteArrayList is very strict for photo.)
Set method
public E set(int index, E element) {
/ / lock
synchronized (lock) {
Object[] es = getArray();
E oldValue = elementAt(es, index);
// It does need to be modified
if(oldValue ! = element) {// Make a copy
es = es.clone();
/ / modify
es[index] = element;
}
setArray(es);
returnoldValue; }}Copy the code
As you can see, a set needs to grab a lock. After all, it is a thread-safe collection, which MAKES sense to me. If you do need to modify, then es = es.clone(); Make es point to the cloned array. The name is es, but now there are two sets of numbers. Finally, run setArray(es); Make array point to the new array.
It is not hard to see that just a simple set operation, but to copy the entire array, if the array is large, the efficiency will be less.
The add operation
Add an element
There are several versions of this method, but let’s start with the simplest two:
public boolean add(E e) {
synchronized (lock) {
Object[] es = getArray();
int len = es.length;
// Clone a new array of length + 1
es = Arrays.copyOf(es, len + 1);
// End assignment
es[len] = e;
setArray(es);
return true;
}
}
public void add(int index, E element) {
synchronized (lock) {
Object[] es = getArray();
int len = es.length;
if (index > len || index < 0)
throw new IndexOutOfBoundsException(outOfBounds(index, len));
Object[] newElements;
// The number of elements to move
int numMoved = len - index;
// Do not move, direct tail assignment
if (numMoved == 0)
newElements = Arrays.copyOf(es, len + 1);
// Otherwise, move
else {
newElements = new Object[len + 1];
System.arraycopy(es, 0, newElements, 0, index);
System.arraycopy(es, index, newElements, index + 1,
numMoved);
}
/ / assignmentnewElements[index] = element; setArray(newElements); }}Copy the code
The first add method appends an element to the end, cloning an array of length + 1 and assigning it. The second add method is similar to the add method of an ArrayList, which looks at how many elements need to be moved back, moves them accordingly, and assigns values at index. But there are differences:
System.arraycopy(elementData, index,
elementData, index + 1,
s - index);
elementData[index] = element;
Copy the code
An ArrayList just moves the elements after index back, not before. CopyOnWriteArrayList does all of that, which makes sense, because CopyOnWriteArrayList declares a new array, and of course everything gets copied.
You can also see that CopyOnWriteArrayList doesn’t need to be expanded. Why? ArrayList needs to be expanded because, at least until the next expansion, the add operation is still assigned to the elemantData array. It does not create a new array every time it is added, so it can expand the capacity a bit more and avoid frequent expansion and linear time complexity of copying elements. CopyOnWriteArrayList copies a new array every time you add it, so why expand it? Even if you expand it, you’ll copy a new array next time. So CopyOnWriteArrayList makes a new array of length plus 1 every time I add it.
And then addIfAbsent, which is a little harder:
public boolean addIfAbsent(E e) {
Object[] snapshot = getArray();
// use the logic expression short circuit property
return indexOfRange(e, snapshot, 0, snapshot.length) < 0
&& addIfAbsent(e, snapshot);
}
private boolean addIfAbsent(E e, Object[] snapshot) {
synchronized (lock) {
// Actually add elements, a bunch of code, and look at it later}}Copy the code
IndexOfRange (e, snapshot, 0, snapshot.length);
private static int indexOfRange(Object o, Object[] es, int from, int to) {
if (o == null) {
for (int i = from; i < to; i++)
if (es[i] == null)
return i;
} else {
for (int i = from; i < to; i++)
if (o.equals(es[i]))
return i;
}
return -1;
}
Copy the code
There’s nothing to explain. Look for elements from beginning to end!
AddIfAbsent (e, snapshot, 0, snapshot. Length) < 0 && addIfAbsent(e, snapshot); When the front part is true, that is, there is no element, the real Add will be called further, otherwise it won’t go down at all. This kind of short-circuiting is all too common in Java source code… Let’s look at the real add:
private boolean addIfAbsent(E e, Object[] snapshot) {
synchronized (lock) {
// The array to which the new array points
Object[] current = getArray();
int len = current.length;
// The array has changed
if(snapshot ! = current) { int common =Math.min(snapshot.length, len);
for (int i = 0; i < common; i++)
if(current[i] ! = snapshot[i] && Objects.equals(e, current[i]))// This element appears in the first half of current, return false
return false;
The element // // appears in the second half of current, return false
if (indexOfRange(e, current, common, len) >= 0)
return false;
}
Object[] newElements = Arrays.copyOf(current, len + 1);
newElements[len] = e;
setArray(newElements);
return true; }}Copy the code
Since this element is not in the array, we can just append it to the end. Why do we need all this code? Object[] snapshot = getArray(); But this is only a snapshot, the content of this snapshot is determined at the moment of generation, will not change. As we know from the above analysis of some writes, CopyOnWriteArrayList changes the array reference whenever there are writes. So maybe you took a little too long to do an indexOfRange on snapshot. Although you didn’t find the element in snopShot, maybe during that time some other write changed the array reference of CopyOnWriteArrayList. When you’re done looking for snopshot and the element doesn’t exist, some other operation causes the element to be in the reference array of CopyOnWriteArrayList. Synchronized (lock) {synchronized (lock) {synchronized (lock) {synchronized (lock) {synchronized (lock) {synchronized (lock) {
That explains why we have all this code, so let’s take a closer look at this code. Get current snapshot of CopyOnWriteArrayList and len of its length. Then determine if (snapshot! = current), if the two snapshots are the same, then great, don’t worry about what I said above, just append elements!
Otherwise, it goes into the if block. int common = Math.min(snapshot.length, len); For (int I = 0; i < common; I++) loops through the search. The code looked up here is also interesting:
if(current[i] ! = snapshot[i] && Objects.equals(e, current[i]))// This element appears in the first half of current, return false
return false;
Copy the code
This is a short circuit operation again. The first part of the logical expression: current[I]! = snapshot[I]; Current [I] == snapshot[I]; current[I] == snapshot[I]; Only when current[I]! Current [I] = true; objects.equals (e, current[I]); Return false (T_T). The == operation is faster than objects.equals ().
If (indexOfRange(e, current, common, len) >= 0) Int common = math.min (snapshot.length, len); There are so many elements, maybe not all of them.
After the comparison, if the element is still missing in the new snapshot, append it!
No other write will change the array reference during the current snapshot. The answer is no, notice that the private Boolean addIfAbsent function is essentially a lock grabber, so if you get the current snapshot, no other thread can write to change the array reference until you leave the code.
Add a bunch of elements
Let’s start with the simple version:
public boolean addAll(Collection<? extends E> c) {
// Get the corresponding elements
Object[] cs = (c.getClass() == CopyOnWriteArrayList.class) ? ((CopyOnWriteArrayList<? >)c).getArray() : c.toArray();// Nothing to add
if (cs.length == 0)
return false;
synchronized (lock) {
Object[] es = getArray();
int len = es.length;
Object[] newElements;
// If the array length is 0 and the collection is ArrayList or CopyOnWriteArrayList, assign the value directly
if (len == 0 && (c.getClass() == CopyOnWriteArrayList.class ||
c.getClass() == ArrayList.class)) {
newElements = cs;
// Otherwise, append these elements to the tail
} else {
newElements = Arrays.copyOf(es, len + cs.length);
System.arraycopy(cs, 0, newElements, len, cs.length);
}
setArray(newElements);
return true; }}Copy the code
This code is also fairly simple: after some judgment (for example, if the set to be added is empty, return false), append the element to the end of the array. Public Boolean addAll(int index, Collection
c) extends E> c) extends E> c) extends E> c) extends E> c) extends E> c) extends E> c) extends E> c) extends E> c) extends E> c) extends E> c Look at this one, which is a little harder:
public int addAllAbsent(Collection<? extends E> c) {
Object[] cs = c.toArray();
if(c.getClass() ! = ArrayList.class) { cs = cs.clone(); }// Set is empty, no need to add, return directly
if (cs.length == 0)
return 0;
synchronized (lock) {
// A bunch of complex code, explained later}}Copy the code
This function adds elements to the array if there are no elements in the array. And then we get an array of the elements of that set. This one has this judgment:
if(c.getClass() ! = ArrayList.class) { cs = cs.clone(); }Copy the code
The toArray of some collections returns its own array, for fear of contaminate the original array of collections. Then, if the set is empty, it simply returns, and then looks at the code in the synchronized block, which actually adds:
synchronized (lock) {
// Get your own snapshot, after which no thread will change the snapshot
Object[] es = getArray();
int len = es.length;
int added = 0;
// Add
for (int i = 0; i < cs.length; ++i) {
Object e = cs[i];
// e cannot appear in CopyOnWriteArrayList
// Also cannot add duplicate e
if (indexOfRange(e, es, 0, len) < 0 &&
indexOfRange(e, cs, 0, added) < 0)
cs[added++] = e;
}
// There are qualified elements to add
if (added > 0) {
// append the tail
Object[] newElements = Arrays.copyOf(es, len + added);
System.arraycopy(cs, 0, newElements, len, added);
setArray(newElements);
}
return added;
}
Copy the code
Object e = cs[I]; And then we have an if, the first condition indexOfRange(e, es, 0, len) < 0 and it makes sense that this element, e, really doesn’t exist in our array, but what about the second condition? IndexOfRange (e, cs, 0, Added) < 0, note that added is initialized to 0, so if the first condition does not exist in CopyOnWriteArrayList, the second condition must be true, and added becomes 1. If a second e appears that satisfies the first condition, go through the second condition, meaning that not only can this E not appear in CopyOnWriteArrayList, but also cannot appear in cs[0:added], that is, cannot repeatedly add elements. This is also consistent with the semantics of addAllAbsent, so if you add the same element twice, it might conflict a little bit with this Absent.
Finally, the element cs[0:added] is added to the tail.
The remove operation
Remove an element
The easiest way to delete is by subscript:
public E remove(int index) {
synchronized (lock) {
Object[] es = getArray();
int len = es.length;
E oldValue = elementAt(es, index);
int numMoved = len - index - 1;
Object[] newElements;
// Delete the last element
if (numMoved == 0)
newElements = Arrays.copyOf(es, len - 1);
// Otherwise, copy the elements and move
else {
newElements = new Object[len - 1];
System.arraycopy(es, 0, newElements, 0, index);
System.arraycopy(es, index + 1, newElements, index,
numMoved);
}
setArray(newElements);
returnoldValue; }}Copy the code
Int numMoved = len-index – 1; NewElements = array.copyof (es, len-1); newElements = array.copyof (es, len-1); Otherwise, copy the element at 0-index to newElements, and move the element after index forward to newElements.
Delete by element:
// Check if the element is available
public boolean remove(Object o) {
Object[] snapshot = getArray();
int index = indexOfRange(o, snapshot, 0, snapshot.length);
// Remove from snapshot
return index >= 0 && remove(o, snapshot, index);
}
private boolean remove(Object o, Object[] snapshot, int index) {
synchronized (lock) {
// A bunch of operations, more on that later}}Copy the code
Return index >= 0 && remove(o, snapshot, index); , still short circuit operation. Now look at the synchronized code block:
synchronized (lock) {
// Get the current snapshot
Object[] current = getArray();
int len = current.length;
if(snapshot ! = current) findIndex: { int prefix =Math.min(index, len);
// The first half
for (int i = 0; i < prefix; i++) {
if(current[i] ! = snapshot[i] && Objects.equals(o, current[i])) { index = i;breakfindIndex; }}// Some boundary judgments
if (index >= len)
return false;
if (current[index] == o)
break findIndex;
// Find the second part (if any)
index = indexOfRange(o, current, index, len);
if (index < 0)
return false;
}
// The general operation to delete the element at index
Object[] newElements = new Object[len - 1];
System.arraycopy(current, 0, newElements, 0, index);
System.arraycopy(current, index + 1,
newElements, index,
len - index - 1);
setArray(newElements);
return true;
}
Copy the code
if (snapshot ! = current) if it is false, delete it directly.
Otherwise, you have to go through the if code. Int prefix = math.min (index, len); If (current[I]! = if (current[I]! = if (current[I]! = if (current[I]! = if (current[I]! = snapshot[I] && objects.equals (o, current[I])) { Only the current [I]! = snapshot[I] = objects.equals (o, current[I]). If I is found that satisfies objects.equals (o, current[I]), assign it to index, and break pops out;
If index >= len is not found, return false; if index >= len is not found, return false. If (current[index] == o); if (current[index] == o); Index = indexOfRange(o, current, len); If (index < 0), return false.
If you find the element, run the code that deletes the element at index.
Remove a bunch of elements
Let’s start with the simple version:
void removeRange(int fromIndex, int toIndex) {
synchronized (lock) {
Object[] es = getArray();
int len = es.length;
// subscript check
if (fromIndex < 0 || toIndex > len || toIndex < fromIndex)
throw new IndexOutOfBoundsException();
/ / the new length
int newlen = len - (toIndex - fromIndex);
// How many elements move forward
int numMoved = len - toIndex;
// No elements need to be moved
if (numMoved == 0)
setArray(Arrays.copyOf(es, newlen));
// Perform a series of moves
else {
Object[] newElements = new Object[newlen];
System.arraycopy(es, 0, newElements, 0, fromIndex); System.arraycopy(es, toIndex, newElements, fromIndex, numMoved); setArray(newElements); }}}Copy the code
Array.copyof (es, newlen)) setArray(Array.copyof (es, newlen)); Otherwise, copy the elements from 0 to fromIndex and then toIndex to Len. It’s just shuffle plus copy. A little harder:
// Delete the element in the array in collection C
public boolean removeAll(Collection
c) {
Objects.requireNonNull(c);
return bulkRemove(e -> c.contains(e));
}
// Keep the elements in the array in c, delete the elements not in C
public boolean retainAll(Collection
c) {
Objects.requireNonNull(c);
returnbulkRemove(e -> ! c.contains(e)); } private booleanbulkRemove(Predicate<? super E> filter) {
synchronized (lock) {
return bulkRemove(filter, 0, getArray().length); }}Copy the code
The conditions for deletion are represented by lambda expressions, and bulkRemove is a functional interface to the Predicate. Take a look at the three-argument version of bulkRemove.
This is a clever way to delete a long array and a bit operation. It is not very easy to understand. I have described this method in detail in this blog post. RemoveIf, I’m not going to copy and paste it here, but it’s an interesting idea.
So let’s stop here, and what about replace methods? They’re too simple to make sense. To summarize, any change to the array must trigger a copy of all elements of the array, as well as the setArray method, even the simplest set method.
A read operation
Read operation is much easier, here are a few common read operation code:
/ / get methods
public E get(int index) {
return elementAt(getArray(), index);
}
static <E> E elementAt(Object[] a, int index) {
return (E) a[index];
}
// find the subscript according to the element
public int indexOf(Object o) {
Object[] es = getArray();
return indexOfRange(o, es, 0, es.length);
}
private static int indexOfRange(Object o, Object[] es, int from, int to) {
if (o == null) {
for (int i = from; i < to; i++)
if (es[i] == null)
return i;
} else {
for (int i = from; i < to; i++)
if (o.equals(es[i]))
return i;
}
return -1;
}
Copy the code
As you can see, reading doesn’t require locking or copying array elements, it calls a getArray method, takes the snapshot, and looks it up.
CopyOnWriteArrayList
After reading most of the CopyOnWriteArrayList read and write operation code, we can summarize its read and write characteristics:
- Write operations lock, lock, only one thread can write, and write to copy the entire array of elements
- Read operations do not lock and call the getArray method to get a snapshot, which is searched
- Writing is mutually exclusive (because locks are stolen), reading is not mutually exclusive, reading is not mutually exclusive
- The result of the read operation may not be the latest because it is only the contents of a snapshot. The result of the read operation is determined as soon as the snapshot is created
- Write operations do not modify the existing array/snapshot directly, so read operations do not have to worry about reading while snapshot is being modified by other write operations
- After that, setArray will change the array array of the current CopyOnWriteArrayList
The iterator
Initialization and basic structure
This is a function of several iterators that pass the COWIterator method a snapshot of getArray and the starting index.
public Iterator<E> iterator() {
return new COWIterator<E>(getArray(), 0);
}
public ListIterator<E> listIterator() {
return new COWIterator<E>(getArray(), 0);
}
public ListIterator<E> listIterator(int index) {
Object[] es = getArray();
int len = es.length;
if (index < 0 || index > len)
throw new IndexOutOfBoundsException(outOfBounds(index, len));
return new COWIterator<E>(es, index);
}
Copy the code
Let’s take a look at its basic structure and initialization:
static final class COWIterator<E> implements ListIterator<E> {
// Snapshot of the iterator
private final Object[] snapshot;
// The array index to be accessed
private int cursor;
/ / initialization
COWIterator(Object[] es, int initialCursor) {
cursor = initialCursor;
snapshot = es;
}
// Other operation codes
}
Copy the code
As you can see, the iterator itself maintains a snapshot of CopyOnWriteArrayList with cursor as the next array index to access.
Next and Previous methods
public boolean hasNext() {
return cursor < snapshot.length;
}
public boolean hasPrevious() {
return cursor > 0;
}
@SuppressWarnings("unchecked")
public E next() {
if (! hasNext())
throw new NoSuchElementException();
return (E) snapshot[cursor++];
}
@SuppressWarnings("unchecked")
public E previous() {
if (! hasPrevious())
throw new NoSuchElementException();
return (E) snapshot[--cursor];
}
Copy the code
Look at if (! HasNext ()) or if (! HasPrevious ()), the code level is really simple, not to say.
The write operation
public void remove() {
throw new UnsupportedOperationException();
}
public void set(E e) {
throw new UnsupportedOperationException();
}
public void add(E e) {
throw new UnsupportedOperationException();
}
Copy the code
Doug Lea doesn’t want the iterator of CopyOnWriteArrayList to have the ability to modify…
Iterator summary
Iterators also do this by taking a snapshot of CopyOnWriteArrayList at some point in time.
Copy ideas as you write
Copy on write in Linux
When I was in the operating system class, I remember the teacher assigned this kind of parent-child process creation and communication homework. In Linux applications, the fork () will produce a child the same and the parent process, for the sake of efficiency, Linux is no one by copying the data of the parent for the child, but the introduction of the “write copy” technology, which is only the content of the process space of paragraphs to change, will be a copy of the contents of the parent for the child process.
The application of the idea of copying while writing
Redis RDB persistence
At my current level, in addition to the Java CopyOnWriteArrayList and CopyOnWriteArraySet (based on CopyOnWriteArrayList, very little code), The application also knows about Redis’ RDB persistence.
There are two kinds of RDB persistence. One is to call save and persist by the main process, which will cause Redis blocking of single process for a long time. It is generally not recommended. The other option is to call bgSave and fork a child process to persist. RDB persistence is read-only and does not modify the original data, so copy-on-write is also a good idea. The snapshots available to the RDB are determined as soon as the child process is created.
conclusion
This article mainly on CopyOnWriteArrayList of the main source code for the analysis, due to the workload is relatively large, some source code such as subList did not speak, you can be interested in their own to see, relatively simple.