This article explores thread safety through an ArrayList-related thread-safety issue that was discovered during a code review.
Case analysis
A couple of days ago, during a code review, I saw a colleague write something like the following:
List<String> resultList = new ArrayList<>();
paramList.parallelStream().forEach(v -> {
    String value = doSomething(v);
    resultList.add(value);
});
As I understood it, ArrayList is not thread safe, yet here the same ArrayList object is modified by multiple threads. This looked wrong to me, so I read the ArrayList implementation to confirm the problem and to review the relevant background.
First, a definition:
Thread safety is a programming term. A function or library is thread safe if, when called in a multi-threaded environment, it correctly handles the variables shared between threads so that the program still behaves correctly. (Wikipedia)
Let’s take a look at the ArrayList source code for key information relevant to this topic:
public class ArrayList<E> extends AbstractList<E>
        implements List<E>, RandomAccess, Cloneable, java.io.Serializable
{
    // ...
    /**
     * The array buffer into which the elements of the ArrayList are stored.
     * The capacity of the ArrayList is the length of this array buffer...
     */
    transient Object[] elementData; // non-private to simplify nested class access
    /**
     * The size of the ArrayList (the number of elements it contains).
     */
    private int size;
    // ...
    /**
     * Appends the specified element to the end of this list...
     */
    public boolean add(E e) {
        ensureCapacityInternal(size + 1);  // Increments modCount!!
        elementData[size++] = e;
        return true;
    }
    // ...
}
Here are a few things to note about ArrayList:
- It uses an array, elementData, to store the data.
- It uses an int member variable, size, to record the actual number of elements.
- The add method's logic and execution order:
  - Execute ensureCapacityInternal(size + 1): check whether elementData has enough capacity; if not, grow it by half (allocate a new, larger array, copy the contents of elementData into it, and then assign the new array to elementData).
  - Execute elementData[size] = e;
  - Execute size++
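The steps above can be sketched with a small stand-in class. This is a simplified illustration, not the JDK source: the class name SimpleList, the initial capacity of 2, and the helper methods size() and capacity() are all assumptions made for the example.

```java
import java.util.Arrays;

// Hypothetical, simplified stand-in for ArrayList (NOT the JDK implementation).
class SimpleList {
    private Object[] elementData = new Object[2]; // assumed initial capacity
    private int size;

    public boolean add(Object e) {
        ensureCapacity(size + 1); // step 1: make sure there is room for one more element
        elementData[size] = e;    // step 2: store the element at index size
        size++;                   // step 3: increase the element count
        return true;
    }

    private void ensureCapacity(int minCapacity) {
        if (minCapacity > elementData.length) {
            // Grow by half of the old capacity, but at least to minCapacity.
            int grown = elementData.length + (elementData.length >> 1);
            int newCapacity = Math.max(grown, minCapacity);
            // Allocate a larger array, copy the old contents, then swap the reference.
            elementData = Arrays.copyOf(elementData, newCapacity);
        }
    }

    public int size() { return size; }

    public int capacity() { return elementData.length; }

    public static void main(String[] args) {
        SimpleList list = new SimpleList();
        for (int i = 0; i < 5; i++) {
            list.add(i);
        }
        System.out.println("size=" + list.size() + ", capacity=" + list.capacity());
    }
}
```

Note that none of the three steps is protected by a lock, which is what makes the interleavings discussed next possible.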
To make the thread-safety issue discussed here easier to understand, let's take the simplest execution path. Suppose two threads, A and B, call ArrayList.add at the same time, while elementData has a capacity of 8 and size is 7, so there is room for one more element. What might happen?
One possible order of execution is:
- Threads A and B both execute ensureCapacityInternal(size + 1) at the same time. Because 7 + 1 does not exceed the capacity of elementData, which is 8, no expansion happens.
- Thread A executes elementData[size++] = e; first. At this point, size becomes 8.
- Thread B then executes elementData[size++] = e;. Because the length of the elementData array is 8 but the code accesses elementData[8], the array index is out of bounds.
The program throws an exception and cannot run correctly, which is clearly thread-unsafe by the definition of thread safety quoted above.
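To make that ordering concrete, here is a minimal single-threaded replay of it. It is only a sketch: the local variables elementData and size mimic the ArrayList fields, the class name InterleavingReplay is made up for the example, and the two "threads" are simply statements written in the order described above.

```java
// Single-threaded replay of the interleaving described above (illustrative only).
class InterleavingReplay {

    /** Returns the index of the out-of-bounds write, or -1 if no such write happened. */
    static int replay() {
        Object[] elementData = new Object[8]; // capacity 8
        int size = 7;                         // 7 elements already stored

        // Threads A and B both run the capacity check while size is still 7.
        boolean aNeedsGrow = size + 1 > elementData.length; // false: 8 is not greater than 8
        boolean bNeedsGrow = size + 1 > elementData.length; // false: same stale view
        if (aNeedsGrow || bNeedsGrow) {
            throw new IllegalStateException("unexpected: no growth is needed in this scenario");
        }

        // Thread A executes elementData[size++] = e.
        int indexA = size;
        elementData[indexA] = "A";
        size = indexA + 1; // size is now 8

        // Thread B executes elementData[size++] = e, relying on its earlier check.
        int indexB = size; // 8, one past the last valid index
        try {
            elementData[indexB] = "B";
            size = indexB + 1;
        } catch (ArrayIndexOutOfBoundsException ex) {
            return indexB;
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println("out-of-bounds write at index " + replay());
    }
}
```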
Verifying with sample code
With that in mind, let's write a simple example to verify that the problem above can really occur:
List<Integer> resultList = new ArrayList<>();
List<Integer> paramList = new ArrayList<>();
int length = 10000;
for (int i = 0; i < length; i++) {
    paramList.add(i);
}
paramList.parallelStream().forEach(resultList::add);
The code above may run normally, but it is more likely to throw the following exception:
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:598)
at java.util.concurrent.ForkJoinTask.reportException(ForkJoinTask.java:677)
at java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:735)
at java.util.stream.ForEachOps$ForEachOp.evaluateParallel(ForEachOps.java:160)
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(ForEachOps.java:174)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:583)
at concurrent.ConcurrentTest.main(ConcurrentTest.java:18)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 1234
at java.util.ArrayList.add(ArrayList.java:465)
at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1384)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
at java.util.stream.ForEachOps$ForEachTask.compute(ForEachOps.java:291)
at java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:731)
at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1067)
at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1703)
at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:172)
In my tests, when length is small the problem is hard to reproduce, because the list reaches a capacity boundary (where it has to grow) only a few times. When length is large, the exception is thrown very often.
Besides throwing this exception, the scenario above can cause data to be overwritten or lost, and the actual number of elements in the ArrayList may not match the value of size.
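One interleaving that overwrites an element can be replayed on a single thread in the same way. The sketch below relies on the fact that size++ is a separate read and write rather than an atomic step; the class name LostUpdateReplay and the starting size of 3 are assumptions made for illustration.

```java
// Single-threaded replay of a lost update on elementData[size++] = e (illustrative only).
class LostUpdateReplay {

    static String replay() {
        Object[] elementData = new Object[8];
        int size = 3; // pretend three elements are already stored

        // Both threads read size before either of them writes it back.
        int indexA = size; // thread A sees 3
        int indexB = size; // thread B also sees 3

        // Thread A stores its element and publishes the new size.
        elementData[indexA] = "A";
        size = indexA + 1;

        // Thread B stores into the same slot and publishes the same size.
        elementData[indexB] = "B"; // overwrites "A"
        size = indexB + 1;

        // Two add calls ran, but size grew by one and only "B" survived.
        return "size=" + size + ", slot3=" + elementData[3];
    }

    public static void main(String[] args) {
        System.out.println(replay()); // prints "size=4, slot3=B"
    }
}
```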
The solution
A common and effective solution to this problem is to synchronize (lock) access to the shared resource.
After I raised this in the code review, my colleague changed this line from the first snippet
List<String> resultList = new ArrayList<>();
to
List<String> resultList = Collections.synchronizedList(new ArrayList<>());
This actually ends up using SynchronizedRandomAccessList. Looking at its implementation, it also locks internally: it holds a List and uses the synchronized keyword to control read and write access to that list. This is one approach: use a thread-safe collection. We could also use Vector or other similar classes to solve the problem.
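As a quick check of this approach, the earlier test program can be rerun with the synchronized wrapper. This is a minimal sketch: the class name SynchronizedListDemo and the helper method collect are made up for the example, and the input list is built with IntStream instead of a for loop only for brevity.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Reruns the earlier experiment with a synchronized wrapper around the ArrayList.
class SynchronizedListDemo {

    static List<Integer> collect(int length) {
        List<Integer> paramList = IntStream.range(0, length)
                .boxed()
                .collect(Collectors.toList());
        // Every call on the wrapper is made while holding the wrapper's internal lock.
        List<Integer> resultList = Collections.synchronizedList(new ArrayList<>());
        paramList.parallelStream().forEach(resultList::add);
        return resultList;
    }

    public static void main(String[] args) {
        List<Integer> result = collect(10000);
        System.out.println(result.size()); // prints 10000
    }
}
```

All elements arrive and no exception is thrown, but their order in resultList is not guaranteed to match paramList, because the parallel tasks add elements in whatever order they happen to run.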
Another way is to manually lock the critical section of code. For example, we can change
resultList.add(value);
to
synchronized (mutex) {
    resultList.add(value);
}
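Put into a complete program, manual locking might look like the sketch below. The names ManualLockDemo, collect, and the body of doSomething are hypothetical, and mutex is assumed to be a plain Object shared by all threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.IntStream;

// Guards only the shared-list update with an explicit lock object.
class ManualLockDemo {

    // Hypothetical stand-in for the real per-element work.
    static int doSomething(int v) {
        return v * 2;
    }

    static List<Integer> collect(int length) {
        List<Integer> resultList = new ArrayList<>();
        final Object mutex = new Object(); // assumed lock object shared by all worker threads
        IntStream.range(0, length).parallel().forEach(v -> {
            int value = doSomething(v); // still runs in parallel, outside the lock
            synchronized (mutex) {
                resultList.add(value);  // only one thread at a time touches the list
            }
        });
        return resultList;
    }

    public static void main(String[] args) {
        System.out.println(collect(10000).size()); // prints 10000
    }
}
```

Keeping doSomething outside the synchronized block means only the cheap add call is serialized, so the per-element work can still run concurrently.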
Summary
Java 8 parallel streams make parallel processing very convenient and make programs quicker to write, but when coding we should stay vigilant about any use of multithreading and consciously guard against problems like this one.
Likewise, when doing code review we should stay alert to any scenario that involves multiple threads, so that hidden problems are caught before the code is merged.
Reference
- Thread safety – Wikipedia