ArrayList is one of the most commonly used data storage containers in development. The bottom layer is array implementation, we can store any type of data in the collection. ArrayList is not thread safe, good at random access to elements, slow insertion and deletion.
1. ArrayList data structure
The underlying data structure of an ArrayList is an array whose elements are of type Object. All operations on an ArrayList are based on arrays.
ArrayList is thread-unsafe
Adding elements to an ArrayList takes place in two steps:
- In the first
object[size]
To store the elements to be added; - will
size
Increases the value of.
Since this process is not guaranteed to be atomic in a multithreaded environment, ArrayList is thread-unsafe in a multithreaded environment.
In the single-threaded case, if Size= 0, add an element that is in position 0 and Size=1; If you have multiple threads, let’s say you have two threads, thread A puts the element at position 0 first. However, the CPU scheduling thread A pauses and thread B gets A chance to run. Thread B also adds elements to the ArrayList because Size is still equal to 0
(Note: We assume that adding an element takes two steps, and thread A only completes step 1), so thread B also stores the element at position 0. Then thread A and thread B continue to run, both incrementing Size. Ok, now let’s look at an ArrayList, where there’s really only one element, stored at position 0, and Size is equal to 2. This is called “thread unsafe”.
If you have to use ArrayList in a multithreaded environment, you need to make it thread-safe, and there are usually two solutions:
- Use synchronized keyword;
- The ArrayList can be called using the static method synchronizedList() in the Collections class.
3, ArrayList implementation
In the case of ArrayList, it implements the List interface and uses arrays to hold all elements underneath. The operations are basically operations on arrays. Let’s look at the source code for ArrayList:
1) Private attributes:
The ArrayList definition defines only two private properties of the class:
/** * The array buffer in which the elements of the ArrayList are stored. The size of the ArrayList is the size of the array buffer. When the first element is added, any empty ArrayList with elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA will be expanded to DEFAULT_CAPACITY (10). * /
transient Object[] elementData;
// The size of ArrayList
private int size;
Copy the code
Java’s serialization provides a mechanism for persisting object instances. When an object is persisted, there may be a special object data member that we do not want to use serialization mechanisms to store. To turn off serialization on a field for a particular object, you can prefix the field with the keyword TRANSIENT.
It’s a little abstract, but if you look at an example, it should make sense:
public class UserInfo implements Serializable {
private static final long serialVersionUID = 996890129747019948L;
private String name;
private transient String psw;
public UserInfo(String name, String psw) {
this.name = name;
this.psw = psw;
}
public String toString(a) {
return "name=" + name + ", psw="+ psw; }}public class TestTransient {
public static void main(String[] args) {
UserInfo userInfo = new UserInfo("Zhang"."123456");
System.out.println(userInfo);
try {
// serialize, attributes set to transient are not serialized
ObjectOutputStream o = new ObjectOutputStream(new FileOutputStream("UserInfo.out"));
o.writeObject(userInfo);
o.close();
} catch (Exception e) {
e.printStackTrace();
}
try {
// re-read the content
ObjectInputStream in = new ObjectInputStream(new FileInputStream("UserInfo.out"));
UserInfo readUserInfo = (UserInfo) in.readObject();
// PSW is null after reading
System.out.println(readUserInfo.toString());
} catch(Exception e) { e.printStackTrace(); }}}Copy the code
Properties marked transient are not saved when the object is serialized. Back to the analysis of ArrayList
2) Construction method:
ArrayList provides constructors that can construct an empty list with a default initial capacity of 10, an empty list with a specified initial capacity, and a list containing the elements of a specified collection in the order they are returned by the iterator of the collection.
/** * constructs an empty list with the specified initial capacity. * /
public ArrayList(int initialCapacity) {
if (initialCapacity > 0) {
this.elementData = new Object[initialCapacity];
} else if (initialCapacity == 0) {
this.elementData = EMPTY_ELEMENTDATA;
} else {
throw new IllegalArgumentException("Illegal Capacity: "+ initialCapacity); }}/** * construct an empty list with a default initial capacity of 10 */
public ArrayList(a) {
this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
}
/** * constructs a list of the specified collection elements, in the order returned by the collection iterators. * /
public ArrayList(Collection<? extends E> c) {
Object[] a = c.toArray();
if((size = a.length) ! =0) {
if (c.getClass() == ArrayList.class) {
elementData = a;
} else{ elementData = Arrays.copyOf(a, size, Object[].class); }}else {
// Replace with an empty array.elementData = EMPTY_ELEMENTDATA; }}Copy the code
3) Element storage:
ArrayList provides set(int index, E element), add(E E), add(int index, E element), addAll(Collection<? Extends E> c), addAll(int index, Collection<? Extends E> c) These methods of adding elements. Here we explain one by one:
// Replaces the element at the specified position in this list with the specified element, and returns the element previously at that position.
public E set(int index, E element) {
RangeCheck(index);
E oldValue = (E) elementData[index];
elementData[index] = element;
return oldValue;
}
// Adds the specified element to the end of the list.
public boolean add(E e) {
/ / modCount increase
ensureCapacity(size + 1);
elementData[size++] = e;
return true;
}
// Inserts the specified element at the specified position in this list.
// If there is an element in the current position, move the element at that position and all subsequent elements (their index plus 1) to the right.
public void add(int index, E element) {
// if (index > size || index < 0)
// throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + size);
rangeCheckForAdd(index);
// If the array length is insufficient, it will be expanded.
ensureCapacity(size + 1); / / modCount increase
// Add a size-index element to elementData starting at Index,
// Copy to a new elementData array starting at index+1.
// Move the current element and all subsequent elements right one place.
System.arraycopy(elementData, index, elementData, index + 1, size - index);
elementData[index] = element;
size++;
}
// Add all elements from the collection to the end of the list in the order of the elements returned by the iterator for the specified collection.
public boolean addAll(Collection<? extends E> c) {
Object[] a = c.toArray();
int numNew = a.length;
ensureCapacity(size + numNew); // Increments modCount
System.arraycopy(a, 0, elementData, size, numNew);
size += numNew;
returnnumNew ! =0;
}
// Inserts all elements from the specified collection into this list, starting at the specified location.
public boolean addAll(int index, Collection<? extends E> c) {
if (index > size || index < 0)
throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + size);
Object[] a = c.toArray();
int numNew = a.length;
ensureCapacity(size + numNew); / / modCount increase
int numMoved = size - index;
if (numMoved > 0)
System.arraycopy(elementData, index, elementData, index + numNew, numMoved);
System.arraycopy(a, 0, elementData, index, numNew);
size += numNew;
returnnumNew ! =0;
}
Copy the code
ArrayList is implemented based on arrays, and you can see arrays in properties. How do you implement it? For example, if the array is large, set the value to the specified element. What if the array is not large enough?
Add (E E) calls ensureCapacity(size+1), assigns the element’s index to elementData[size], and then increments size. ElementData [0]=e; elementData[0]=e; elementData[0]=e; Size = 1). Isn’t assigning an element’s index to elementData[size] an array out of bounds? So the key here is in Recapacity (size+1).
4) Element reading:
// Returns the element at the specified position in this list.
public E get(int index) {
rangeCheck(index);
return elementData(index);
}
Copy the code
5) Element deletion:
ArrayList provides deletion by subscript or by specifying an object. As follows:
// Remove the element at the specified position in this list. Move any subsequent elements to the left (subtract one from their index).
public E remove(int index) {
rangeCheck(index);
modCount++;
E oldValue = elementData(index);
int numMoved = size - index - 1;
if (numMoved > 0)
System.arraycopy(elementData, index+1, elementData, index, numMoved);
elementData[--size] = null; // clear to let GC do its work
return oldValue;
}
Copy the code
The first step is to check the range, modify modCount, keep the element to be removed, move the element after the removed position forward one position, null the last element of the list, and return the removed element.
// Remove the first occurrence of the specified element (if present) from this list. If the list does not contain the element, it remains unchanged.
public boolean remove(Object o) {
if (o == null) {
for (int index = 0; index < size; index++)
if (elementData[index] == null) {
fastRemove(index);
return true; }}else {
for (int index = 0; index < size; index++)
if (o.equals(elementData[index])) {
fastRemove(index);
return true; }}return false;
}
// Private delete method that skips bounds checking and does not return the deleted value.
private void fastRemove(int index) {
modCount++;
int numMoved = size - index - 1;
if (numMoved > 0)
System.arraycopy(elementData, index+1, elementData, index, numMoved);
elementData[--size] = null;
}
Copy the code
As you can see from the code first, this returns true if the removal is successful and false otherwise. Remove (Object O) iterates through the element to see if there is an incoming Object, and when it does, fastRemove is called to remove the Object. Why not remove the element by remove(index)? Because fastRemove skips determining the boundary, finding the element is equivalent to confirming that the index does not exceed the boundary, and fastRemove does not return the removed element.
// Remove all elements from this list whose index is in between
protected void removeRange(int fromIndex, int toIndex) {
modCount++;
int numMoved = size - toIndex;
System.arraycopy(elementData, toIndex, elementData, fromIndex, numMoved);
int newSize = size - (toIndex-fromIndex);
for (int i = newSize; i < size; i++) {
elementData[i] = null;
}
size = newSize;
}
Copy the code
This is done by moving the element elementData from the toIndex position forward to fromIndex, then leaving all elements after the toIndex position empty and changing the size.
This method is protected, and this method is protected. Why is this method defined as protected? Let’s look at this example
ArrayList<Integer> ints = new ArrayList<Integer>(Arrays.asList(0.1.2.3.4.5.6));
ints.subList(2.4).clear();
System.out.println(ints);
Copy the code
RemoveRange (int fromIndex,int toIndex)
6) Adjust array size ensureCapacity:
As you can see from the code that stores elements to the ArrayList above, every time you add an element to the array, you check to see if the number of added elements will exceed the size of the current array. If so, the array will be expanded to accommodate the need to add data. Array capacity expansion is implemented using a public method, ensureCapacity(int minCapacity). I can also use ensureCapacity to manually increase the capacity of an ArrayList instance to reduce the amount of incremental reallocation before actually adding a large number of elements.
public void ensureCapacity(int minCapacity) {
modCount++;
int oldCapacity = elementData.length;
if (minCapacity > oldCapacity) {
Object oldData[] = elementData;
int newCapacity = (oldCapacity * 3) / 2 + 1; A 50% increase in / / + 1
if (newCapacity < minCapacity)
newCapacity = minCapacity;
// Usually close to sizeelementData = Arrays.copyOf(elementData, newCapacity); }}Copy the code
As you can see from the above code, when the array is expanded, the elements of the old array are copied into the new array, each time the array grows by about 1.5 times its original capacity. This operation is expensive, so in practice, we should avoid expanding the array capacity. When we know how many elements we want to store, we specify the capacity when constructing an ArrayList instance to avoid array expansion. Or manually increase the capacity of the ArrayList instance by calling the ensureCapacity method, depending on actual needs.
// Why
oldData[]Object oldData[] = elementData;
Copy the code
At first glance, oldData is not used in the following sentence, which seems superfluous! But this is a class that involves memory management, so be aware of the internal issues. ElementData = array.copyof (elementData, newCapacity); The implementation of Array.copyof creates newCapacity memory and puts the old elementData in. It seems that oldData is not used either, what’s wrong with it? The problem is that the reference to the old memory is elementData, which refers to the new memory block. If there is a local oldData variable that refers to the old memory block, it is safer to copy the old memory because it proves that the old memory still has a reference. When allocating memory, it will not be invaded, and then after the copy is done, the lifetime of the local variable will be over, and then it will be safe to release. Otherwise, it would be a serious matter if new memory or other thread allocations encroach on the old memory while the copy is still running.
ArrayList also gives us the ability to adjust the capacity of the underlying array to the size of the actual elements held by the current list. It can be implemented using the trimToSize method. The code is as follows:
// Trim the capacity of this ArrayList instance to the current size of the list. You can use this action to minimize the storage space of an ArrayList instance.
public void trimToSize(a) {
modCount++;
if (size < elementData.length) {
elementData = (size == 0)? EMPTY_ELEMENTDATA : Arrays.copyOf(elementData, size); }}Copy the code
Since the length of elementData can be expanded, size refers to the number of elements it contains. Therefore, the size is very small but elementData.length is very large, and the space will be wasted. TrimToSize will return a new array to elementData with the same element content, length and size, saving space.
7) Convert toArray to static array
4. Notice the two toArray methods of ArrayList that convert to static arrays.
First, a call to array.copyof returns an array of size elements of elementData
ElementData elements from position 0 to size-1 into the new array and return.
// Returns an array containing all the elements of this list in the appropriate order.
public Object[] toArray() { returnArrays.copyOf(elementData, size); }Copy the code
Second, if the length of the passed array is less than size, return a new array of size and the same type as the passed array. ElementData is copied into the passed array whose length is equal to size and the passed array is returned. If the array is larger than size, in addition to copying elementData, the size-th element of the returned array is left blank.
/* Return an array containing all the elements of this list in the appropriate order; The runtime type of the returned array is the type of the specified array. Returns the list in the specified array if it fits. Otherwise, a new array is allocated using the specified array's runtime type and the size of this list. * /
public <T> T[] toArray(T[] a) {
if (a.length < size)
// Create a new array of run-time type
return (T[]) Arrays.copyOf(elementData, size, a.getClass());
System.arraycopy(elementData, 0, a, 0, size);
if (a.length > size)
a[size] = null;
return a;
}
Copy the code
Fail – Fast mechanism:
ArrayList also uses a fail-fast mechanism by logging modCount parameters. In the face of concurrent changes, iterators will soon fail completely, rather than risk arbitrary uncertain behavior at some unspecified time in the future
4, summarize
A few important points to make about the ArrayList source code:
-
Notice the three different constructors. An ArrayList constructed with no arguments has a default capacity of 10. A constructor with a Collection argument converts the Collection into an array assigned to the ArrayList’s implementation array elementData.
-
Note the method for scaling capacity, Enrecapacity. This method is called every time an ArrayList adds an element (which could be one or a group of elements) to ensure sufficient capacity. When the new capacity is not enough to hold the current number of elements, set the new capacity to 1.5 times plus 1 of the old capacity. If the new capacity is not enough, set the new capacity to the passed parameter (that is, the required capacity) and copy the elements into the new array using the array.copyof () method. As you can see, when the capacity is insufficient, it is very time-consuming to copy the original element into a new array every time you add an element. Therefore, it is recommended to use ArrayList only if the number of elements can be determined in advance. Otherwise, it is recommended to use LinkedList.
-
ArrayList’s implementation makes extensive calls to the array.copyof () and System.arrayCopy() methods. It is necessary to take a closer look at the implementation of these two methods.
Arrays.copyof() : arrays.copyof () : arrays.copyof () : arrays.copyof () : arrays.copyof () : arrays.copyof () : arrays.copyof () :
// Copies the specified array, truncating or populating it with null values (if necessary) to make the copy the specified length. For all indexes that are valid in both the original array and the replica, the two arrays will contain the same value. The replica will contain NULL for any index that is valid in the replica but not the original index. This type of index exists if and only if the specified length is greater than the length of the original array. // The generated array has the same class as the original array. public static <T> T[] copyOf(T[] original, int newLength) { return (T[]) copyOf(original, newLength, original.getClass()); } Copy the code
Another copyOf method is obviously called. This method takes three arguments, the last of which specifies the type of data to convert. The source code is as follows:
/ / newType class public static <T,U> T[] copyOf(U[] original, int newLength, Class<? extends T[]> newType) { @SuppressWarnings("unchecked") T[] copy = ((Object)newType == (Object)Object[].class) ? (T[]) new Object[newLength] : (T[]) Array.newInstance(newType.getComponentType(), newLength); System.arraycopy(original, 0, copy, 0, Math.min(original.length, newLength)); return copy; } Copy the code
It is obvious here that the method is actually creating an array of length newLength inside it, called
The system.arrayCopy () method copies elements from the original array to the new one.
Now look at the system.arrayCopy () method. This method is tagged native and calls the system’s C/C++ code. It is not seen in the JDK, but the source code is available in the openJDK. This function actually finally calls the MEMmove () function of C language, so it can guarantee the correct copy and move of elements in the same array, which is much more efficient than the general copy method, and is suitable for batch processing of arrays. Java strongly recommends using this method when copying a large number of array elements for greater efficiency.
-
ArrayList is implemented based on arrays. It can directly find elements at specified locations through subscript indexes, so the search efficiency is high. However, each time an element is inserted or deleted, a large number of elements are moved, and the insertion and deletion efficiency is low.
-
In methods such as searching for a given element index value, the source code divides the value of the element into null and non-null. ArrayList allows elements to be null.