To introduce the ArrayList < T > class

ArrayList is a container class that can be used as if it is infinite. You can add, get, and remove it constantly. In fact, its internal implementation is based on arrays, and this article introduces its internal principles. After understanding the principle, we can configure it according to the actual situation when using, so that it has better performance and less memory footprint.

ArrayList class main field

    private static final long serialVersionUID = 8683452581122892189L;

    /**
     * Default initial capacity.
     */
    private static final int DEFAULT_CAPACITY = 10;

    /**
     * Shared empty array instance used for empty instances.
     */
    private static final Object[] EMPTY_ELEMENTDATA = {};

    /**
     * Shared empty array instance used for default sized empty instances. We
     * distinguish this from EMPTY_ELEMENTDATA to know how much to inflate when
     * first element is added.
     */
    private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

    /**
     * The array buffer into whichthe elements of the ArrayList are stored. * The capacity of the ArrayList is the length of this array buffer. Any * empty ArrayList with elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA * will be expanded to DEFAULT_CAPACITY when the first element is added. */ transient Object[] elementData; // non-private to simplify nested class access /** * The size of the ArrayList (the number of elements it contains). * *  @serial */ private int size; /** * The maximum size of array to allocate. * Some VMs reserve some header wordsin an array.
     * Attempts to allocate larger arrays may result in
     * OutOfMemoryError: Requested array size exceeds VM limit
     */
    private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;
Copy the code

From this field structure, we get the following information:

  • The ArrayList internally uses Object[] elementData to store the data we add.
  • The size field records the number of data stored in the current List.
  • The DEFAULT_CAPACITY field marks the length of the array allocated when the ArrayList is initialized by default.
  • EMPTY_ELEMENTDATA and DEFAULTCAPACITY_EMPTY_ELEMENTDATA are two empty arrays defined. Why use two? Here’s why.

ArrayList related method parsing

Let’s take a closer look at the source code in the same way we often do with ArrayList.

The constructor

    /**
     * Constructs an empty list with the specified initial capacity.
     *
     * @param  initialCapacity  the initial capacity of the list
     * @throws IllegalArgumentException if the specified initial capacity
     *         is negative
     */
    public ArrayList(int initialCapacity) {
        if (initialCapacity > 0) {
            this.elementData = new Object[initialCapacity];
        } else if (initialCapacity == 0) {
            this.elementData = EMPTY_ELEMENTDATA;
        } else {
            throw new IllegalArgumentException("Illegal Capacity: "+
                                               initialCapacity);
        }
    }

    /**
     * Constructs an empty list with an initial capacity of ten.
     */
    public ArrayList() {
        this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
    }

    /**
     * Constructs a list containing the elements of the specified
     * collection, in the order they are returned by the collection's * iterator. * * @param c the collection whose elements are to be placed into this list * @throws NullPointerException  if the specified collection is null'
     */
    public ArrayList(Collection<? extends E> c) {
        elementData = c.toArray();
        if((size = elementData.length) ! = 0) { // c.toArray might (incorrectly) notreturn Object[] (see 6260652)
            if(elementData.getClass() ! = Object[].class) elementData = Arrays.copyOf(elementData, size, Object[].class); }else{ // replace with empty array. this.elementData = EMPTY_ELEMENTDATA; }}Copy the code

There are three ArrayList constructors:

  • Our most common new ArrayList() simply points elementData to the default empty array (DEFAULTCAPACITY_EMPTY_ELEMENTDATA).
  • InitialCapacity ==0; initialCapacity==0; initialCapacity==0; initialCapacity==0; ElementData simply points to EMPTY_ELEMENTDATA.
  • The final constructor uses the Collection Collection to initialize the ArrayList. First convert the Collection to an array, and then perform different operations depending on size. If the converted array is not of type Object, create a new size array. Let’s take a step-by-step look at the source code of ArrayList in a common way.

The add method

    /**
     * Appends the specified element to the end of this list.
     *
     * @param e element to be appended to this list
     * @return <tt>true</tt> (as specified by {@link Collection#add})
     */
    public boolean add(E e) {
        ensureCapacityInternal(size + 1);  // Increments modCount!!
        elementData[size++] = e;
        return true;
    }

    /**
     * Inserts the specified element at the specified position in this
     * list. Shifts the element currently at that position (if any) and
     * any subsequent elements to the right (adds one to their indices).
     *
     * @param index index at which the specified element is to be inserted
     * @param element element to be inserted
     * @throws IndexOutOfBoundsException {@inheritDoc}
     */
    public void add(int index, E element) {
        rangeCheckForAdd(index);

        ensureCapacityInternal(size + 1);  // Increments modCount!!
        System.arraycopy(elementData, index, elementData, index + 1,
                         size - index);
        elementData[index] = element;
        size++;
    }
Copy the code

As we saw above, the add(E E) method adds elements directly to the end of the array. The add(int index,E element) method adds the element to the specified index position, but of course the element after the original index needs to be moved one place back. We can see the enrecapacityInternal method in the source code above. Keep looking at the source code:

    private void ensureCapacityInternal(int minCapacity) {
        if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
            minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
        }
        ensureExplicitCapacity(minCapacity);
    }

    private void ensureExplicitCapacity(int minCapacity) {
        modCount++;

        // overflow-conscious code
        if (minCapacity - elementData.length > 0)
            grow(minCapacity);
    }

    /**
     * Increases the capacity to ensure that it can hold at least the
     * number of elements specified by the minimum capacity argument.
     *
     * @param minCapacity the desired minimum capacity
     */
    private void grow(int minCapacity) {
        // overflow-conscious code
        int oldCapacity = elementData.length;
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        if (newCapacity - minCapacity < 0)
            newCapacity = minCapacity;
        if (newCapacity - MAX_ARRAY_SIZE > 0)
            newCapacity = hugeCapacity(minCapacity);
        // minCapacity is usually close to size, so this is a win:
        elementData = Arrays.copyOf(elementData, newCapacity);
    }

    private static int hugeCapacity(int minCapacity) {
        if (minCapacity < 0) // overflow
            throw new OutOfMemoryError();
        return (minCapacity > MAX_ARRAY_SIZE) ?
            Integer.MAX_VALUE :
            MAX_ARRAY_SIZE;
    }
Copy the code

Case 1: Let’s look at the Enrecapact Internal method first. Step 1: If elementData points to DEFAULTCAPACITY_EMPTY_ELEMENTDATA, expand the array to DEFAULT_CAPACITY(10). From this step we know that the condition elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA is only true if we use new ArrayList(). Step 2: According to the process ensureCapacityInternal–>ensureExplicitCapacity–>grow, we know that ArrayList creates an array of length 10 in this case. Case 2: When analyzing the ArrayList constructor above, elementData also points to the EMPTY_ELEMENTDATA empty array. Only new ArrayList(0) or New ArrayList(empty collection) will be true. In this case, according to the ensureCapacityInternal–>ensureExplicitCapacity–>grow process, the allocated array is small, which is a conservative memory allocation strategy. Summary: From the above analysis, we know the different uses of DEFAULTCAPACITY_EMPTY_ELEMENTDATA and EMPTY_ELEMENTDATA. The former creates an array of 10 elements by default, and then expands from there. The latter is a conservative memory allocation policy, and data expansion is slow.

Grow method parsing

The grow method is the core of the ArrayList expansion.

    /**
     * Increases the capacity to ensure that it can hold at least the
     * number of elements specified by the minimum capacity argument.
     *
     * @param minCapacity the desired minimum capacity
     */
    private void grow(int minCapacity) {
        // overflow-conscious code
        int oldCapacity = elementData.length;
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        if (newCapacity - minCapacity < 0)
            newCapacity = minCapacity;
        if (newCapacity - MAX_ARRAY_SIZE > 0)
            newCapacity = hugeCapacity(minCapacity);
        // minCapacity is usually close to size, so this is a win:
        elementData = Arrays.copyOf(elementData, newCapacity);
    }

    private static int hugeCapacity(int minCapacity) {
        if (minCapacity < 0) // overflow
            throw new OutOfMemoryError();
        return (minCapacity > MAX_ARRAY_SIZE) ?
            Integer.MAX_VALUE :
            MAX_ARRAY_SIZE;
    }
Copy the code

We see that ArrayList defaults to 1.5 times its original size. The following logic is to judge some boundary cases in turn:

  • If newCapacity is less than minCapacity, resize the array to minCapacity.
  • If minCapacity is between MAX_ARRAY_SIZE and integer. MAX_VALUE, the new array allocates integer. MAX_VALUE, otherwise MAX_ARRAY_SIZE is allocated.

The contains, indexOf lastIndexOf related retrieval method

    /**
     * Returns <tt>true</tt> if this list contains the specified element.
     * More formally, returns <tt>true</tt> if and only ifthis list contains * at least one element <tt>e</tt> such that * <tt>(o==null&nbsp; ? &nbsp; e==null&nbsp; :&nbsp; o.equals(e))</tt>. * * @param o element whose presencein this list is to be tested
     * @return <tt>true</tt> if this list contains the specified element
     */
    public boolean contains(Object o) {
        return indexOf(o) >= 0;
    }

    /**
     * Returns the index of the first occurrence of the specified element
     * in this list, or -1 ifthis list does not contain the element. * More formally, returns the lowest index <tt>i</tt> such that * <tt>(o==null&nbsp; ? &nbsp; get(i)==null&nbsp; :&nbsp; o.equals(get(i)))</tt>, * or -1if there is no such index.
     */
    public int indexOf(Object o) {
        if (o == null) {
            for (int i = 0; i < size; i++)
                if (elementData[i]==null)
                    return i;
        } else {
            for (int i = 0; i < size; i++)
                if (o.equals(elementData[i]))
                    return i;
        }
        return- 1; } /** * Returns the index of the last occurrence of the specified element *in this list, or -1 ifthis list does not contain the element. * More formally, returns the highest index <tt>i</tt> such that * <tt>(o==null&nbsp; ? &nbsp; get(i)==null&nbsp; :&nbsp; o.equals(get(i)))</tt>, * or -1if there is no such index.
     */
    public int lastIndexOf(Object o) {
        if (o == null) {
            for (int i = size-1; i >= 0; i--)
                if (elementData[i]==null)
                    return i;
        } else {
            for (int i = size-1; i >= 0; i--)
                if (o.equals(elementData[i]))
                    return i;
        }
        return- 1; }Copy the code

The logic of these methods is pretty clear, just loop through and find the elements that meet the criteria.

ToArray method

This is a method that we’ll use sometimes, which is to convert an ArrayList into an array. Let’s look at the source code below:

public Object[] toArray() {
    return Arrays.copyOf(elementData, size);
}

@SuppressWarnings("unchecked")
public <T> T[] toArray(T[] a) {
    if (a.length < size)
        // Make a new array of a's runtime type, but my contents:'
        return (T[]) Arrays.copyOf(elementData, size, a.getClass());
    System.arraycopy(elementData, 0, a, 0, size);
    if (a.length > size)
        a[size] = null;
    return a;
}
Copy the code

ArrayList copies the data from the elementData array to the new array using the array. copyOf and System. arrayCopy methods, and then returns it. The System. Arraycopy methods are very efficient and use C/C++ internally (assembly is used for setup), so we should use these methods when we are developing arrays.

Get/set methods

@SuppressWarnings("unchecked")
E elementData(int index) {
    return (E) elementData[index];
}

/**
 * Returns the element at the specified position in this list.
 *
 * @param  index index of the element to return
 * @return the element at the specified position in this list
 * @throws IndexOutOfBoundsException {@inheritDoc}
 */
public E get(int index) {
    rangeCheck(index);

    return elementData(index);
}

/**
 * Replaces the element at the specified position in this list with
 * the specified element.
 *
 * @param index index of the element to replace
 * @param element element to be stored at the specified position
 * @return the element previously at the specified position
 * @throws IndexOutOfBoundsException {@inheritDoc}
 */
public E set(int index, E element) {
    rangeCheck(index);

    E oldValue = elementData(index);
    elementData[index] = element;
    return oldValue;
}
Copy the code

The logic is relatively simple and will not be described in detail.

The remove method

    /**
     * Removes the element at the specified position inthis list. * Shifts any subsequent elements to the left (subtracts one from their * indices). * * @param index the index  of the element to be removed * @return the element that was removed from the list
     * @throws IndexOutOfBoundsException {@inheritDoc}
     */
    public E remove(int index) {
        rangeCheck(index);

        modCount++;
        E oldValue = elementData(index);

        int numMoved = size - index - 1;
        if (numMoved > 0)
            System.arraycopy(elementData, index+1, elementData, index,
                             numMoved);
        elementData[--size] = null; // clear to let GC do its work

        return oldValue;
    }

    /**
     * Removes the first occurrence of the specified element from this list,
     * ifit is present. If the list does not contain the element, it is * unchanged. More formally, removes the element with the lowest index * <tt>i</tt> such that * <tt>(o==null&nbsp; ? &nbsp; get(i)==null&nbsp; :&nbsp; o.equals(get(i)))</tt> * (if such an element exists).  Returns <tt>true</tt> if this list
     * contained the specified element (or equivalently, if this list
     * changed as a result of the call).
     *
     * @param o element to be removed from this list, if present
     * @return <tt>true</tt> if this list contained the specified element
     */
    public boolean remove(Object o) {
        if (o == null) {
            for (int index = 0; index < size; index++)
                if (elementData[index] == null) {
                    fastRemove(index);
                    return true; }}else {
            for (int index = 0; index < size; index++)
                if (o.equals(elementData[index])) {
                    fastRemove(index);
                    return true; }}return false;
    }

    /*
     * Private remove method that skips bounds checking and does not
     * return the value removed.
     */
    private void fastRemove(int index) {
        modCount++;
        int numMoved = size - index - 1;
        if (numMoved > 0)
            System.arraycopy(elementData, index+1, elementData, index,
                             numMoved);
        elementData[--size] = null; // clear to let GC do its work
    }
Copy the code

The logic is fairly clear, except that after remove(index)/ or remove(object), you need to call System. arrayCopy to efficiently move the array behind index so that it can fill the position.

RemoveAll/retainAll method

Some friends may not have used these two methods, let’s see what these two methods are, see the code:

ArrayList<String> list=new ArrayList<>();
list.add("A");
list.add("B");
list.add("C");
list.add("D");
list.add("E");
list.add("F");

ArrayList<String> list1=new ArrayList<>();
list1.add("C");
list1.add("D"); //list.removeAll(list1); //[A, B, E, F] System.out.println(list); list.retainAll(list1); / / / C, D System. Out. The println (list);Copy the code

From the results, we can see that the removeAll method calculates the difference set of two sets, while retainAll calculates the intersection of two sets. Here we use the source code to analyze:

public boolean removeAll(Collection<? > c) { Objects.requireNonNull(c);return batchRemove(c, false); } public boolean retainAll(Collection<? > c) { Objects.requireNonNull(c);return batchRemove(c, true); } private boolean batchRemove(Collection<? > c, boolean complement) { final Object[] elementData = this.elementData; int r = 0, w = 0; boolean modified =false;
    try {
        for (; r < size; r++)
            if (c.contains(elementData[r]) == complement)
                elementData[w++] = elementData[r];
    } finally {
        // Preserve behavioral compatibility with AbstractCollection,
        // even if c.contains() throws.
        if(r ! = size) { System.arraycopy(elementData, r, elementData, w, size - r); w += size - r; }if(w ! = size) { // clear tolet GC do its work
            for (int i = w; i < size; i++)
                elementData[i] = null;
            modCount += size - w;
            size = w;
            modified = true; }}return modified;
}
Copy the code

As you can see, the design here is a little bit more subtle, with a Boolean value, combining the methods of taking the difference and taking the intersection into one method. BatchRemove method code design implementation is still very good, the core logic is try statement block for loop, finally statement block is mainly data copy and special value processing.

The iterator

Java collections are designed to support iterators. Let’s look at the relevant parts of the iterator in the ArrayList.

Gets the iterator object

/**
 * Returns an iterator over the elements in this list in proper sequence.
 *
 * <p>The returned iterator is <a href="#fail-fast"><i>fail-fast</i></a>.
 *
 * @return an iterator over the elements in this list in proper sequence
 */
public Iterator<E> iterator() {
    return new Itr();
}
Copy the code

If you are familiar with Java collection hierarchies (class inheritance structures), you may know that AbstractList, the base class of ArrayList, contains an inner Itr. Now ArrayList has re-implemented an optimized version of the Itr class. Let’s look at the source code:

    /**
     * An optimized version of AbstractList.Itr
     */
    private class Itr implements Iterator<E> {
        int cursor;       // index of next element to returnint lastRet = -1; // index of last element returned; - 1if no such
        int expectedModCount = modCount;

        public boolean hasNext() {
            returncursor ! = size; } @SuppressWarnings("unchecked")
        public E next() {
            checkForComodification();
            int i = cursor;
            if (i >= size)
                throw new NoSuchElementException();
            Object[] elementData = ArrayList.this.elementData;
            if (i >= elementData.length)
                throw new ConcurrentModificationException();
            cursor = i + 1;
            return (E) elementData[lastRet = i];
        }

        public void remove() {
            if (lastRet < 0)
                throw new IllegalStateException();
            checkForComodification();

            try {
                ArrayList.this.remove(lastRet);
                cursor = lastRet;
                lastRet = -1;
                expectedModCount = modCount;
            } catch (IndexOutOfBoundsException ex) {
                throw new ConcurrentModificationException();
            }
        }

        @Override
        @SuppressWarnings("unchecked")
        public void forEachRemaining(Consumer<? super E> consumer) {
            Objects.requireNonNull(consumer);
            final int size = ArrayList.this.size;
            int i = cursor;
            if (i >= size) {
                return;
            }
            final Object[] elementData = ArrayList.this.elementData;
            if (i >= elementData.length) {
                throw new ConcurrentModificationException();
            }
            while(i ! = size && modCount == expectedModCount) { consumer.accept((E) elementData[i++]); } // update once at end of iteration to reduce heap write traffic cursor = i; lastRet = i - 1; checkForComodification(); } final voidcheckForComodification() {
            if (modCount != expectedModCount)
                throw new ConcurrentModificationException();
        }
    }
Copy the code

From the source code above, iterators provide only basic backward traversal, deletion, and so on. This allows us to iterate through the ArrayList using iterators (of course, for(E E: elements) is automatically translated into iterator calls at compile time).

ListItr iterator

Moving on to the source code of ArrayList, we can see that The iterator of ListItr is also implemented inside ArrayList. In addition to providing backward traversal, this iterator also provides forward traversal, increment, set, and so on. Is a full-featured iterator implementation. Let’s look at the source code:

    /**
     * An optimized version of AbstractList.ListItr
     */
    private class ListItr extends Itr implements ListIterator<E> {
        ListItr(int index) {
            super();
            cursor = index;
        }

        public boolean hasPrevious() {
            returncursor ! = 0; } public intnextIndex() {
            return cursor;
        }

        public int previousIndex() {
            return cursor - 1;
        }

        @SuppressWarnings("unchecked")
        public E previous() {
            checkForComodification();
            int i = cursor - 1;
            if (i < 0)
                throw new NoSuchElementException();
            Object[] elementData = ArrayList.this.elementData;
            if (i >= elementData.length)
                throw new ConcurrentModificationException();
            cursor = i;
            return (E) elementData[lastRet = i];
        }

        public void set(E e) {
            if(lastRet < 0) throw new IllegalStateException(); checkForComodification(); try { ArrayList.this.set(lastRet, e); } catch (IndexOutOfBoundsException ex) { throw new ConcurrentModificationException(); } } public void add(E e) { checkForComodification(); try { int i = cursor; ArrayList.this.add(i, e); cursor = i + 1; lastRet = -1; expectedModCount = modCount; } catch (IndexOutOfBoundsException ex) { throw new ConcurrentModificationException(); }}}Copy the code

We can see that ListItr is an iterator where many of the operations are directly called methods of the ArrayList class. It’s just a wrapper.

ConcurrentModificationException unusual reason and solution

Some of you may have encountered this exception when iterating through an ArrayList collection, and this exception is caused by a quick failure mechanism designed by the ArrayList class, which says that the collection has been modified while iterating through. This exception occurs in the following example:

ArrayList<String> list=new ArrayList<>();
list.add("A");
list.add("B");
list.add("C");
list.add("D");
list.add("E");
list.add("F");

for(String item : list) { System.out.println(item); list.remove(item); } // or /* Iterator<String> Iterator =list.iterator();while (iterator.hasNext()) {
	String string = (String) iterator.next();
	System.out.println(string);
	list.remove(string);
}
*/

Exception in thread "main" java.util.ConcurrentModificationException
	at java.util.ArrayList$Itr.checkForComodification(Unknown Source)
	at java.util.ArrayList$Itr.next(Unknown Source)
	at com.learn.example.RunMain.main(RunMain.java:42)
Copy the code

The following two traversal methods can cause exceptions, and let’s see why. The first foreach loop is an iterator when compiled. Let’s go straight to the iterator.

@SuppressWarnings("unchecked")
public E next() {
    checkForComodification();
    int i = cursor;
    if (i >= size)
        throw new NoSuchElementException();
    Object[] elementData = ArrayList.this.elementData;
    if (i >= elementData.length)
        throw new ConcurrentModificationException();
    cursor = i + 1;
    return (E) elementData[lastRet = i];
}

final void checkForComodification() {
    if(modCount ! = expectedModCount) throw new ConcurrentModificationException(); }Copy the code

We see that the first line of the next() method calls the checkForComodification() method, and we see that the exception is thrown if modCount and expectedModCount are not equal. When we look at the Itr source code above, we see that the initial assignment of expectedModCount is the modCount variable in the ArrayList class. Remove (o) will have modCount++ in it. The source code is as follows:

/*
 * Private remove method that skips bounds checking and does not
 * return the value removed.
 */
private void fastRemove(int index) {
    modCount++;
    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, elementData, index,
                         numMoved);
    elementData[--size] = null; // clear to let GC do its work
}
Copy the code

This causes the expectedModCount inside the Itr to be inconsistent with the modCount of the ArrayList, thus throwing the exception. So how do we solve this? Just leave modCount unincremented and synchronize its value with expectedModCount. There is also a remove method inside the Itr, which we can call.

Iterator<String> iterator=list.iterator();
while (iterator.hasNext()) {
	String string = (String) iterator.next();
	System.out.println(string);
	iterator.remove();
}
Copy the code

conclusion

  • An ArrayList contains an array of Object[], which is then dynamically expanded when used.
  • An ArrayList allocates an array of 10 elements by default, and then expands it, each time by 1.5 times the size of the original array. If you have a rough idea of the capacity of the collection, you can specify an initial value to reduce the performance cost of capacity expansion.
  • Arraylists are suitable for scenarios where there are many queries but few operations (because after the operation, the array must be moved in most cases).
  • ArrayList collections are not thread-safe, requiring locking or using concurrency-safe containers in multithreaded environments.