"Must know must be" the most detailed analysis of the ArrayList principle

This is the first day of my participation in the August Challenge. For details, see:August is more challenging

From today also officially open JDK principle analysis pit, in fact, the purpose of writing source analysis is no longer as before to understand the principle, more important is to see their coding style to further experience their design ideas. Look at the source code before their own implementation of a rematch may have different harvest!

1. Summary of structure

First of all, we need to get a sense of what an ArrayList looks like in terms of its structure.

1. The inheritance

This class inherits from AbstractList

2. Implement

This class implements many interfaces, as follows:

First of all, this class is a List and naturally there’s a List interface
Then because this class needs RandomAccess, the so-called RandomAccess is any access with a subscript, so the implementation of RandomAccess
And then there are the two interfaces that the collections framework will definitely implement Cloneable, Serializable and we’ll talk more about serialization in a second

3. Main fields

    // The default size is 10
    private static final int DEFAULT_CAPACITY = 10;
    / / an empty array
    private static final Object[] EMPTY_ELEMENTDATA = {};
    // The default is an empty array. This is what the constructor will call later in the add method
    private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};
    // To store elements in the ArrayList. Note that the modifier is a transient
    transient Object[] elementData; 
    / / size
    private int size;
Copy the code

4. Main methods

The following methods are numbered to indicate overloaded methods

ctor-3
get
set
add-2
remove-2
clear
addAll
write/readObject
Fast – fail mechanism
subList
iterator
forEach
sort
removeIf

2. Analysis of construction methods

1. Constructors with no parameters

The only operation is to set elementData to DEFAULTCAPACITY_EMPTY_ELEMENTDATA, which is an empty array.

// The constructor that takes no arguments, passing an empty array, will create an array of size 10, as shown in add
    public ArrayList(a) {
        this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
    }
Copy the code

2. Pass in the construction of the array size

So this is new an array, and if the array size is zero, it’s assigned EMPTY_ELEMENTDATA

// Create a new underlying array with the arguments passed in
    public ArrayList(int initialCapacity) {
        if (initialCapacity > 0) {
            this.elementData = new Object[initialCapacity];
        } else if (initialCapacity == 0) {
            this.elementData = EMPTY_ELEMENTDATA;
        } else {
            throw new IllegalArgumentException("Illegal Capacity: "+ initialCapacity); }}Copy the code

3. Pass the Collection interface

If the size of the interface is 0, then pass EMPTY_ELEMENTDATA as shown in the above method

public ArrayList(Collection<? extends E> c) {
        elementData = c.toArray();
        if((size = elementData.length) ! =0) {
            // c.toArray might (incorrectly) not return Object[] (see 6260652)
            // There is a bug in the JDK: an array of type Object does not necessarily hold an Object of type Object, and may throw an exception
            // An array of type Object might point to an array of its subclasses
            if(elementData.getClass() ! = Object[].class)// This operation is to first new the array and then call system.arrayCopy to copy the value. That's creating a new array
                elementData = Arrays.copyOf(elementData, size, Object[].class);
        } else {
            // If an empty array is passed in, initialize with an empty array
            this.elementData = EMPTY_ELEMENTDATA; }}Copy the code

But notice that there’s a JDK bug here which is that an array of type Object doesn’t necessarily hold an Object of type Object, it throws an exception, mainly because an array of type Object might point to an array of its subclass, An error will be reported for storing objects of type Object. To test this bug, I wrote a few lines of code to test it out. This test is a failure, and that’s why it exists.

A typical example of this would be if we were to create a list of strings and then call toArray and find that it returns a string[] and of course we can’t store any elements anywhere.

class A{}class B extends A {}public class JDKBug {

    @Test
    public void test1(a) {
        B[] arrB = new B[10];
        A[] arrA = arrB;
        arrA[0] =newA(); }}Copy the code

3. Analysis of modification methods

1. Set method

This method is also very simple, first to determine the range, and then directly update the subscript.

    // Set the new value to return the old value
    public E set(int index, E element) {
        rangeCheck(index);

        E oldValue = elementData(index);
        elementData[index] = element;
        return oldValue;
    }
Copy the code

2. Add(E E) method

So this method first calls ensureCapacityInternal() which determines if the current elementData is equal to DEFAULTCAPACITY_EMPTY_ELEMENTDATA and if so, I’m going to set the size of the array to 10 and then I’m going to expand it, and this just explains why the List with no parameters is 10, and the method I’m going to expand it is called ensureExplicitCapacity and one of the things I’m going to do is expand it if the user specifies a size larger than the current size, Arraycopy method is implemented by creating a new array, then calling System. arrayCopy to copy the array, and finally returning the new array.

    public boolean add(E e) {
        // When calling no-argument constructs, set the size to 10
        ensureCapacityInternal(size + 1);  // Increments modCount        
        elementData[size++] = e;
        return true;
    }

    private void ensureCapacityInternal(int minCapacity) {
        // If the current array is empty by default, set it to the minimum value in 10 and size+1
        // The size of the new array is 10
        if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
            minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
        }

        ensureExplicitCapacity(minCapacity);
    }

    private void ensureExplicitCapacity(int minCapacity) {
        modCount++;

        // If the specified minimum capacity is greater than the specified minimum capacity, the specified minimum capacity prevails. Otherwise, the value is 10
        if (minCapacity - elementData.length > 0)
            grow(minCapacity);
    }
    
    private void grow(int minCapacity) {
        // overflow-conscious code
        int oldCapacity = elementData.length;
        // 1.5 times increase
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        if (newCapacity - minCapacity < 0)
            newCapacity = minCapacity;
        if (newCapacity - MAX_ARRAY_SIZE > 0)
            newCapacity = hugeCapacity(minCapacity);
        // minCapacity is usually close to size, so this is a win:
        elementData = Arrays.copyOf(elementData, newCapacity);
    }
Copy the code

3. Add(int index, E E

This method is relatively simple. It’s basically the same as above, except that when you put the element in the end, you do a different thing. Instead, you use System.arrayCopy to copy from itself to itself, in order to overwrite the element. Notice the rule that most of the operations that involve subscripts are not handwritten for loops but copying and overwriting things like that. It’s kind of a trick.

public void add(int index, E element) {
        rangeCheckForAdd(index);
        ensureCapacityInternal(size + 1);  // Increments modCount
        / / cover
        System.arraycopy(elementData, index, elementData, index + 1,
                         size - index);
        elementData[index] = element;
        size++;
    }
Copy the code

4. remove(int index)

Again, there’s a copy-overwrite technique here. One thing to note, however, is that unused nodes need to manually trigger the GC, which is an example the authors use in Efftive Java.

public E remove(int index) {
        rangeCheck(index);
        modCount++;
        E oldValue = elementData(index);
        int numMoved = size - index - 1;
        / / cover
        if (numMoved > 0)
            System.arraycopy(elementData, index+1, elementData, index,
                             numMoved);
        elementData[--size] = null; // clear to let GC do its work
        return oldValue;
    }

Copy the code

5. remove(E e)

So this is obviously going to determine if e is null. If it’s null, it’s going to compare it to equals. Otherwise, it’s going to call equals and do the copy override.

public boolean remove(Object o) {
        if (o == null) {
            for (int index = 0; index < size; index++)
                if (elementData[index] == null) {
                    / / cover
                    fastRemove(index);
                    return true; }}else {
            for (int index = 0; index < size; index++)
                // Call equals
                if (o.equals(elementData[index])) {
                    fastRemove(index);
                    return true; }}return false;
    }
Copy the code

6. clear()

One thing this method does is set all the references in the array to NULL for GC purposes. Instead of just setting size to 0.

// Gc all nodes
    public void clear(a) {
        modCount++;
        // clear to let GC do its work
        for (int i = 0; i < size; i++)
            elementData[i] = null;
        size = 0;
    }
Copy the code

7. addAll(Collection e)

There’s nothing to say about this except that I’m going to turn the array and copy it

    // All methods that involve the Collection interface convert the interface to an array and operate on the array
    public boolean addAll(Collection<? extends E> c) {
        Object[] a = c.toArray();
        int numNew = a.length;
        ensureCapacityInternal(size + numNew);  // Increments modCount
        System.arraycopy(a, 0, elementData, size, numNew);
        size += numNew;
        returnnumNew ! =0;
    }
Copy the code

4. Access method analysis

1. get

Access the array index directly.

    // There's nothing to say
    public E get(int index) {
        rangeCheck(index);
        return elementData(index);
    }
    
Copy the code

2. subList

The interesting implementation of this method is that instead of intercepting a new List and returning it, it has an inner class of subList inside the class, which then records the beginning and end subscripts of subList, and returns the subList object. So you might be wondering if the subList that’s returned is not a List, but the subList here is inherited from AbstractList so it’s still correct.

public List<E> subList(int fromIndex, int toIndex) {
        subListRangeCheck(fromIndex, toIndex, size);
        return new SubList(this.0, fromIndex, toIndex);
    }
    // subList returns an instance of a position tag, that is, a number of flags are placed on the original array, without modifying or copying the new space
private class SubList extends AbstractList<E> implements RandomAccess {
        private final AbstractList<E> parent;
        private final int parentOffset;
        private final int offset;
        int size;
        // other functions .....
     }

Copy the code

5. Tools

1. write/readObject

I noted earlier when I introduced the data field that elementData is a transition variable, which means that it will be ignored during automatic serialization.

Then we found the write/readObject methods in the source code, which are used to serialize each element in elementData, that is, to serialize and deserialize the field manually. Isn’t that unnecessary?

If you want to serialize the fields of the ArrayList (i.e., elementData), why do you want to decorate elementData with transient?

Recall the automatic expansion mechanism of ArrayList. The elementData array acts as a container. When the container is insufficient, the capacity of the container is expanded, but the capacity of the container is usually greater than or equal to the number of elements in the ArrayList.

For example, now that there are actually eight elements, the size of the elementData array might be 8X1.5 =12. If you serialize the elementData array directly, you would waste the space of four elements, especially if the number of elements is very large, which is very unprofitable.

So the designers of ArrayList designed elementData to be transient, then serialized it manually in the writeObject method, and only serialized those elements that were actually stored, not the entire array.

private void writeObject(java.io.ObjectOutputStream s)
        throws java.io.IOException{
        // Write out element count, and any hidden stuff
        int expectedModCount = modCount;
        s.defaultWriteObject();
        // Write out size as capacity for behavioural compatibility with clone()
        s.writeInt(size);
        // Write out all elements in the proper order.
        for (int i=0; i<size; i++) {
            s.writeObject(elementData[i]);
        }
        if(modCount ! = expectedModCount) {throw newConcurrentModificationException(); }}Copy the code

2. fast-fail

The so-called fast-fail is that we are not allowed to call the methods of the Collection interface to modify the container during the iterator traversal, otherwise an exception will be thrown. The mechanism of this implementation is to maintain two variables in the iterator, They are modCount and expectedModCount respectively. Because the methods of the Collection interface will modCount++ every time the modification operation is performed, an exception will be thrown if they are detected to be unequal in the iterator.

private class Itr implements Iterator<E> {
        int cursor;       // index of next element to return
        int lastRet = -1; // index of last element returned; -1 if no such
        int expectedModCount = modCount;
        
        final void checkForComodification(a) {
            if(modCount ! = expectedModCount)throw newConcurrentModificationException(); }}Copy the code

3. forEach

This is a functional programming method. Take a look at its argument forEach(Consumer
Action) and it’s interesting that the accept is a functional interface. We have a callback to the Consumer’s accept so we only need to pass in a function interface to handle each element.

    @Override
    public void forEach(Consumer<? super E> action) {
        Objects.requireNonNull(action);
        final int expectedModCount = modCount;
        @SuppressWarnings("unchecked")
        final E[] elementData = (E[]) this.elementData;
        final int size = this.size;
        for (int i=0; modCount == expectedModCount && i < size; i++) {
            / / callback
            action.accept(elementData[i]);
        }
        if(modCount ! = expectedModCount) {throw newConcurrentModificationException(); }}Copy the code

I wrote a test code, but this method is not commonly used, the main reason is that Collection can generate its own Stream object, and then call the above method. Let me mention it here.

public class ArrayListTest {

    @Test
    public void foreach(a) {
        ArrayList<Integer> list = new ArrayList<>();
        list.add(2);
        list.add(1);
        list.add(4);
        list.add(6);
        list.forEach(System.out::print);  // Print each element.}}Copy the code

4. sort

The underlying call to the arrays.sort method doesn’t make any sense.

public void sort(Comparator<? super E> c) {
        final int expectedModCount = modCount;
        Arrays.sort((E[]) elementData, 0, size, c);
        if(modCount ! = expectedModCount) {throw new ConcurrentModificationException();
        }
        modCount++;
    }
Copy the code

5. removeIf

This is similar to forEach, except that the callback is written.

6. Compare with Vector

This is basically an introduction to the important methods and properties of ArrayList, and we have a better understanding of its underlying implementation and data structure. And then we have arrayLists and we have an old Vector that looks a lot like ArrayLists. Because you’ll find that the inheritance and implementation interfaces are the same. However, there are some differences, which are described below.

Almost all methods in a Vector are synchronized methods, so it is a thread-safe ArrayList
The constructor is different. There are no two special constants in the property, so its constructor directly initializes an array of size 10. And then he has four constructions.
Different interfaces are traversed. He still has an iterator, but his previous method of traversing was the Enumeration interface, getting the Enumeration from elements and then using hasMoreElements and nextElement to get the element.
Some functional programming methods are missing.

mo4tech.com (Moment For Technology) is a global community with thousands techies from across the global hang out!Passionate technologists, be it gadget freaks, tech enthusiasts, coders, technopreneurs, or CIOs, you would find them all here.

“Must know must be” the most detailed analysis of the ArrayList principle

1. Summary of structure

1. The inheritance

2. Implement

3. Main fields

4. Main methods

2. Analysis of construction methods

1. Constructors with no parameters

2. Pass in the construction of the array size

3. Pass the Collection interface

3. Analysis of modification methods

1. Set method

2. Add(E E) method

3. Add(int index, E E

4. remove(int index)

5. remove(E e)

6. clear()

7. addAll(Collection e)

4. Access method analysis

1. get

2. subList

5. Tools

1. write/readObject

2. fast-fail

3. forEach

4. sort

5. removeIf

6. Compare with Vector

“Must know must be” the most detailed analysis of the ArrayList principle

1. Summary of structure

1. The inheritance

2. Implement

3. Main fields

4. Main methods

2. Analysis of construction methods

1. Constructors with no parameters

2. Pass in the construction of the array size

3. Pass the Collection interface

3. Analysis of modification methods

1. Set method

2. Add(E E) method

3. Add(int index, E E

4. remove(int index)

5. remove(E e)

6. clear()

7. addAll(Collection e)

4. Access method analysis

1. get

2. subList

5. Tools

1. write/readObject

2. fast-fail

3. forEach

4. sort

5. removeIf

6. Compare with Vector

Related Posts

In-depth understanding of the browser kernel – Overview

In-depth understanding of CAS issues and ABA issues

Python OpenCV image scaling cv2.resize method