Java List container source code analysis supplement

We learned how to use ArrayList and LinkedList by analyzing the source code. However, after analyzing the source code, I could not avoid searching some relevant materials on the Internet. Standing on the shoulders of predecessors, I found that there were more or less missing places in the previous two articles. For example, Vector, which is similar to ArrayList, was not mentioned. And try to summarize the similarities and differences of implementation class members in the List family.

  1. Vector introduces and differs from ArrayList
  2. Differences between ArrayList and LinkedList
  3. Stack class introduction and implementation of a simple Stack
  4. Difference between SynchronizedList and Vector

The Vector is introduced

Vector is a fairly old Java container class, dating from JDK 1.0 and modified in JDK 1.2 to implement List and Collection. In effect, a Vector is similar to an ArrayList in that it maintains an array of dynamically varying lengths internally. But their capacity expansion mechanisms are different. Most of the source code for Vector is similar to that of ArrayList. Here is a brief look at the constructor of Vector and the expansion mechanism for Vector.

The constructor of a Vector can specify the initial capacity and expansion factor of the internal array. The default initial capacity is 10 if you do not specify the initial capacity. But unlike ArrayList, it allocates 10 memory when it is created. ArrayList, on the other hand, generates an array of capacity 10 the first time add is called.

public Vector() { this(10); // Create an array of capacity 10. } public Vector(int initialCapacity) { this(initialCapacity, 0); } public Vector(int initialCapacity, int capacityIncrement) { super();if (initialCapacity < 0)
       throw new IllegalArgumentException("Illegal Capacity: "+ initialCapacity); this.elementData = new Object[initialCapacity]; this.capacityIncrement = capacityIncrement; } // Public Vector(Collection<? extends E> c) { elementData = c.toArray(); ElementCount = elementData.length; elementCount = elementData.length; // c.toArray might (incorrectly) notreturn Object[] (see 6260652)
   if(elementData.getClass() ! = Object[].class) elementData = Arrays.copyOf(elementData, elementCount, Object[].class); }Copy the code

To expand Vector, we need to look at the internal grow method source:

private void grow(int minCapacity) { // overflow-conscious code int oldCapacity = elementData.length; // newCapacity = 2 * oldCapacity Int newCapacity = oldCapacity + ((capacityIncrement > 0)? capacityIncrement : oldCapacity);if (newCapacity - minCapacity < 0)
       newCapacity = minCapacity;
   if (newCapacity - MAX_ARRAY_SIZE > 0)
       newCapacity = hugeCapacity(minCapacity);
   elementData = Arrays.copyOf(elementData, newCapacity);
}
Copy the code

By combining the above method with our constructor, we can know that when the Vector needs to be expanded, the capacityIncrement will be judged first, that is, when the Vector is constructed, the capacityIncrement coefficient is specified. If the capacityIncrement coefficient is specified, the capacity of the Vector will be expanded according to the specified coefficient. The new capacity is oldCapacity + capacityIncrement. If capacityIncrement is not specified, the capacity is doubled by default, which is different from the 0.5 times length of ArrayList.

The most important difference between Vector and ArrayList is that all of Vector’s methods for accessing internal arrays have synchronized, which means that Vector is thread-safe, whereas ArrayList does not.

For vectors, in addition to the for loop, advanced for loop, and iterative iteration methods, elements() can be called to return an Enumeration.

Enumeration is an interface that has only two methods, hasMoreElements and nextElement. It looks like an iterator, but there is no iterator add remove.

public interface Enumeration<E> { boolean hasMoreElements(); E nextElement(); } // Vector elements. public Enumeration<E>elements() {
   return new Enumeration<E>() {
       int count = 0;

       public boolean hasMoreElements() {
           return count < elementCount;
       }

       public E nextElement() {
           synchronized (Vector.this) {
               if (count < elementCount) {
                   return elementData(count++);
               }
           }
           throw new NoSuchElementException("Vector Enumeration"); }}; }Copy the code

Usage:

Vector<String> vector = new Vector<>();
vector.add("1");
vector.add("2");
vector.add("3");

Enumeration<String> elements = vector.elements();

while (elements.hasMoreElements()){
  System.out.print(elements.nextElement() + "");
}
Copy the code

In fact, this interface is very old. To accommodate older versions of the JDK, we could call things like Enumeration

Enumeration = collections.enumeration (list); To return an Enumeration. The principle is to call the corresponding iterator method.

Public static <T> enumeration <T> enumeration(final Collection<T> c) {returnNew Enumeration<T>() {private final Iterator<T> I = c.iterator(); // Call iterator hasNext public BooleanhasMoreElements() {
           returni.hasNext(); } // Call iterator next public TnextElement() {
           returni.next(); }}; }Copy the code

Vector compared to ArrayList

  1. VectorArrayListAt the bottom are array data structures that maintain a dynamically sized array.
  2. VectorThe expansion mechanism defaults to double the size of the existing array when no expansion factor is specified by construction. whileArrayListIncrease the length of the existing array by half.
  3. VectorIs thread-safe, whileArrayListIt is not thread-safe when multi-threading is not involvedArrayListthanVectorHigh efficiency
  4. For vectors, in addition to the for loop, advanced for loop, and iterator methods, we can call elements() to return an Enumeration through the inner elements.

Differences between ArrayList and LinkedList

Let’s take a look at the differences in the inheritance of the two lists:

public class ArrayList<E> extends AbstractList<E>
        implements List<E>, RandomAccess, Cloneable, java.io.Serializable
Copy the code
public class LinkedList<E>
    extends AbstractSequentialList<E>
    implements List<E>, Deque<E>, Cloneable, java.io.Serializable
Copy the code

As you can see, LinkedList does not implement the RandomAccess interface, which we know is an empty token interface that marks the implementation class for random fast access. It’s worth rethinking this interface, according to RandomAccess’s Java API:

Public interface RandomAccess tag interfaces are used for List implementations to indicate that they support fast (usually constant time) RandomAccess. The main purpose of this interface is to allow generic algorithms to change their behavior to provide good performance when applied to random or sequential access lists.

We can realize that the distinction between random and sequential access is often fuzzy. For example, if the List is large, some List implementations that provide progressive access times, but are actually fixed access times, should generally implement this interface. As a rule of thumb, if for a typical class instance, the List implementation should implement this interface:

 for(int I = 0, n = list.size (); i <n; I ++) list.get (ⅰ);Copy the code

Faster than this loop:

forIterator I = list.iterator (); I.hasnext ();) I.n ext ();Copy the code

As a rule of thumb, if for iterates through a List implementation class faster than the iterator, we need to implement RandomAccess, indicating that the container supports RandomAccess. From this theory we can guess that LinkedList does not have random fast access, in other words LinkedList for loop traversal is slower than iterator traversal. Let’s test it out:

 private static void loopList(List<Integer> list) {
   long startTime = System.currentTimeMillis();

   for (int i = 0; i < list.size(); i++) {
       list.get(i);
   }

   System.out.println(list.getClass().getSimpleName() + "Use the normal for loop to traverse between" +
           (System.currentTimeMillis() - startTime) + "ms");

   startTime = System.currentTimeMillis();

   Iterator<Integer> iterator = list.iterator();

   while (iterator.hasNext()) {
       iterator.next();
   }

   System.out.println(list.getClass().getSimpleName() + "Use iterator to loop through the interval between" +
           (System.currentTimeMillis() - startTime) + "ms"); } public static void main(String[] args){// Test the access speed of 10000 integers List<Integer> arrayList = new arrayList <Integer>(10000); List<Integer> linkedList = new LinkedList<Integer>();for (int i = 0; i < 10000; i++){
       arrayList.add(i);
       linkedList.add(i);
   }

   loopList(arrayList);
   loopList(linkedList);
   System.out.println();
}
Copy the code

Let’s take a look at the output:

ArrayList uses normalforArrayList with iterator with 4ms LinkedList with normalforLinkedList uses iterator to loop between 133msCopy the code

Get (int index); get (int index); get (int index);

public E get(int index) {
   checkElementIndex(index);
   return node(index).item;
}
Copy the code

The node method compares the size of the index node with that of the size/2 node. The node method compares the size of the index node with that of the size/2 node. The node method compares the size of the index node with that of the end node. You can easily retrieve the element at the specified position through the index. So ArrayList has random fast access, whereas LinkedList doesn’t. So we should try to avoid using for loops when we use LinkedList.

So far we can summarize the differences between LinkedList and ArrayList:

  1. ArrayListIs the bottom of the array structure, storage space is continuous. Fast query, add and delete need to copy the array elements, when the deletion of elements in the early position of the performance is low.

  2. LinkedListThe bottom layer is the use of two-way linked list data structure, each node contains its previous node and the information after the node, storage space can not be continuous. Add and delete blocks, slow query.

  3. ArrayListLinkedListIt’s all thread unsafe. whileVectorThread-safe

  4. Try not to use a for loop to traverse a LinkedList collection, but instead use iterators or advanced for.

The Stack is introduced

From the original inheritance system, we can know that Stack inherits from Vector, that is, Stack has all the add, delete, change and query methods of Vector. But when we say Stack, we definitely mean the data interface in the Stack.

Let’s look at the stack definition first:

A stack, also known as a stack, is a linear table with limited operations. The limitation is that only one end of the table is allowed for insert and delete operations. This end is called the top of the stack, and the other end is called the bottom of the stack. Adding a new element to a stack is also called pushing, pushing, or pushing. It is done by placing the new element on top of the top element to make it the new top element. To remove an element from a stack is also called to stack or unstack. It is to remove the top element from the stack so that the adjacent element becomes the new top element.

To put it simply, the stack is a data structure that has a convention that adding and removing elements from the stack are only allowed at the top of the stack, and the elements that are pushed first are always removed later. We can use arrays and linked lists to implement this data structure operation of the stack.

Generally speaking, there are several operations on the stack:

  1. A push into the stack
  2. Pop out of the stack
  3. Peek query top of stack
  4. Empty whether the stack is empty

The Stack container in Java uses an array as the underlying structure to implement Stack operations. By calling the add and delete methods corresponding to the Vector, the Stack and outbound operations are implemented.

Public E push(E item) {addElement(item); // Call the addElement method defined by Vectorreturnitem; } // public synchronized Epop() { E obj; int len = size(); obj = peek(); removeElementAt(len - 1); // Calls the element at the end of the removeElementAt array defined by Vectorreturnobj; } public synchronized Epeek() {
   int     len = size();

   if (len == 0)
       throw new EmptyStackException();
   returnelementAt(len - 1); // Query the last element of the array. }Copy the code

The Stack implementation in Java containers was briefly introduced above, but the fact is that these antiquated collection container classes are not officially recommended. For stack from the data structure, compared with linear list, its implementation also exists, sequential storage (array), discontinuous storage (linked list) implementation methods. The LinkedList, which we saw at the end of the last article, replaces the Stack.

In a recent tech group, one of the meituan leaders said he interviewed an Android developer about his understanding of stacks. The topic of the interview was to implement a simple stack with basic peek, push, and POP operations. As a result, I don’t know why the interviewee didn’t write it out and was passed. So after analyzing the Stack, I decided to try to write the interview question myself manually. I think I’m the interviewer. If the person who answered the question only wrote out the push operation, it would be considered as failing. The interviewer should be concerned about whether StackOverFlow is considered when writing out the push operation.

Public class SimpleStack<E> {// Default capacity private static final int DEFAULT_CAPACITY = 10; Private Object[] elements; Private int size = 0; // private int top; publicSimpleStack() {
        this(DEFAULT_CAPACITY);
    }

    public SimpleStack(int initialCapacity) {
        elements = new Object[initialCapacity];
        top = -1;
    }

    public boolean isEmpty() {
        return size == 0;
    }

    public int size() {
        return size;
    }

    @SuppressWarnings("unchecked")
    public E pop() throws Exception {
        if (isEmpty()) {
            throw new EmptyStackException();
        }

        E element = (E) elements[top];
        elements[top--] = null;
        size--;
        return element;
    }

    @SuppressWarnings("unchecked")
    public E peek() throws Exception {
        if (isEmpty()) {
            throw new Exception("Current stack is empty");
        }
        return(E) elements[top]; } public void push(E element) throws Exception {// Make sure the capacity meets the condition ensureCapacity(size + 1); elements[size++] = element; top++; } private void ensureCapacity(int minSize) {if (minSize - elements.length > 0) {
            grow();
        }
    }

    private void grow() { int oldLength = elements.length; Int newLength = oldLength + (oldLength >> 1); elements = Arrays.copyOf(elements, newLength); }}Copy the code

Synchronous vs asynchronous

For Vector and Stack, the synchronized keyword is used to modify the method in the corresponding increment, deletion, change and check method from the source code, which means that the method is synchronized and thread-safe. Arraylists and LinkedLists are not thread-safe. However, we mentioned in our introduction to ArrayList and LinkedList that we can use the Collections static method to turn a List into a thread-synchronized List:

List<Integer> synchronizedArrayList = Collections.synchronizedList(arrayList);
List<Integer> synchronizedLinkedList = Collections.synchronizedList(linkedList);
Copy the code

So here’s another interview question:

Please summarize the difference between Vector and SynchronizedList.

SynchronizedList namely Collections. SynchronizedList (arrayList); The generated List type, which is itself a Collections inner class.

Let’s take a look at his source:

static class SynchronizedList<E> extends SynchronizedCollection<E> implements List<E> { private static final long serialVersionUID = -7754090372962971524L; final List<E> list; SynchronizedList(List<E> list) { super(list); this.list = list; } SynchronizedList(List<E> list, Object mutex) { super(list, mutex); this.list = list; }... }Copy the code

The SynchronizedList constructor takes an Object as an argument, but the word mutex means that it is a lock. In fact, when we click super, the constructor lock for a single argument is the Object itself.

SynchronizedCollection(Collection<E> c) {
       this.c = Objects.requireNonNull(c);
       mutex = this;
   }

SynchronizedCollection(Collection<E> c, Object mutex) {
  this.c = Objects.requireNonNull(c);
  this.mutex = Objects.requireNonNull(mutex);
}
Copy the code

Let’s take a look at the add, delete, change and check methods:

   public E get(int index) {
       synchronized (mutex) {returnlist.get(index); } } public Eset(int index, E element) {
       synchronized (mutex) {returnlist.set(index, element); } } public void add(int index, E element) { synchronized (mutex) {list.add(index, element); } } public E remove(int index) { synchronized (mutex) {returnlist.remove(index); } } public int indexOf(Object o) { synchronized (mutex) {returnlist.indexOf(o); } } public int lastIndexOf(Object o) { synchronized (mutex) {returnlist.lastIndexOf(o); } } public boolean addAll(int index, Collection<? extends E> c) { synchronized (mutex) {returnlist.addAll(index, c); Synchronized (mutex) public ListIterator<E>listIterator() {
       return list.listIterator(); // Must be manually synched by user
   }

   public ListIterator<E> listIterator(int index) {
       return list.listIterator(index); // Must be manually synched by user
   }
Copy the code

As you can clearly see, to make a collection thread-safe, Collocations simply wraps up the method of adding, deleting, changing, and checking a collection of parameters, with a synchronization constraint. The first difference between a Vector and a Vector is whether it is a synchronized method or a synchronized code block.

  1. A synchronized block of code may have a smaller scope of locking than a synchronized method, and generally the scope of locking is inversely proportional to performance.
  2. A synchronized block provides more precise control over the scope of the lock (the scope of the lock is the time from when it was acquired to when it was released), whereas a synchronized method’s scope is the entire method.

From the above two methods, ` ` Collections. SynchronizedList (arrayList); The generated synchronized collection looks more efficient, but the difference between Vector and ArrayList isn’t noticeable because the add methods internally implement much the same. Vector does not specify a lock as SynchronizedList does.

As we can see, SynchronizedList does not lock iterators, but the iterator methods on the Vector do lock iterators, so we need to manually synchronize iterators on SynchronizedList:

  synchronized (list) {
    Iterator i = list.iterator(); // Must be in synchronized block
    while (i.hasNext())
        foo(i.next());
 }
Copy the code

We can summarize three differences between SynchronizedList and Vector:

  1. SynchronizedListAs a wrapper class, it has good extensibility and compatibility. You can take all of themListTo a thread-safe class.
  2. useSynchronizedListGet iterator, traversal to manually synchronize processing, andVectorDon’t need.
  3. SynchronizedListYou can specify the object to lock with the parameter, andVectorIt can only be the object itself.

conclusion

This article is a supplement to the List family after the source analysis of ArrayList and LinkedList is completed. We analyzed

  1. VectorArrayListThe difference between.
  2. ArrayListandLinkedListThe difference betweenRandomAccessThe definition of this interface is justifiedLinkedListTraversal using the for loop is inefficient.
  3. StackInherited fromVectorThe operation is also thread-safe, but also older. Then analyzed the implementation of a simpleStackInterview questions.
  4. Finally, we conclude with thread safetySynchronizedListwithVectorThree differences.

These seem to be the kind of questions that interviewers like to ask and are often overlooked in their work. Through this article to make a corresponding summary, in case of need.