A List of containers

The List collection system should be the most commonly used API in daily development, and is usually used as the final question of the interview (JVM, collection, concurrency). The overall design of the collection code also integrates many programming ideas, which has high reference and reference value for programmers.

The bottom line

  • Basic: add and delete elements, container information;
  • Advanced: storage structure, capacity management;

The API system

  • ArrayList: Maintenance array implementation, query fast;
  • Vector: Maintains an array implementation, thread-safe;
  • LinkedList: maintenance of LinkedList implementation, add and delete fast;

Core features include: initialization and loading, element management, automatic expansion, array and linked list data structures. The underlying thread-safe operation of Vector is implemented based on ArrayList. ArrayList and LinkedList are non-thread-safe operations, and the natural efficiency is higher than Vector, which can be found through reading the source code.

ArrayList (ArrayList

1. Array features

ArrayList is the implementation class of the List interface in the collection system. The bottom layer maintains the Object array to load and manage data:

private static final Object[] EMPTY_ELEMENTDATA = {};
Copy the code

When it comes to array structures, the subconscious query is based on the index corresponding to the element, so it is fast. If the element is deleted, it may cause a large number of elements to move, so it is less efficient than LinkedList.

Array storage mechanism:

Belongs to an array is compact continuous storage through the subscript index can random access and quickly find the corresponding element, because there are open up the mechanism of memory space in advance, so relative to save storage space, if the array once need expansion, is to open up a bigger chunk of memory space, and copy all the data in the past, the efficiency is very low.

2. Construction method

Here are two main constructors:

No-argument constructor: Initializes an ArrayList, declaring an empty array.

public ArrayList(a) {
    this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
}
Copy the code

Parameter constructor: If the capacity argument is greater than 0, set the length of the array.

public ArrayList(int initialCapacity) {
    if (initialCapacity > 0) {
        this.elementData = new Object[initialCapacity];
    } else if (initialCapacity == 0) {
        this.elementData = EMPTY_ELEMENTDATA;
    } else {
        throw new IllegalArgumentException("Illegal Capacity: "+initialCapacity); }}Copy the code

If the array length is not specified in the constructor, the default array length is used and the array size is set in the add element operation.

private static final int DEFAULT_CAPACITY = 10;
Copy the code

3. Load data

Arrays are finite, but an ArrayList can always load elements, with boundaries, but usually not as many:

private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;
Copy the code

Exceeding this limit throws an out-of-memory error.

Load elements: Determines whether the capacity is sufficient.

public boolean add(E e) {
    ensureCapacityInternal(size + 1);
    elementData[size++] = e;
    return true;
}
Copy the code

When the capacity is not enough, it will expand the operation, and here the key source code is attached:

private void grow(int minCapacity) {
    int oldCapacity = elementData.length;
    int newCapacity = oldCapacity + (oldCapacity >> 1);
    elementData = Arrays.copyOf(elementData, newCapacity);
}
Copy the code

Mechanism: calculate newCapacity (newCapacity=15), copy a new array and set the capacity to newCapacity.

Specify where to add: This method is rarely used, again look at the two key lines of code;

public void add(int index, E element) {
    ensureCapacityInternal(size + 1);
    System.arraycopy(elementData, index,elementData,index+1,size-index);
    elementData[index] = element;
    size++;
}
Copy the code

Array copy = array copy = array copy = array copy = array copy = array copy

As shown in the figure above, suppose we put element E at index=1 and run as follows:

  • Get the length of array index to the end;
  • Copy to index+1;
  • Where index is, put element;

This process is just like queuing. If you insert one bit at the beginning, and all the others move back one bit, it’s inefficient, but it’s not absolute. If you move a small array, or if you keep adding at the end, it’s inefficient.

4. Remove data

See above data load, that is the opposite of the logic take a look again, still look at a few key source code:

public E remove(int index) {
    E oldValue = elementData(index);
    int numMoved = size - index - 1;
    if (numMoved > 0) {
        System.arraycopy(elementData, index+1, elementData, index, numMoved);
    }
    elementData[--size] = null;
    return oldValue;
}
Copy the code

The mechanism is similar to the logical mechanism of adding elements, i.e. copying the element after the added position to the position at the beginning of the index. This logic is similar to the way that the first part of the queue leaves one place, and the next part of the queue moves forward one place.

The efficiency problem is the same. If you remove the first element of the set, all the following elements will be moved. The further you remove the element, the lower the efficiency effect will be.

5. Capacity and quantity

In the collection source, there are two key fields that need to be clarified:

  • Capacity: the capacity of a collection;
  • Size: the number of elements in the container;

Usually, the container size is obtained by size, that is, the number of loaded elements. The capacity capacity will change only after continuous loading of elements triggers the expansion mechanism.

3. Linkedin List

1. Structural characteristics of linked list

The linked list structure is stored on physical cells in a discontinuous and non-sequential way. The logical order between node elements is realized by linking the order of Pointers in the linked list. A linked list consists of a series of nodes that can be dynamically generated at run time and consists of two parts: a data field that stores data elements and a pointer field that stores the address of the next node.

The characteristics of description

  • Physical storage is disordered and discontinuous;
  • A linked list is composed of multiple nodes in a chain structure.
  • Logically, it forms an ordered link structure.
  • The first node does not point to the address of the previous node;
  • The tail node has no address pointing to the next node;

Linked list structure solves the defect that array storage needs to know the number of elements in advance, and can make full use of memory space to achieve flexible dynamic memory management.

2. LinkedList structure

The underlying data storage structure of LinkedList is based on the LinkedList. First, let’s look at the description of the node:

private static class Node<E> {
    E item;
    Node<E> next;
    Node<E> prev;
    Node(Node<E> prev, E element, Node<E> next) {
        this.item = element;
        this.next = next;
        this.prev = prev; }}Copy the code

Define a static Node class in LinkedList that describes the structure of the Node: elements, front and back Pointers. Define three core variables in the LinkedList class:

transient int size = 0;
transient Node<E> first;
transient Node<E> last;
Copy the code

The size, the first node, and the description of these three variables are made very clear in the source code comments:

The last node of the first node is null, the next node of the last node is null, and item is not null.

3. Element management

A major feature of LinkedList is the efficiency of adding and deleting elements, according to the structural characteristics of the LinkedList source code.

Add elements

We can see from the source that this method is actually called when adding elements, putting the newly added element at the end of the original list:

void linkLast(E e) {
    final Node<E> l = last;
    final Node<E> newNode = new Node<>(l, e, null);
    last = newNode;
    if (l == null)
        first = newNode;
    else
        l.next = newNode;
    size++;
    modCount++;
}
Copy the code

Combined with the Node class constructor, the actual operation is shown below:

The core logic is: the new tail node and the old tail node to build a pointer relationship, and handle the first node variable.

Remove elements

An element can be deleted according to its object or index.

E unlink(Node<E> x) {
    final E element = x.item;
    final Node<E> next = x.next;
    final Node<E> prev = x.prev;
    if (prev == null) {
        first = next;
    } else {
        prev.next = next;
        x.prev = null;
    }
    if (next == null) {
        last = prev;
    } else {
        next.prev = prev;
        x.next = null;
    }
    x.item = null;
    size--;
    modCount++;
    return element;
}
Copy the code

Similar to adding element core logic, it is a process of rebuilding node Pointers:

  • The two ifs determine whether the first node is deleted;
  • The next of the deleted node points to the next node of the deleted node.
  • The prev of the next deleted node points to the deleted prev node.

Through the source analysis of the addition and deletion method, it can be seen that the addition and deletion of LinkedList elements does not involve large-scale element movement, with very few affected nodes, and the efficiency is naturally much higher than that of ArrayList.

4. Query method

Storage based on linked lists rather than arrays has a significant impact on the efficiency of element queries.

public E get(int index) {
    checkElementIndex(index);
    return node(index).item;
}
Node<E> node(int index) {
    if (index < (size >> 1)) {
        Node<E> x = first;
        for (int i = 0; i < index; i++)
            x = x.next;
        return x;
    } else {
        Node<E> x = last;
        for (int i = size - 1; i > index; i--)
            x = x.prev;
        returnx; }}Copy the code

This source code, combined with the LinkedList structure, is very strategic:

  • The first is the validity check of index;
  • Then determine whether index is in the top or bottom half of the list;
  • If in the top half of the list: the first node is traversed sequentially;
  • If in the lower half of the list: traversal from last in reverse order;

As can be seen from the above source code, the more times of traversal required to query the elements in the middle of the LinkedList, the lower the efficiency will be. Therefore, LinkedList is relatively more suitable for querying the first element.

Source code address

Making address GitEE, https://github.com/cicadasmile/java-base-parent, https://gitee.com/cicadasmile/java-base-parentCopy the code