The article is available at Github.com/niumoo/Java… , more Java programmers need to master the core knowledge, welcome Star and advice. Welcome to pay attention to my public number, the article updated every week.
preface
To be honest, List definitely has a place in Java’s most used collection class. Like Map, it can be used in many scenarios, which is very convenient for our daily development, after all, the need to store a List can be seen everywhere. However, there are still a lot of students who do not understand the difference between ArrayList and LinkedList, which is a pity. Both of them are basic content in data structure. This article will start with the basic concept and analyze the source code implementation of both in Java. Look for the differences between the two, and finally think about what to do when using them.
This article will cover the following.
- This section introduces the concepts of linear tables and the data structures of arrays and linked lists in linear tables.
- Source code analysis of ArrayList, such as storage structure, expansion mechanism, data addition, data acquisition, etc.
- Perform source code analysis of the LinkedList, such as its storage structure, data insertion, data query, data deletion, and how the LinkedList is used as a queue.
- Summarize ArrayList and LinkedList.
The linear table
To study ArrayList and LinkedList, we must first understand what a linear list is. Here is a passage from Baidu Encyclopedia.
Linear tables are the most basic, simplest, and most commonly used data structures. A linear list is a type of data structure. A linear list is a finite sequence of n data elements having the same properties.
As you can see, linear tables are one of the most basic, simple, and commonly used data structures. It arranges data in a line (possibly logically), one after the other, so that each piece of data in a linear list has only two directions, whereas arrays, linked lists, stacks, and queues are linear lists in data structures. You can imagine a neat line.
Now you might wonder, if there’s a linear table, then there must be a nonlinear table? That’s right. Binary tree sum graph is a typical nonlinear structure. Don’t be scared by these fancy pictures. In fact, this article is very simple. Please give it a thumbs up after reading it.
An array of
Now that you know what a linear list is, it’s easy to understand arrays. Arrays are an implementation of linear lists. An array is a data structure consisting of elements of the same type. An array needs to allocate a contiguous chunk of memory for storage. Notice the key words, same type, continuous memory, like this.
Sorry about the wrong picture, like this.
The above figure can intuitively reflect the storage structure of the array, because the memory address of the array is continuous, the element type is fixed, all have the characteristics of fast search for a certain location of the element; Also, because the array requires a contiguous memory segment, the length is fixed at initialization and cannot be changed. An ArrayList in Java is essentially a wrapper around an array.
The list
A linked list is also a linear list. Unlike an array, a linked list does not require continuous memory for data storage. Instead, each node stores a pointer to the next node. So what should this list look like? Look at the picture.
Oh no, it’s the wrong picture. It’s like this.
The figure above shows the storage structure of a linked list. Each node in the figure has a pointer to the next node, which is called a one-way linked list. There are also linked lists where each node has a pointer to the previous node, which we call a bidirectional linked list. I’m not going to draw the picture, like this.
You can see that the linked list does not need to be stored continuously in memory, because the linked list is through the node pointer to the next or the previous node, as long as the head node is found, can be used to find the following node. However, as a result, linked lists require **O(n) time to find or access nodes at a location. However, the complexity of O(1)** can be achieved when inserting data, because only the node pointer needs to be changed.
ArratList
The concept of linear tables is introduced above, and two practical implementation examples of linear tables are given, namely arrays and linked lists. In the Java collection class ArrayList, the Array storage structure is actually used. ArrayList encapsulates the Array and adds convenient operations such as insertion, acquisition and expansion. Because the underlying structure of an ArrayList is an array, access is very fast, but adding and deleting are relatively inefficient because they move subsequent elements. So how does it work? Might as well dig into the source code to find out.
ArrayList storage structure
Looking at the source code of ArrayList, you can see that it is a simple array for data storage.
/** * The array buffer into which the elements of the ArrayList are stored. * The capacity of the ArrayList is the length of this array buffer. Any * empty ArrayList with elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA * will be expanded to DEFAULT_CAPACITY when the first element is added. */
transient Object[] elementData; // non-private to simplify nested class access
/** * Shared empty array instance used for default sized empty instances. We * distinguish this from EMPTY_ELEMENTDATA to know how much to inflate when * first element is added. */
private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};
/** * Default initial capacity. */
private static final int DEFAULT_CAPACITY = 10;
Copy the code
As you can see from the comments above, an ArrayList constructed with no arguments shares an array of length 0 DEFAULTCAPACITY_EMPTY_ELEMENTDATA. The first expansion occurs only when the first element is added, which also prevents more memory waste when creating objects.
ArrayList capacity expansion mechanism
We all know that the size of an array cannot be changed, so an ArrayList can obviously add elements over and over again. Its underlying layer is an array. How does that work? An ArrayList shares an array of length 0 when initialized, and is expanded for the first time when the first element is added. We can imagine that an ArrayList is expanded every time it runs out of space. So what is the mechanism of expansion?
Still starting from the source, trace the internal implementation of the add() method.
/**
* Appends the specified element to the end of this list.
*
* @param e element to be appended to this list
* @return <tt>true</tt> (as specified by {@link Collection#add})
*/
public boolean add(E e) {
ensureCapacityInternal(size + 1); // Increments modCount!!
elementData[size++] = e;
return true;
}
// Start checking if the array capacity is sufficient at the current insertion position
private void ensureCapacityInternal(int minCapacity) {
// Check whether the ArrayList is not initialized. If yes, initialize the ArrayList. The capacity is 10.
if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
}
ensureExplicitCapacity(minCapacity);
}
// Insert index is larger than the current array length
private void ensureExplicitCapacity(int minCapacity) {
modCount++;
// overflow-conscious code
if (minCapacity - elementData.length > 0)
grow(minCapacity);
}
/**
* Increases the capacity to ensure that it can hold at least the
* number of elements specified by the minimum capacity argument.
*
* @param minCapacity the desired minimum capacity
*/
private void grow(int minCapacity) {
// overflow-conscious code
int oldCapacity = elementData.length;
// The capacity expansion rule is the current capacity + The current capacity moves 1 bit to the right. That's 1.5 times.
int newCapacity = oldCapacity + (oldCapacity >> 1);
if (newCapacity - minCapacity < 0)
newCapacity = minCapacity;
// Whether the capacity is greater than the maximum value of Int
if (newCapacity - MAX_ARRAY_SIZE > 0)
newCapacity = hugeCapacity(minCapacity);
// minCapacity is usually close to size, so this is a win:
// Copy the elements to the expanded new ArrayList
elementData = Arrays.copyOf(elementData, newCapacity);
}
Copy the code
It is relatively simple to discover expansion logic through source code, and the specific expansion process is as follows:
-
Starts checking if the array is large enough at the current insertion location
-
Check whether the ArrayList is uninitialized. If yes, initialize the ArrayList. The capacity is 10.
-
Determines whether the current index to be inserted is larger than the capacity
- No greater than, insert the new element, the new process is complete.
-
If the required capacity is greater than the current capacity, expand the capacity.
-
Capacity expansion rule: Current capacity + Current capacity move 1 bit to the right. That’s 1.5 times.
int newCapacity = oldCapacity + (oldCapacity >> 1);
-
If the capacity is still smaller than the required minimum capacity, the required minimum capacity is used as the capacity.
-
If the capacity is larger than the default maximum capacity, Integer is used as the maximum capacity.
-
Copy the elements of the old array into the expanded new array
-
-
Insert the new element and the new process is complete.
ArrayList data is added
When analyzing the expansion above, we have seen the specific logic of adding an element, because the underlying array, so directly specify the subscript assignment, very simple.
public boolean add(E e) {
ensureCapacityInternal(size + 1); // Increments modCount!!
elementData[size++] = e; // Assign directly
return true;
}
Copy the code
However, there is another case of adding data, which is to specify the subscript position to be added. What’s the difference in logic here?
/**
* Inserts the specified element at the specified position in this
* list. Shifts the element currently at that position (if any) and
* any subsequent elements to the right (adds one to their indices).
*
* @param index index at which the specified element is to be inserted
* @param element element to be inserted
* @throws IndexOutOfBoundsException {@inheritDoc} * /
public void add(int index, E element) {
rangeCheckForAdd(index);
ensureCapacityInternal(size + 1); // Increments modCount!!
// Specifies that all elements are moved one bit after the start of the index
System.arraycopy(elementData, index, elementData, index + 1,size - index);
elementData[index] = element;
size++;
}
Copy the code
You can see that this new addition adds a key line, which moves the elements starting at the coordinates to be inserted one bit back so that the specified subscript can be used to place the new element.
For example, if you want to add 100 at index 3, all elements at index 3 will be moved one bit later.
One drawback of ArrayList, therefore, is that it is inefficient to insert new data randomly.
ArrayList data acquisition
Data subscripts get element values in one step, needless to say.
public E get(int index) {
rangeCheck(index);
return elementData(index);
}
E elementData(int index) {
return (E) elementData[index];
}
Copy the code
LinkedList
The underlying LinkedList is a linear structure of a LinkedList. In addition to having a node object, a LinkedList may have one or two Pointers depending on whether it is one-way or bidirectional. So is LinkedList a single list or a bidirectional list?
LinkedList storage structure
If we look at the LinkedList source code, we can see that there is no operation in the LinkedList parameterless construction, but by looking at the variables first and last, we can see that they are the first and last nodes of the storage list.
transient int size = 0;
/** * Pointer to first node. * Invariant: (first == null && last == null) || * (first.prev == null && first.item ! = null) */
transient Node<E> first;
/** * Pointer to last node. * Invariant: (first == null && last == null) || * (last.next == null && last.item ! = null) */
transient Node<E> last;
/** * Constructs an empty list. */
public LinkedList(a) {}Copy the code
The variables first and last are of type Node.
private static class Node<E> {
E item;
Node<E> next;
Node<E> prev;
Node(Node<E> prev, E element, Node<E> next) {
this.item = element;
this.next = next;
this.prev = prev; }}Copy the code
You can see that this is a typical two-way list structure, where item is used to store the value of the element; Next points to the next node and prev to the previous node.
Get the LinkedList data
Linked lists are not contiguous memory addresses like arrays. Linked lists point to record link paths via next and prev, so finding a node at a given location can only be done by traversing the source code.
public E get(int index) {
checkElementIndex(index);
return node(index).item;
}
/** * Returns the (non-null) Node at the specified element index. */
// Iterate over the node information at index position
Node<E> node(int index) {
// assert isElementIndex(index);
// Determine whether index is in the first or second part of the current list, and then decide from
// First looks forward or last looks forward.
if (index < (size >> 1)) {
Node<E> x = first;
for (int i = 0; i < index; i++)
x = x.next;
return x;
} else {
Node<E> x = last;
for (int i = size - 1; i > index; i--)
x = x.prev;
returnx; }}Copy the code
Find the node object at the specified location. Note that the search first determines whether the index is in the first or second part of the current list, and then determines whether to look backwards from first or forwards from last. This will increase the speed of the search. You can also see from this that linked lists are not efficient in finding elements at a given location.
LinkedList data added
Since LinkedList is a LinkedList, the addition of LinkedList is also the addition of LinkedList data. At this time, it is necessary to distinguish the operation according to the position to be inserted.
-
Insert the tail
public boolean add(E e) { linkLast(e); return true; } void linkLast(E e) { final Node<E> l = last; // New node, prev is the current tail node, e is the element value, next is null, final Node<E> newNode = new Node<>(l, e, null); last = newNode; if (l == null) first = newNode; else // The current tail node next points to the new node l.next = newNode; size++; modCount++; } Copy the code
Create a new Node. The prev of the new Node is set to the existing end Node, the existing end Node points to the new Node, and the next Node is set to null.
-
In the middle of new
Here is the source code involved in adding the element at the specified location.
public void add(int index, E element) { checkPositionIndex(index); if (index == size) // If the position is the end of the current list, insert it directly linkLast(element); else // Get the index node and insert a new element linkBefore(element, node(index)); } /** * Inserts element e before non-null Node succ. */ // Add a new element at the specified node, change the next node of the specified element to the new element, and the next node of the new element is the next node of the found node. // The last node of the new element is the found node. The next point to the found node is changed to the new node void linkBefore(E e, Node<E> succ) { // assert succ ! = null; final Node<E> pred = succ.prev; final Node<E> newNode = new Node<>(pred, e, succ); succ.prev = newNode; if (pred == null) first = newNode; else pred.next = newNode; size++; modCount++; } Copy the code
As you can see, there are two main parts to insert elements in the specified location. The first part is to find the Node node. This part is the LinkedList data retrieval part introduced above.
The second part is to insert the element after finding the Node object. The next of a node is the new node, the prev of the new node is the found node, and the next of the new node is the next of the found node. The prev of the node pointed to by the next of the found node is changed to the new node.
LinkedList data deleted
Still view the source code for analysis, see in the source code if the node is the head node or tail node, delete relatively simple. We’ll focus on deleting the middle node
public E remove(int index) {
checkElementIndex(index);
return unlink(node(index));
}
/** * Unlinks non-null node x. */
E unlink(Node<E> x) {
// assert x ! = null;
final E element = x.item;
final Node<E> next = x.next;
final Node<E> prev = x.prev;
if (prev == null) {
first = next;
} else {
prev.next = next;
x.prev = null;
}
if (next == null) {
last = prev;
} else {
next.prev = prev;
x.next = null;
}
x.item = null;
size--;
modCount++;
return element;
}
Copy the code
The node(index) method is still binary to find the target location and then delete it. For example, the node to be deleted is called X, and the deletion operation is mainly to change the next pointing of the prev node of X node to the next pointing of X node, and change the prev pointing of the next node of X node to the prev pointing of X node. Finally, clear the prev and next directions of the X node. If it’s a little hard to understand, look at the picture below.
extension
You think of LinkedList as just a List, but it implements not only the List interface, but also the Deque, so it’s ostensibly a List, but it’s actually a queue.
public class LinkedList<E> extends AbstractSequentialList<E>
implements List<E>, Deque<E>, Cloneable.java.io.Serializable
Copy the code
Experience the fifO queue.
Queue<String> queue = new LinkedList<>();
queue.add("a");
queue.add("b");
queue.add("c");
queue.add("d");
System.out.println(queue.poll());
System.out.println(queue.poll());
System.out.println(queue.poll());
System.out.println(queue.poll());
/ / the result:
// a
// b
// c
// d
Copy the code
First in, first out (FIRST-in, first-out (FIFO)); poll delete the first node.
conclusion
Both ArrayList and LinkedList are commonly used collection classes in development. This article analyzes the underlying implementation of both. Through the analysis of the underlying implementation, we can summarize the main advantages and disadvantages of both.
- Traverse, ArrayList every timeDirect positioning, LinkedListNext Node LocationIt’s about the same. Note here that LinkedList is only usedThe iteratorThe next node will be used. If you are using
get
, because traversal search efficiency is low. - New, the ArrayList may need to expand, in the middle insert, ArrayList needs to move all the elements after the insertion position. LinkedList changes the node prev directly, next points to it, and LinkedList wins.
- Delete, same as 2.
- Random access to a given location, ArrayList directly located, LinkedList from start to finish, array wins.
To sum up, ArrayList is suitable for storing and accessing data, while LinkedList is more suitable for data processing. I hope you can choose List structure reasonably when using it in the future.
The last word
I open source all my articles at Github.com/niumoo/Java… , welcome Star and Perfect, hope we become excellent together.
If the article is helpful, you can click”praise“Or”share“, all support, I love them all!
This article is updated every week. To follow my updated articles and the dry goods I share in real time, you can follow”Did not read the code“The official account orMy blog.