preface
During an interview, you are often asked the following questions:
The difference between ArrayList and LinkedList
ArrayList is implemented based on arrays, LinkedList is implemented based on linked lists, ArrayList is more efficient than LinkedList when accessing a List randomly, and so on
When asked what the usage scenarios of ArrayList and LinkedList are, most of your friends will probably answer:
ArrayList and LinkedList are more efficient than ArrayList in adding and deleting elements, and ArrayList is more efficient than LinkedList in traversing
Is that an accurate answer? Today we are going to study it!
01 Source Code Analysis
1.1 ArrayList
The implementation class
public class ArrayList<E> extends AbstractList<E>
implements List<E>, RandomAccess.Cloneable.java.io.Serializable1. 2.Copy the code
ArrayList implements the List interface, inherits AbstractList abstract class, implements arrays at the bottom, and implements self-increasing array sizes.
ArrayList also implements the Cloneable and Serializable interfaces, so it can implement cloning and serialization.
ArrayList also implements the RandomAccess interface, a flag interface that says “any List class that implements this interface can achieve fast RandomAccess.”
Basic attributes
The ArrayList property is mainly composed of array length size, object array elementData, and default_capacity, which is 10 by default.
// The capacity is initialized by default
private static final int DEFAULT_CAPACITY = 10;
// Array of objects
transient Object[] elementData;
// Array length
private int size;1.23.4.. 56..
Copy the code
From the ArrayList property, elementData is decorated with the transient keyword, which means that the property will not be serialized.
But ArrayList actually implements the serialization interface. Why?
Because the array of an ArrayList is dynamically amplified, not all of the allocated memory is storing data.
If the external serialization method is used to serialize the array, the entire array will be serialized. In order to avoid the serialization of the memory space that does not store data, ArrayList provides two private methods to complete the serialization and deserialization. This saves space and time when serializing and deserializing arrays.
Therefore, using transient to decorate the array prevents the object array from being serialized by other external methods.
ArrayList customizes the serialization method as follows:
Initialize the
There are three initialization methods: direct initialization without parameters, specified size initialization, specified initial data initialization, the source code is as follows:
When a new element is added to an ArrayList, it calculates the size of the element and dynamically expands it if it exceeds its existing size. Expanding an array causes a memory copy of the entire array.
Therefore, when we initialize an ArrayList, we can use the first constructor to specify the initial size of the array, which helps to reduce the number of arrays to expand, thus improving system performance.
When the ArrayList no-argument constructor initializes, the default size is an empty array, not the usual 10, which is the value of the array expanded when we first add it.
The new element
There are two ways to add elements to an ArrayList, either by adding them directly to the end of the array or by adding them to an arbitrary location.
public boolean add(E e) {
ensureCapacityInternal(size + 1); // Increments modCount!!
elementData[size++] = e;
return true;
}
public void add(int index, E element) {
rangeCheckForAdd(index);
ensureCapacityInternal(size + 1); // Increments modCount!!
System.arraycopy(elementData, index, elementData, index + 1,
size - index);
elementData[index] = element;
size++;
}1.23.4.. 56.7.8.9.10.11.12.13.14.15..
Copy the code
The similarity between the two methods is that before adding elements, the capacity is confirmed. If the capacity is large enough, it is not needed to expand. If the capacity is not large enough, the array is expanded by 1.5 times the size of the original array. After the expansion, the array needs to be copied to the newly allocated memory address.
Here is the specific source code:
There is also a difference between the two methods. Adding an element to any position causes all elements after that position to be rearranged, whereas adding an element to the end of the array does not duplicate the sort of elements without expansion.
So an ArrayList doesn’t have to be slow with lots of new elements
If we know the size of the data stored at initialization, specify the size of the array when we initialize an ArrayList, and add elements only at the end of the array when we add elements, then an ArrayList performs better than other lists with a large number of new elements, rather than worse.
Remove elements
Arraylists can delete elements in many ways, such as by array index, by value, or in batches. The idea is similar.
An ArrayList reassembles the array after each valid deletion, and the more advanced the deleted elements, the more expensive the array reassembles.
We choose to delete according to the value of the way to source note:
Traverse elements
Because ArrayList is implemented based on arrays, it is very fast to get elements.
public E get(int index) {
rangeCheck(index);
return elementData(index);
}
E elementData(int index) {
return (E) elementData[index];
}1.23.4.. 56.7.8.9..
Copy the code
1.2 LinkedList
LinkedList is implemented based on a two-way LinkedList data structure.
The bidirectional list structure, in which each node can be traced back and forth, has several concepts as follows:
- Each Node in the list is called Node, and Node has a prev attribute, which represents the position of the previous Node, and a next attribute, which represents the position of the next Node;
- First is the head node of a bidirectional list, and its preceding node is NULL.
- Last is the last node of the bidirectional list, and its next node is NULL;
- When there is no data in the linked list, first and last are the same node, both pointing to null;
- Because it is a bidirectional linked list, there is no size limit as long as the machine memory is strong enough.
The Node structure contains three parts: the element content item, the prev pointer, and the next pointer.
private static class Node<E> {
E item;/ / the node values
Node<E> next; // Point to the next node
Node<E> prev; // The previous node to point to
// Initialization parameters are in the order of the previous node, the value of its own node, and the last node
Node(Node<E> prev, E element, Node<E> next) {
this.item = element;
this.next = next;
this.prev = prev; }}1.23.4.. 56.7.8.9.10.11.12..
Copy the code
LinkedList is a bidirectional LinkedList composed of Node structure objects.
The implementation class
Class LinkedList implements List interface and Deque interface, and inherits AbstractSequentialList class. LinkedList implements both List type and Queue type. LinkedList also implements Cloneable and Serializable interfaces and, like ArrayList, can be cloned and serialized.
Because the memory address of LinkedList storing data is discontinuous, but the discontinuous address is located by pointer, therefore, LinkedList does not support random fast access, and LinkedList cannot implement RandomAccess interface.
public class LinkedList
extends AbstractSequentialList
implements List.Deque.Cloneable.java.io.Serializable12.3.Copy the code
Basic attributes
transient int size = 0;
transient Node first;
transient Node last;1.23..
Copy the code
We can see that all three attributes are transient, for the simple reason that we do not serialize only head to tail, so LinkedList also implements readObject and writeObject serialization and deserialization.
Here’s how to customize the serialization of LinkedList.
Nodes in the query
LinkedList (LinkedList, LinkedList, LinkedList, LinkedList, LinkedList)
Instead of looping from beginning to end, LinkedList takes a simple dichotomy to see if the index is in the first or second half of the list.
If it’s the first half, start from the beginning, and vice versa. In this way, the number of loops is reduced by at least half and lookup performance is improved.
The new element
The implementation of adding elements to LinkedList is simple, but there are many ways to add elements.
The default add (Ee) method is to add the added element to the end of the queue, first by replacing the last element with a temporary variable to generate a new Node object, then by referring to the last object to the new Node object, with the previous last pointer to the new Node object.
LinkedList also has a way to add elements to any location. If we were to add an element to the middle of any two elements, adding elements would only change the Pointers to and from the elements before and after, which would point to the new element to be added, so LinkedList has a significant performance advantage over ArrayList.
Remove elements
To delete an element from a LinkedList, we first loop to find the element to delete. If the element to delete is in the first half of the List, we look backwards. If it’s in the second half, look from back to front.
Doing so can be very efficient in removing both the earlier and later elements, but it can be relatively inefficient if a List has a large number of elements and the elements removed are in the middle of the List.
Traverse elements
The implementation of the LinkedList element fetch operation is basically similar to the LinkedList element delete operation. The implementation of the LinkedList element fetch operation is similar to the LinkedList element delete operation. The implementation of the LinkedList element fetch operation is similar to the LinkedList element delete operation.
So in LinkedList loop, we can iterate through the loop in an iterator fashion, fetching our elements directly instead of going through the loop to look up the List.
02 Analysis test
Added element operation performance tests
Test case source code:
- ArrayList:paste.ubuntu.com/p/gktBvjgMG…
- LinkedList:paste.ubuntu.com/p/3jQrY2XMP…
Test results:
From this set of tests, we know that LinkedList is not necessarily more efficient at adding elements than ArrayList.
Adds elements from the collection header position
Since ArrayList is an array implementation, when adding elements to the array header, the data after the header needs to be copied and rearranged, so the efficiency is very low.
LinkedList is implemented based on a LinkedList. When adding an element, the first step is to find the position of the added element through a loop. If the position to be added is in the first half of the List, the first step is to find the position from front to back. If it’s in the last half, it goes from back to front, so it’s very efficient to add a LinkedList element to the head.
Adds elements from the middle of the collection location
When an ArrayList adds elements to the middle of an array, some data also needs to be copied and rearranged, which is not very efficient.
LinkedList adds elements to the middle, which is the least efficient way to add elements, because near the middle, the loop lookup before adding elements is the most iterated operation.
Adds elements from the end of the collection
ArrayList is more efficient than LinkedList in adding elements to the tail without scaling.
This is because arrayLists are very efficient without copying and rearranging data as they add elements to the tail.
LinkedList is also less efficient than ArrayList because it does not loop through elements, but because of new objects and the process of changing Pointers to objects.
Note: This test excludes dynamic array capacity expansion, which would also make ArrayList less efficient.
Delete element operation performance test
ArrayList and LinkedList test results for removing elements are very similar to those for adding elements!
Conclusion: If you need to do a lot of insertions and deletions in the head of a List, choose LinkedList. Otherwise, ArrayList will do.
Iterate over elements to perform performance tests
Test case source code:
- ArrayList:paste.ubuntu.com/p/ZNWc9H2pY…
- LinkedList:paste.ubuntu.com/p/xSk4nHDHv…
Test results:
We can see that LinkedList has the worst for loop performance and ArrayList has the best for loop performance.
This is because LinkedList is implemented based on linked lists. When using the for loop, each for loop will traverse half of the List, so the efficiency of traversal is seriously affected. ArrayList is implemented based on arrays and implements the RandomAccess interface flag, which means that ArrayList can achieve fast RandomAccess, so the for loop is very efficient.
LinkedList iteration loop traversal is as good and not too bad as ArrayList iteration loop traversal, so avoid using for loop traversal when traversing LinkedList.