1. A classic application scenario of linked lists

LRU cache eviction algorithm: caching is a technique for improving the performance of data reads. It is used very widely in both hardware design and software development, for example CPU caches, database caches, and browser caches. A cache's size is limited, so when the cache is full, which data should be evicted and which retained? That is decided by a cache eviction policy. Three common policies are FIFO (First In, First Out), LFU (Least Frequently Used), and LRU (Least Recently Used).

So how do we implement an LRU cache eviction policy with a linked list?

2. The underlying structure of linked lists

Arrays need a contiguous block of memory for storage. Linked lists, by contrast, do not require contiguous memory blocks: they string together a group of scattered memory blocks with "pointers".

2.1 Singly linked lists

A linked list uses pointers to string together a set of discrete memory blocks, which we call the "nodes" of the list. To chain all the nodes together, each node stores not only its data but also the address of the next node in the list; the pointer that records this address is called next. A singly linked list has two special nodes, the first and the last, which we habitually call the head node and the tail node. The head node records the base address of the list, from which the whole list can be traversed. The tail node is special in that its pointer points not to a next node but to an empty address, NULL, marking it as the last node in the list.
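The node structure just described can be sketched in Java as follows. This is a minimal illustration, not code from the text; the names SinglyList, Node, data, next, and length are all my own:

```java
// A minimal sketch of a singly linked list node. Each node stores a
// payload plus the address of the next node; null marks the tail.
class SinglyList {
    static class Node {
        int data;     // payload stored in this node
        Node next;    // pointer to the next node; null for the tail node
        Node(int data) { this.data = data; }
    }

    // Traverse from the head node, following next pointers until NULL.
    static int length(Node head) {
        int n = 0;
        for (Node p = head; p != null; p = p.next) n++;
        return n;
    }
}
```

Given the head node's address alone, the whole list 1 -> 2 -> 3 can be walked, which is exactly why the head is said to record the base address of the list.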

Like arrays, linked lists support lookup, insertion, and deletion. To keep its memory space contiguous, an array must move a large amount of data on insertion or deletion, so those operations take O(n) time. Inserting into or deleting from a linked list does not require moving nodes to maintain memory continuity, because a linked list's storage is inherently non-contiguous. For insertion and deletion we only need to change the pointers of the adjacent nodes, so these operations on a linked list take O(1) time.
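The pointer changes involved can be sketched as below. This is an illustrative snippet under the assumption that we already hold a pointer to the node at the insertion or deletion position; ListOps, insertAfter, and deleteAfter are made-up names:

```java
// Sketch of why linked-list insertion and deletion are O(1) once the
// position is known: only the pointers of adjacent nodes change.
class ListOps {
    static class Node {
        int data; Node next;
        Node(int d) { data = d; }
    }

    // Splice newNode in right after p: two pointer assignments, O(1).
    static void insertAfter(Node p, Node newNode) {
        newNode.next = p.next;
        p.next = newNode;
    }

    // Remove the node after p by routing p's pointer around it, O(1).
    static void deleteAfter(Node p) {
        if (p.next != null) p.next = p.next.next;
    }
}
```

No data is copied or shifted in either operation, in contrast to the O(n) shifting an array must perform.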

Every gain has a cost, however: random access, say fetching the k-th element, is less efficient on a linked list than on an array. Because the list's data is not stored contiguously, we cannot compute an element's memory address directly from the base address and an index, as the array addressing formula does. Instead we must follow the pointers from one node to the next until we reach the node we want, which takes O(n) time.

2.2 Circular linked lists

A circular linked list is a special kind of singly linked list. The only difference lies at the tail: the tail pointer of a singly linked list points to an empty address, marking the last node, whereas the tail node of a circular list points back to the head. Connected end to end like a ring, it is called a "circular" list.

Compared with a singly linked list, a circular list has the advantage that moving from the tail back to the head is convenient. When the data being processed has a ring structure, a circular list is especially suitable, as in the famous Josephus problem.
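As an illustration, here is a sketch of the Josephus problem on a circular singly linked list: n people stand in a ring, every k-th person is removed, and we want the survivor. This assumes 1-indexed people and counting that restarts at the next remaining person; the names Josephus and survivor are my own:

```java
// Josephus problem solved with a circular singly linked list.
class Josephus {
    static class Node {
        int id; Node next;
        Node(int id) { this.id = id; }
    }

    static int survivor(int n, int k) {
        // Build the ring 1 -> 2 -> ... -> n -> back to 1.
        Node head = new Node(1), tail = head;
        for (int i = 2; i <= n; i++) {
            tail.next = new Node(i);
            tail = tail.next;
        }
        tail.next = head;   // close the circle: tail points to head

        // Repeatedly advance k-1 steps and unlink the k-th node.
        Node prev = tail;   // the node just before the current position
        while (prev.next != prev) {
            for (int i = 1; i < k; i++) prev = prev.next;
            prev.next = prev.next.next;  // remove the k-th person
        }
        return prev.id;
    }
}
```

The end-to-end link is what makes this natural: elimination simply keeps walking around the ring with no special case at the "end" of the list.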

2.3 Bidirectional linked lists

A singly linked list goes in only one direction: each node has a single successor pointer, next, to the following node. A bidirectional (doubly linked) list, as the name implies, supports two directions: each node has not only a successor pointer next to the following node but also a predecessor pointer prev to the preceding node. Because each node needs two extra slots to store the successor and predecessor addresses, a bidirectional list uses more memory than a singly linked list to store the same amount of data. Although it trades away space, a bidirectional list supports traversal in both directions, which makes its operations more flexible.


What kind of problems are bidirectional lists suitable for solving?

A bidirectional list can find a node's predecessor in O(1) time, which in some cases makes insertion and deletion simpler and more efficient than on a singly linked list.

In real software development, removing a piece of data from a linked list usually falls into one of two cases:

A) Delete nodes whose “value is equal to a given value”;

B) Delete the node pointed to by a given pointer.

In the first case, whether the list is singly or doubly linked, we must start from the head node and compare nodes one by one until we find the node whose value equals the given value, and then delete it using the pointer operations described earlier.

Although the deletion itself takes O(1) time, the traversal to find the node dominates the running time at O(n). By the addition rule of time complexity analysis, the total time to delete a node whose value equals a given value is O(n).

In the second case, we have already found the node q to be deleted, but deleting it requires knowing its predecessor, and a singly linked list does not support directly obtaining the predecessor. To find it, we still have to traverse from the head node until p->next == q, which tells us that p is q's predecessor.

A bidirectional list, however, has the advantage here: each node already holds a pointer to its predecessor, so no traversal is needed. Thus, for the second case, deletion takes O(n) time on a singly linked list but only O(1) time on a bidirectional list.
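The O(1) deletion can be sketched as follows. This is an illustrative fragment, not a library implementation; DoublyList, addLast, and remove are names I chose for the sketch:

```java
// Sketch: a doubly linked node stores its predecessor, so deleting a
// node we already hold a pointer to requires no search at all.
class DoublyList {
    static class Node {
        int data;
        Node prev, next;   // predecessor and successor pointers
        Node(int d) { data = d; }
    }

    Node head, tail;

    Node addLast(int d) {
        Node n = new Node(d);
        if (tail == null) { head = tail = n; }
        else { tail.next = n; n.prev = tail; tail = n; }
        return n;
    }

    // Unlink q in O(1): q already knows both of its neighbors.
    void remove(Node q) {
        if (q.prev != null) q.prev.next = q.next; else head = q.next;
        if (q.next != null) q.next.prev = q.prev; else tail = q.prev;
    }
}
```

Note that remove never loops: both neighbor pointers are read straight off q, which is exactly what a singly linked list cannot do.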

Similarly, to insert a node in front of a specified node, a bidirectional list takes O(1) time, whereas a singly linked list needs O(n) time.

The Java LinkedHashMap container is implemented with a doubly linked list.
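In fact, LinkedHashMap's doubly linked list can be put to work directly as an LRU cache: constructed with accessOrder = true, it moves each accessed entry to the end of its internal list, and overriding removeEldestEntry evicts the least recently used entry when the cache grows past a chosen capacity. The class name LruCache and the capacity of the example are my own; the LinkedHashMap constructor and removeEldestEntry hook are part of the standard Java API:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// An LRU cache built on LinkedHashMap's access-ordered doubly linked list.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        super(16, 0.75f, true);   // accessOrder = true: reads reorder entries
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;  // evict the least recently used entry
    }
}
```

With a capacity of 2, putting "a" and "b", reading "a", then putting "c" evicts "b", since "b" is now the least recently used entry.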

Here lies a very important design idea: trading space for time. When memory is plentiful and we care more about execution speed, we can choose an algorithm or data structure with relatively high space complexity but low time complexity.


The caches discussed at the opening of this article are an example of the space-for-time design.

Programs that run too slowly can be sped up by consuming more memory (space for time), while programs that consume too much memory can reduce memory use by spending more time (time for space).


Circular and bidirectional lists can be combined into a new variant: the bidirectional circular list.


3. Performance comparison of linked lists and arrays

Array: insert/delete -> O(n), random access -> O(1)

Linked list: insert/delete -> O(1), random access -> O(n)


The comparison between arrays and linked lists is not limited to time complexity, and in real software development we cannot choose a data structure on complexity alone.


Arrays are simple to use. Because they occupy contiguous memory, they can exploit the CPU cache mechanism, which pre-reads neighboring array data, so access is more efficient. Linked list nodes are not stored contiguously in memory, so they are not cache-friendly and cannot benefit from pre-reading.

The drawback of arrays is their fixed size: once declared, an array occupies its whole contiguous block of memory. If the declared array is too large, the system may not have a large enough contiguous region to allocate, causing an "out of memory" error. If it is too small, it may not suffice, and we then have to request a larger block and copy the original array over, which is time-consuming. A linked list has no inherent size limit and naturally supports dynamic growth.


4. Implementing the LRU cache eviction algorithm with a linked list

The idea is to maintain a singly linked list ordered by access time, so that nodes nearer the tail were accessed longer ago. When a piece of data is accessed, we traverse the list in order from the head.

A) If the data is already cached in the list, we traverse to its node, delete it from its original position, and insert it at the head of the list.

B) If the data is not in the cache linked list, it can be divided into two cases:

B1) If the cache is not yet full, insert the new node directly at the head of the list;

B2) If the cache is full, delete the tail node of the list, then insert the new data node at the head.


Without further data structure optimization, cache access in this scheme takes O(n) time.
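The steps above can be sketched as follows. This is an illustrative implementation of the scheme just described, not an optimized one; LinkedListLru, access, and toArray are names I introduced, and toArray exists only so the list's order can be inspected:

```java
// LRU cache backed by a plain singly linked list: the head is the most
// recently used entry, the tail the least recently used. Every access
// scans from the head, so access costs O(n).
class LinkedListLru {
    static class Node {
        int data; Node next;
        Node(int d) { data = d; }
    }

    private final int capacity;
    private Node head;       // most recently used entry
    private int size;

    LinkedListLru(int capacity) { this.capacity = capacity; }

    void access(int data) {
        // Case A: already cached. Unlink it, then re-insert at the head.
        Node prev = null;
        for (Node p = head; p != null; prev = p, p = p.next) {
            if (p.data == data) {
                if (prev == null) return;   // already at the head
                prev.next = p.next;         // delete from original position
                p.next = head;
                head = p;                   // insert at the head
                return;
            }
        }
        // Case B2: cache full, so delete the tail node first.
        if (size == capacity) {
            if (head.next == null) head = null;
            else {
                Node q = head;
                while (q.next.next != null) q = q.next;
                q.next = null;              // drop the least recently used
            }
            size--;
        }
        // Cases B1/B2: insert the new data node at the head.
        Node n = new Node(data);
        n.next = head;
        head = n;
        size++;
    }

    int[] toArray() {   // inspection helper, head (MRU) first
        int[] out = new int[size];
        int i = 0;
        for (Node p = head; p != null; p = p.next) out[i++] = p.data;
        return out;
    }
}
```

Both the lookup in case A and the walk to the tail in case B2 are linear scans, which is where the O(n) access cost comes from.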


5. Summary

Linked lists, like arrays, are a very basic and very common data structure. However, linked lists are slightly more complex than arrays. Several list structures are derived from the common singly linked list, such as the bidirectional list, the circular list, and the bidirectional circular list.

Compared with arrays, linked lists are better suited to frequent insertion and deletion, while their lookup time complexity is higher. In actual software development, compare the performance of arrays and linked lists for the case at hand before choosing one.


Data Structures and Algorithms: Linked Lists