See the “Data Structures & Algorithms” series: 1. Introduction # Plan your Route

Data Structures and Algorithms: 5. Arrays

The list

Linked List is a common basic data structure. It is a linear List, but it does not store data in a linear order. Instead, it stores Pointers to the next node in each node.

Compared to arrays, linked lists are more difficult to master, and the corresponding forms will be more diverse. But whatever the form, the core is LInked List, a linear tabular data structure that links data to data.

May beginners or paste, continue to read, I believe that the concept of “data and data link” will have a more concrete understanding.

There are many kinds of linked list structures. This article introduces the three most common linked list structures: unidirectional, bidirectional and circular.

Singly linked list

To implement the simplest and most commonly used linked list, see LeetCode

Personal implementation with Golang code is as follows:

package main

import (
	"fmt"
)

// Node data
type LinkedListObject interface{}

// node structure
type SingleLinkNode struct {
	Data     LinkedListObject
	NextNode *SingleLinkNode	// Golang uses * to get the value of the memory address to which the pointer variable points
}

func main(a) {
	// New node 0
	node := new(SingleLinkNode)
	node.Data = 23

	// New node 1
	node1 := new(SingleLinkNode)
	node1.Data = 6
	node.NextNode = node1 // node1 is linked to node

	// New node 2
	node2 := new(SingleLinkNode)
	node2.Data = 15
	node1.NextNode = node2 // node2 is linked to node1

	// Loop print
	nowNode := node
	for {
		if nowNode == nil {
			break
		}
		fmt.Println(nowNode)
		nowNode = nowNode.NextNode
	}
}
Copy the code

Go Playground

In most cases, we will use the header (the first node) to represent the entire list.

It doesn’t matter if you don’t understand the code. The key is to understand the structure of linked list nodes. You can deepen your understanding by viewing the printed information of the above code.

Next, by comparing arrays, let’s see how the underlying storage structure of a one-way list differs.

Arrays require a contiguous chunk of memory; Linked lists, on the other hand, do not require contiguous memory, but use “Pointers” to connect discrete chunks of memory in series.

The linked list structure can overcome the disadvantage that arrays need to know the data size in advance. The linked list structure can make full use of computer memory space and realize flexible memory dynamic management.

In a one-way linked list diagram, you should notice that there are two nodes that are special, the first node and the last node.

The convention is to call the first node the head node, and the last node the tail node.

The header is used to record the base address of the linked list. And with that, we can go through the whole list. The special thing about endpoints is that instead of pointing to the next node, the pointer points to an empty address, NULL, indicating that this is the last node on the list.


Next, let’s examine “big O complexity”, an important metric for query, insert, and delete operations in linked lists.

Random access to the KTH element, because of non-contiguous memory, one-way linked list can only search, O(n).

Insert or delete? Since we do not care about memory continuity, we can allocate memory randomly or delete it as long as we change the pointer to handle, considering the magnitude, large O complexity = “O(1).

The basic complexity analysis method can be seen in the series of articles 3. Time & Space complexity (2)

The complexity analysis below is also skipped over

Circular linked list

The only difference is the end node. The end pointer points to the head node of the linked list.

Don’t underestimate this difference, if the data you need to deal with has the characteristics of a circular structure, it is suitable for the circular list.

Two-way linked list

Bidirectional linked list is a bit more complicated, linked list node added a pointer and node, or look at the picture is easier to understand.

A bidirectional list, as its name implies, supports two directions, with each node having not only a successor pointer next to the next node, but also a precursor pointer prev to the previous node.

Bidirectionally linked lists require two extra Spaces to store the addresses of successors and precursors. So, if you store the same amount of data, bidirectional lists take up more memory space than unidirectional lists. Although two Pointers are a waste of storage space, they support bidirectional traversal.

So what kind of problems are bidirectional lists better suited to solve than unidirectional lists?

Before, O(1) was inserted and deleted on the premise that the node position was known as the KTH, but in reality, we often do not know the position, and there will be the following two scenarios (in fact, the emphasis is on “search”) :

  • Delete a node whose value is equal to a given value
  • Deletes the node to which the given pointer points

The first one, whether it’s unidirectional or bidirectional, still requires traversal, which is order n.

Focus on the second one, assuming that the pointer to node B is known and the one-way linked list does not know the position of the leading node A (assuming that a is not the head node), the search still needs O(n) time complexity. The bidirectional list records the leading pointer, which is simplified to O(1) (~ ~ โ–ฝ ~) ~

The same is true for inserts. If the scene is known to be c, but you want to insert in front of it, the one-way list will still need to be traversed.

In addition to the above advantages, ordered list queries, two-way lists can compare the size to determine the forward/backward lookup, on average only need to find half of the data.

This reflects a “space for time” design idea, although the memory space is lost, but the code execution speed is improved.

The typical application of the design idea of “space for time” is cache technology and so on.

What if it’s a two-way plus cycle? If we connect the ends of a bidirectional list, it becomes a bidirectional circular list, as shown below:


Ok, with all the linked lists, do you understand the concept of “linking data to data”?

LRU cache elimination algorithm

You may wonder why you suddenly mention “LRU”, a popular interview question.

๐Ÿ™Œ This is a classic linked list application scenario, and has a wide range of applications. (Beginners do not directly skip, but try to think and understand, it is not difficult)

LRU is a common cache policy. It is usually mentioned together with two strategies: First In, First Out (FIFO) and Least Frequently Used (LFU).

LRU (Least Recently Used) — LRU (Least Recently Used)

For a specific scenario ๐ŸŒฐ, the cache of the existing user table, set the cache upper limit to 4 (smaller for easier understanding)

  1. How would you design a linked list if the business side accesses users 1, 2, 3, and 4 in sequence?
  2. Now that the business side has access to User 5, which one should be deleted? Where should I put User5?
  3. What if the business accesses User 4?

Try to think about it by yourself and draw ~ (*ยดโ–ฝ ‘) Blue Blue


If you look at the picture below, it shouldn’t be hard to understand, right? We implement LRU cache with single linked list

One might think, why is this different from the LRU THAT I remember before? (such as interview preparation by rote ๐Ÿ˜)

O(n) = O(n) = O(n)

Further optimizations can be made by introducing Hash tables to record the location of each data, thus reducing the time complexity of cache access to O(1).

Delete (User4); delete (User4);

๐Ÿ™Œ two-way linked list โœ…

The series will cover “hash tables” and “hash algorithms” in detail, and then go back to optimizing and analyzing LRU cache elimination algorithms

Let’s start with a picture

conclusion

Q: What are the advantages and disadvantages of linked lists compared to arrays?

๐Ÿ™Œ Compared with arrays, linked lists are more suitable for scenarios where insertion and deletion operations are frequent and the query time is more complex.

But in the actual software development, time complexity analysis cannot be limited.

Array is easy to use, in the implementation of the use of continuous memory space, you can use the CPU cache mechanism, preread the data in the array, so the access efficiency is higher. Linked lists are not stored consecutively in memory, so they are not CACHE-friendly and cannot be preread effectively.

The disadvantage of arrays is that they are fixed in size and occupy the entire contiguous memory space once declared. If the declared array is too large, it may run out of memory. If the declared array is too small, you may run out of space. Then you have to apply for a larger memory space and copy the original array into it, which is very time-consuming. The linked list itself has no size limit and naturally supports dynamic expansion.

In addition, if your code is very memory intensive, arrays are more suitable for you. Because each node in the linked list consumes extra storage space to store a pointer to the next node, memory consumption doubles. In addition, frequent insertion and deletion of linked lists will lead to frequent memory requests and releases, which can easily lead to memory fragmentation and Garbage Collection (GC) if the Java language is used.

Q: Can you give some examples of commonly used linked lists?

๐Ÿ™Œ Unidirectional, circular, and bidirectional lists (bidirectional circular lists)

Q: How to implement LRU cache elimination algorithm based on linked list?

๐Ÿ™Œ My idea is like this: to maintain an ordered single linked list, the nodes closer to the end of the list are accessed earlier. When a new data is accessed, we traverse the list sequentially, starting with the list header.

Hashmap + Double Linked List


This series of linked lists (part 2) will be specific to “combat” – linked list code implementation, the preview is as follows (refer to LeetCode) :