Preface
While learning data structures and algorithms, I felt the deeper I went into algorithms, the more interesting they became. That is the pattern of lifelong learning: the more we learn, the more we realize how much we don’t know, and the more we want to know.
The most common linear data structures are arrays, linked lists, and skip lists. In this installment, we’ll look at how they work and how to implement them.
Array
- Java, C++: int a[100]
- Python: list = []
- JavaScript: let x = [1, 2, 3]
In today’s high-level languages, there are no strict requirements on the types stored in an array; they can be quite diverse.
The standard language term for this is generics, meaning an array can be parameterized to hold elements of any type.
Principles of arrays
- The underlying implementation of arrays relies on the memory manager;
- When we request an array, the computer actually allocates a contiguous sequence of addresses in memory;
- Each address is accessible through the memory manager;
- Whether we access the first element or any other element, the time complexity is constant, O(1);
- Being able to access any element at will makes reads very fast, which is one of the defining characteristics of arrays.
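As a minimal illustration, a Python list behaves like a dynamic array: each index maps directly to a slot, so reading any element is a constant-time O(1) operation.

```python
# Each index maps directly to a memory slot, so any read is O(1).
a = ['A', 'B', 'C', 'D', 'E']
print(a[0])   # first element
print(a[4])   # last element, reached just as fast
```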
The defect of arrays
The key problem with arrays is when adding and removing elements.
Array insert operation
Suppose we define an array [A, B, C, E, F, G] and want to insert D at index 3. How do we do that?
- First, move E, F, and G each one position to the right;
- Then place D at index 3.
See the figure below for the detailed effect:
Because an insertion moves half of the elements on average (N/2), inserting an element into an array takes O(N) time on average.
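The shifting described above can be sketched in Python; `insert_at` is an illustrative helper, not a standard library function.

```python
def insert_at(arr, index, value):
    """Insert value at index by shifting later elements right: O(N)."""
    arr.append(None)            # grow the array by one slot
    for i in range(len(arr) - 1, index, -1):
        arr[i] = arr[i - 1]     # shift each element one position right
    arr[index] = value
    return arr

print(insert_at(['A', 'B', 'C', 'E', 'F', 'G'], 3, 'D'))
# ['A', 'B', 'C', 'D', 'E', 'F', 'G']
```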
Array delete operation
The same goes for deleting elements. Suppose we now have an array [A, B, C, Z, D, E, F] and need to remove Z. The logic is as follows:
- First, set the value at index 3 to null;
- Then move D, E, and F each one position to the left;
- Finally, in languages like Java, decrement the array’s length by one.
See the following figure for the specific implementation effect:
Because the deletion operation also requires moving an average of half of the elements (N/2), the average time of each deletion is O(N).
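A sketch of the deletion steps in Python; `delete_at` is an illustrative helper name.

```python
def delete_at(arr, index):
    """Remove the element at index by shifting later elements left: O(N)."""
    for i in range(index, len(arr) - 1):
        arr[i] = arr[i + 1]   # shift each element one position left
    arr.pop()                 # drop the now-duplicated last slot
    return arr

print(delete_at(['A', 'B', 'C', 'Z', 'D', 'E', 'F'], 3))
# ['A', 'B', 'C', 'D', 'E', 'F']
```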
Array time complexity
Operation type | Time complexity
---|---
Prepend (add at head) | O(1)
Append (add at tail) | O(1)
Lookup (access element) | O(1)
Insert | O(N)
Delete | O(N)
Linked List
Let’s look at another data structure: the linked list. Linked lists were created to address the shortcomings of arrays.
Linked list features:
- Each element has two member variables: a value and a next pointer (pointing to the next element);
- Strung together, the elements form a structure very similar to an array;
- Unlike arrays, each element is usually defined by a class, typically called Node;
- Singly linked list: each node has only a next pointer;
- Doubly linked list: each node also has a prev (or previous) pointer to the preceding element;
- The head pointer is called Head, and the tail pointer Tail;
- The tail node’s next pointer points to None (null);
- Circular linked list: the tail node’s next pointer points back to the head.
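As a minimal sketch, a singly linked list node can be defined in Python like this (the class and field names are conventional but illustrative):

```python
class Node:
    """A singly linked list node: a value plus a next pointer."""
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

# Build A -> B -> C by hand; the tail node's next stays None (null).
head = Node('A', Node('B', Node('C')))
print(head.next.value)  # B
```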
Add nodes to linked lists
Let’s look at the principle of adding a new element to a linked list:
- First, create a Node for the new element;
- We need to insert this new node between two existing elements;
- Point the previous element’s next pointer to the new Node;
- Point the new node’s next pointer to the following element.
See the following figure for the specific implementation effect:
Inserting into a linked list takes two pointer assignments, both constant-time, so insertion is O(1).
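The two pointer updates above can be sketched as follows; `insert_after` is an illustrative helper, and the Node class is repeated so the snippet is self-contained.

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def insert_after(prev_node, value):
    """Insert a new node right after prev_node: two pointer updates, O(1)."""
    new_node = Node(value, prev_node.next)  # new node points to the following node
    prev_node.next = new_node               # previous node points to the new node
    return new_node

# A -> C becomes A -> B -> C
head = Node('A', Node('C'))
insert_after(head, 'B')
print(head.next.value, head.next.next.value)  # B C
```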
Linked lists delete nodes
Now let’s look at how nodes are deleted. Deletion works much like insertion:
- Point the next pointer of the node before the one being deleted directly at the node after it, so the deleted node is unlinked.
The actual effect is shown below:
Deleting from a linked list takes only one pointer assignment, so the time complexity is also O(1).
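The single pointer update can be sketched like this; `delete_after` is an illustrative helper name.

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def delete_after(prev_node):
    """Unlink the node after prev_node: one pointer update, O(1)."""
    target = prev_node.next
    if target is not None:
        prev_node.next = target.next  # skip over the removed node
    return target

# A -> Z -> B becomes A -> B
head = Node('A', Node('Z', Node('B')))
delete_after(head)
print(head.next.value)  # B
```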
Time complexity of linked lists
By analyzing insertion and deletion in linked lists, we find there is no need to move half of the elements as with arrays; only pointers change, so those operations are an efficient O(1). Lookups are another story: accessing a value in the list requires walking it node by node, which is O(N).
Let’s take a look at the time complexity of the linked list:
Operation type | Time complexity
---|---
Prepend (add at head) | O(1)
Append (add at tail) | O(1)
Lookup (access element) | O(N)
Insert | O(1)
Delete | O(1)
After looking at the characteristics of arrays and linked lists, we can see that there is no perfect data structure; if there were, we would not need both. We have to look at the scenario to decide which data structure to use.
Skip List
Later, engineers optimized the linked list and produced a third data structure: the Skip List. Some of you may not be familiar with it, but it is used in software all around us; Redis, for example, uses skip lists. You won’t usually be asked to implement one from scratch in an interview, so understanding how it works is enough.
The core purpose of the skip list is to optimize the O(N) time complexity of random access in a linked list.
The central idea behind this optimization runs through algorithms and data structures, and arguably much of math and physics: adding a dimension, trading space for time. Literally, we add a second dimension on top of the existing list, called a first-level index.
Add a primary index
Let’s see what a primary index is:
- The first node of the index points to the head, i.e. the first element (1);
- The index’s next pointer skips one node (next + 1), so the next index node points to the third element (4);
- In other words, the first-level index moves twice as fast as the original list.
Suppose we now need to access node 7. How does adding this index speed up access? Let’s take a look at the picture below:
- First, walk along the first-level index to the index node above node 7;
- Then drop down from that index node to node 7 itself;
- In total it takes 4 steps along the index, plus the drop down, to reach node 7.
This is already fast, but can it be faster? Yes: we just add another dimension, again trading space for time.
Add a secondary index
The second-level index moves faster than the first-level index, advancing two index nodes at a time (next + 2), so accessing nodes is much faster still. First, let’s look at the structure after adding the second-level index:
- Likewise, the first node of the second-level index points to the first node of the first-level index, which ultimately points to the head;
- The next node of the second-level index points to node 7, because the second-level index advances by next + 2.
What happens when we access node 7 with the secondary index?
- With the second-level index, it takes only 1 step to reach the index node above node 7;
- So after adding the second-level index, accessing node 7 goes from 4 steps down to 1.
Clearly, each extra dimension speeds up access to the list. In other words, for a very long list we can add n levels of indexes, increasing the dimension level by level to improve access speed. In general, about log2(n) index levels are enough to reach the highest index dimension.
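The multi-level search idea can be sketched with plain Python lists standing in for linked nodes; `build_indexes` and `skip_search` are illustrative helpers, not a full skip list implementation.

```python
def build_indexes(values):
    """Build skip-list-style index levels: each level keeps every 2nd node."""
    levels = [values]
    while len(levels[-1]) > 2:
        levels.append(levels[-1][::2])
    return levels

def skip_search(levels, target):
    """Walk from the top index level down to the base list, counting steps."""
    steps, pos = 0, 0
    for depth, level in enumerate(reversed(levels)):
        if depth > 0:
            pos *= 2  # index i on one level sits at index 2i one level below
        while pos + 1 < len(level) and level[pos + 1] <= target:
            pos += 1
            steps += 1
    return pos, steps

levels = build_indexes(list(range(1, 9)))  # the list 1..8
pos, steps = skip_search(levels, 7)
print(pos, steps)  # reaches node 7 (index 6) in 2 steps instead of 6
```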
Time complexity analysis of skip list queries
- Each index level doubles the span, halving the number of nodes per level: n/2, n/4, n/8, and so on;
- The k-th index level has n/(2^k) nodes;
- Suppose there are h index levels and the highest level has 2 nodes;
- Then n/(2^h) = 2, which gives h = log2(n) - 1;
- So the query time complexity of a skip list is O(log n).
Space complexity analysis of skip lists
- The original list has n nodes;
- If we add one index node for every 2 nodes, the levels have n/2, n/4, n/8, ..., 8, 4, 2 nodes;
- Alternatively, with one index node for every 3 nodes, the levels have n/3, n/9, n/27, ..., 9, 3, 1 nodes;
- Either way, the total number of extra nodes is a geometric series bounded by n, so the space complexity is O(n).
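A quick numeric check of the geometric series, for an assumed list length of n = 1024:

```python
# Index nodes per level: n/2 + n/4 + ... + 1 is a geometric series
# that always stays below n, so the extra space is O(n).
n = 1024
extra = sum(n // (2 ** k) for k in range(1, 11))
print(extra)  # 1023, just under n
```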
Skip lists in practice
- In practice, the index levels of a skip list are not so neat and regular;
- This is because the structure changes as elements are added and removed;
- After many changes, some index spans end up a few steps longer or shorter than others;
- Maintenance cost is therefore relatively high: the indexes need updating on every insertion and deletion;
- The time complexity of those index updates during insertion and deletion is also O(log n).
Make sure to internalize the ideas of adding a dimension and trading space for time. We will use them repeatedly when solving interview questions later; trees, binary search trees, and other higher-dimensional data structures rely on them constantly.
Applications in engineering
In fact, linked lists have many applications in everyday engineering, but because they are such fundamental data structures, languages like Java, C++, JavaScript, and Go all provide encapsulated implementations that we can use directly.
Applications of linked lists
One of the most common applications of linked lists is the LRU Cache, which can be implemented with a doubly linked list. Interested readers can try implementing it themselves.
Applications of skip lists
Skip lists are used in Redis. If you want to know more, search for Redis’s skip list implementation and dig deeper.
Summary
- Data structures:
  - Array: fast random access O(1), but slow insertion and deletion O(n);
  - Linked list: fast insertion and deletion O(1), but slow random access O(n);
  - Skip list: improves the linked list’s random access to O(log n), but carries a high maintenance cost.
- Key idea:
  - Adding a dimension + trading space for time
- Applications:
  - Linked list: LRU Cache
  - Skip list: Redis
I am Three Diamonds, a member of Techgalaxy learning alongside you for life. Likes are power, follows are recognition, comments are love! See you next time 👋!
A PDF version of this series and other materials is available in the “Algorithms Info” section of Techgalaxy!
Further reading in this series
- “How to Learn Data Structures and Algorithms Efficiently”: Nowadays, if you want to become a senior development engineer or join a top company, whether in front end, back end, or AI, algorithms matter most, regardless of whether the position you end up in is algorithm engineer. So if you haven’t started learning algorithms yet, start now.
- “Analyzing Time and Space Complexity”: the core of algorithms; data structures and algorithms revolve around it. It exists to solve the performance problems we meet while programming, and it also gives us more advanced ways of thinking so we can write better programs.