Android Skill Tree - Array, linked list, hash table basic summary

Preface:

Now android interview, for the data structure of the problem is more and more, more and more requirements, so I can only slowly fill up the data structure. (Huo ꈍ ꈍ huo)

Android Skill Book Series:

Android Basics

Android Skill Tree – Animation summary

Android Skill Tree – View summary

Android Skill Tree – Activity summary

Android Skill Tree – View event system summary

Android Skill Tree – Summary of Android storage paths and I/O operations

Android Skill Tree – Multi-process related summary

Android skill Tree – Drawable summary

Basic knowledge of data structures

Android Skill Tree – Array, linked list, hash table basic summary

Android Skill Tree — Basic Knowledge of The Tree

Basic knowledge of algorithms

Android Skill Tree – Sorting algorithm basics summary

This article mainly talks about array, linked list, hash table (hash table).

When we go to the movies, we know there will be a locker in front of the theater,

There will be consecutive numbers on it, one drawer after the other. Then you will put your things in the corresponding small drawer and go to watch the movie.

When we store data into memory, you ask the computer for storage space, the computer gives you a storage address, and you store the content in it. Kind of like the lockers up there.

The linear table

Linear list: a finite sequence of zero or more data elements.

Linear table sequential storage (array) :

If you have three bags of stuff, and you can only store one bag of stuff in one drawer, then you can use three cabinets in a row. Let’s say you use drawer 0102,03.

Sequential storage structure of a linear table: a sequence of contiguous storage cells is used to store the data elements of a linear table.

Others to then use the 04, drawer, and by this time your friend give you a bag of things, also said that help to save, but this time because of no. 4 drawers have been used by others, and you because all the things required in accordance with the order together, so this time you can only find continuous drawer together, such as the euro,09,10,11. What if number 12 is used and you need to store an extra bag?

Here we see the characteristics of arrays:

If we have four bags of goods, we already know that the first bag of goods is in the n-number drawer, then the other three must be in N+1, N+2 and N+3, so it is very convenient to inquire, because we only need to know the location of one, and the other locations are known. (So it’s very easy to query, because all the locations know exactly where they are)
If we put A at 01, B at 02, and C at 03, then we say insert A D between A and B, and then we need to move both B and C back. And the same is true for deleting one, so if we delete A, B and C move forward. (So insert and delete is troublesome, need to move all the data behind)
If you suddenly have one more item that needs to be stored, and it’s running out of space, move it all back to a new continuous storage location.

It’s like when we’re standing in line to buy a train ticket, and someone cuts in line, and you all have to step back one step; The person at the front buys a ticket and leaves, so all of you can move forward.

An array of	Time complexity
read	O(1)
Insert/delete	O(n)

Linear list chain storage (linked list) :

Singly linked lists:

I don’t know if you’ve ever seen a treasure hunt movie like Tomb Raider.

Their process is to arrive at A location, then arrive at the first destination A, then follow the clues to know where the next destination B is, then go to B, and so on A– B– C –….. And so, all the way to the final hiding place. Yes, our linked list is something like this, like we know there are four bags of items, but you can’t directly know where the last item is, you have to start from the first and go down.

For example, the first drawer is stored in No. 01 with the content of A. At the same time, we tell everyone that the next item is in no. 05 with the content of B, and the next one is in No. 08.

As can be seen from the above figure, a node consists of a data field storing data elements and a pointer field storing the addresses of subsequent nodes.

As we can see from the tomb Raider plot above, we can’t directly know where the last clue is, we can only look through it from beginning to end, so the linked list is slow to read. But it’s very convenient if we want to insert and delete.

Let’s say we want to insert a new node:

Let’s say we want to delete one of the nodes:

The list	Time complexity
read	O(n)
Insert/delete	O(1)

Circular linked lists:

Changing the pointer end of the end node in the single linked list to point to the head node will make the whole single linked list form a ring. This kind of single linked list is called single circular linked list, or circular linked list for short.

Bidirectional linked list:

Bidirectional linked list is to set a pointer field to its precursor node in each node of the single linked list.

Static linked lists:

The purpose of a static list is to make it possible for high-level languages without Pointers to implement lists using arrays.

I’ll just use a screenshot from the Internet to illustrate:

A static list is a sequential storage structure similar to an array. It is contiguous in physical addresses and requires pre-allocation of address space size. So the initial length of a static list is usually fixed. And then when I store it in there, I not only store the data field, but I also store the location of the next array index. This is equivalent to replacing the pointer field with the index value of the array.

Hash table:

We already know the advantages and disadvantages of arrays and linked lists.

operation	An array of	The list
read	Good at (random/sequential access)	Not good at (sequential access only)
Insert/delete	Not very good at	Good at

With that in mind, we can introduce hash tables with specific story requirements:

If you had a fruit store one day, you would have a book to write down the prices of various fruits, and because you know arrays are easy to read, we would have an array to write down the prices of various fruits, and we would write them in alphabetic order.

At this point, if someone asks Apple, you can check the price, but if there are A lot of fruits, or even A lot of fruits that start with A, for example, there are 20 fruits that start with A, then you can only know that the first 20 fruits that start with A are the first 20, but you have to check them one by one. It would be nice if we immediately knew the index of the Apple array, so we immediately knew where it was in the array, and we could read it right away.

Here’s an example:

So we’re going to store the price of an apple at index 2, and then we’re going to store the price of a banana at index 8, and so on, all the fruit, so that when the customer asks you the price of an apple, you immediately know to read it at index 2.

Turning an Apple into a 2 is done using a hash function.

Hash function:

To implement the above requirements, the hash function requires some basic requirements:

If you type in the same thing, you get the same value every time, so if you type in Apple every time, you get 2 every time, you can’t get 2 all at once, 5 all at once.
If you get the same result no matter what you type, this function is useless. If you type Apple and Banana, you get the same result.
The hash function needs to return a valid index. For example, if the length of our array above is only 40, the output of 100 when you type Pair is invalid.

From the above we know that when we input different values, the best result after the hash function conversion is that each value is different, in which case their index is different.

But if we only have an array of length 6 and we have 7 fruits, then two fruits must have the same index. This is what we call conflict.

There are many ways to handle collisions, but the simplest way is to store a linked list at that location if two keys map to the same location.

That way, we’re still fast on other fruits, but a little slower on fruits with index 0 because we need to find the fruit in the list with index 0.

Hash table operation	On average	The worst situation
To find the	O(1)	O(n)
insert	O(1)	O(n)
delete	O(1)	O(n)

We can see:

Hash lookups (getting the value at a given index) are as fast as arrays, and inserts and deletes are as fast as linked lists. So it has a little bit of both
But in the worst case, hash tables are slow in all sorts of operations (for example, if they are all grouped under a list with index 0, the query will behave like a list query).

So for the worst case scenario, we need to:

Lower fill factor: Hash tables use arrays to store data, so you need to count the number of places in the array that are occupied. (For example, if the array is 10 and we fill in three of them, the fill factor is 0.3; If the number filled exactly fills up the length, the filling factor is 1; If 20 are filled in, the filling factor is 2.) When the fill factor is too large, the array length is insufficient and we need to add positions to the hash table. It’s called adjusting length. (Adjust the length of the hash table once the fill factor is greater than 0.7, so you first create a new array that is longer, usually doubling the array.)
Good hash functions: Good hash functions tend to evenly distribute values in arrays, while bad hash functions tend to cluster values, causing a lot of collisions.

So if we want to know the price of a fruit, we can just type in the name of the fruit, and then we can use the hash function to return an index value to find the corresponding price in the array.

Conclusion:

Where error please help correct, thanks.

Reference:

Big Talk Data Structure

Diagram of algorithms