All the complete code for this series of articles on data structures and Algorithms has been uploaded to Github. If you want the complete code, you can go there directly. If you can, please click Star
- Github.com/Lpyexplore/…
Arrays are a common and frequently used data structure. What are the advantages of arrays? As we all know, when we know the subscript value of an element in the array, we can obtain the corresponding element in the array through the subscript value, which is very fast to obtain the element.
However, arrays also have certain disadvantages. If we do not know the subscript value of an element, but only know that the element is in the array, then we can only search the array linearly, that is, start from the beginning, which is very inefficient. If an array with a length of 10000, We need the 10,000th element, so we have to iterate through the array 10,000 times, which obviously doesn’t make sense.
Therefore, in order to solve the shortcomings of the array, introduced the concept of hash table, hash table in many languages used at the bottom of the very much, this article we will explain the hash table. Because the hash table is a lot of basic knowledge, but the code is very simple, so please be patient to finish reading.
- Public account: Front impression
- Sometimes send book activities, remember to pay attention to ~
- [Interview questions], [front-end required e-book], [data structure and algorithm complete code], [front-end technology communication group]
What is a hash table
Hash tables can make up for some of the disadvantages of arrays, so we can make some changes on the basis of arrays to implement hash tables.
So let’s think about a hash table with a concrete example.
If the library has100,000 booksThe librarians put them randomly on the shelves, and that’s what the array looks likeAt this time the books on the shelf are out of order, suddenly a classmate said I want to findGetting Started with JavaScript
This book, then, they had to start from the first book on the shelf and work their way back until they found it.
It’s like trying to find an element in an array that doesn’t know its subscript, so you have to go through the entire array, and that’s not good.
So someone is going to say, why don’t we just give each book a subscript when we put it on the shelf? Wouldn’t it be convenient to find a book directly by subscript?
When you go to the library to borrow a book, you usually use the computer to look up the number of a book and then go to the designated shelf to look for the book. So when you look up the computer, does the computer have to go through the entire library and find the book you want? This was obviously impossible, so we wanted to develop a data structure that would let the computer know the number of a book when given the name of the book, and that introduced the concept of hash tables
For our convenience let’s first decide on an acceptable array length, say 10Then consider the title of each book as a string, for example"Getting started with Zero JavaScript"
When the string is 15 in length, we get the Unicode encoding for each character separately and use the famousQin Jiushao algorithm
To combine each Unicode encoding into a large number.
Here to introduce the Qin Jiushao algorithm. Qin Jiushao algorithm is a polynomial simplification algorithm proposed by Qin Jiushao, a mathematician in the Southern Song Dynasty of China, which is called Horner algorithm in the West.
This is a sum of unary n-degree polynomials, and we can see that in this formula, there are n additions and (n+1) times n over 2 multiplicationsThe Qin Jiushao algorithm is to extract the common factor of this formula, and finally the formula into this And when we look at the final result, we can see that this simplifies to n times of multiplication and n times of addition.
This simplified process also provides great convenience for computer to deal with polynomial evaluation problem and greatly improves the efficiency of evaluation. This is why we use The Qin Jiushao algorithm to combine the Unicode encoding of strings into one large number.
By looking at the Unicode encoding for every string in JavaScript, They are 38646, 22522, 30784, 20837, 38376, 74, 97, 118, 97, 83, 99, 114, 105, 112, 116 respectively
Then we add the Unicode code to qinjiusao’s algorithm and get 3.539723283210537e+26, which is a very, very, very big number. Why do we get that big number? Because didn’t we just set the length of the array to be 10? So we take the remainder of this number divided by 10, and we use that remainder as the subscript of that element in the array.
Why do you do that? Since the length of the array is 10, the subscript value of the array ranges from 0 to 9. If a number is mod by 10, does the remainder range from 0 to 9? Doesn’t that correspond to the subscript of the array?
At this point, I will explain again, why through qin Jiushao algorithm to get such a large number, and then take the remainder to get the subscript value? This is not redundant, why not directly add the Unicode encoding of each character and then directly mod, so that you do not need to do multiplication, all add, calculation efficiency is higher. So let’s look at another example to see why we need to get a large number first:
First, the first string is BCD. Each character has its Unicode encoding of 98, 99, and 100, which add up to 297,297%10 = 7, so the subscript value of the string BCD is 7. The second string is ace. The Unicode encoding for each character is 97, 99, and 101, which add up to 297, so ace is also 7.
At this point, the subscript values of the two strings are repeated, a phenomenon known in hash tables as conflict, which I’ll discuss in more detail below. Then again, it is because the addition of a simple take over will be very easy to appear the phenomenon of conflict, so we just choose chiu-shao algorithm is used to give each string rules under the corresponding values, this can largely ensures each element values do not repeat, of course also can have a chance to repeat, this is inevitable.
The above process of converting a string to an array length subscript is called hashing, and the code for hashing is usually wrapped in a function called a hash function
Methods according to the above, the string for the title of each book hash, can give each book under a coordinate values and stored in the array, then when you want to consult a book of coding, you enter the title of the book, after the computer to get, need only through the processing of hash function, can obtain the values of the book, And then instantly get the number of the book.
I believe you should have a certain understanding of the hash table, if you don’t understand, we will slowly explain.
Second, the advantages and disadvantages of hash table
In the description of the hash table just now, we must be able to see the advantages and disadvantages of the hash table, here I give you a summary of it ~
(1) Advantages
First, the advantages of hash tables:
- No matter how much data there is, it’s incredibly fast to process
- Able to proceed quickly
Insert modified element
、Remove elements
、Look for the element
Operations such as - Simple code (actually just need to write hash function, the code is very simple)
(2) Disadvantages
Then I’ll talk about the disadvantages of hash tables:
- The data in a hash table is not ordered
- Data cannot be duplicated
Three, conflict
I mentioned conflict, which means that after the hash, several elements have the same subscript value, and this is called conflict. So when the subscript values of two elements conflict, does the latter element replace the former element? Of course not!
So how to solve the phenomenon of conflict? Generally, there are two methods, namely, the zipper method (chain address method) and open address method
(1) Zipper method
This is a very common method of conflict resolution. We still take the above example, 10 books by hash to the length of the array of 10 stocks after the hard to avoid has a few books under the values are the same, then we can be both the values of the same element into a single array, and then the array stored in their original position of the array subscript, as shown in figureAssuming thatBook 1
和 The book 2
The hash subscript value is 3, so we can put an array in the place of the original array subscript 3 to hold both books. Therefore, no matter which book is queried, the computer gets the subscript value of 3, and then iterates through the array at that location to get the book it wants.
This is the first way to resolve conflicts, but it’s still a matter of deciding whether the array size is appropriate, as I’ll explain later.
(2) Open address method
This method is simply to find a blank place to insert data when the underlying values of an element conflict. Assume that the positions with the current subscript value of 1 and 3 have already been insertedBook 3
和 The book 5
When willThe book 6
I hash it, and I find that its subscript value is also 1Book 3
There’s a conflict, so at this pointThe book 6
You can find the next empty uninserted element to insert, as shown in the figureIn fact, when conflict occurs, there are three ways to find the empty space, respectivelyLinear detection 、Second detection 、Then the hash method
1. Linear detection
As the name implies, linear detection means that when two elements conflict, the current index +1 is used to see if the position is empty, if so, data is inserted, otherwise +1 is used, and so on… Until the data location is inserted.
However, this method has a disadvantage, that is, when the array has a long sequence of inserted data, linear detection is not so efficient, because each detection step is 1, so the inserted data of each segment must be detected once, this phenomenon is calledgather. The figure below is a typical clustering phenomenon Eight books
The subscript of is 1, andBook 3
Collision, and then we do linear detection, and then we go throughBooks 6, 5, 1, 7
I didn’t find any blank space to insert, and I didn’t find any blank space to insert until the end, which is pretty bad, so we can do thatSecond detectionTo relieve thegatherThis phenomenon.
2. Secondary detection
On the basis of linear detection, the step size of each detection is changed to the current subscript index + 1², index + 2², index + 3²…… Until you find a blank place to insert an element
Let’s take an example to understand secondary detection
Suppose it is now depositedBook 3
、The book 5
、Book 7
As shown in figureAnd then at this point you have to deposit oneThe book 6
, the subscript value obtained by hashing is 2, andThe book 5
Conflict, so move it back from index 21 squared
Location, but there is already data stored in that location, as shown in the following GIFSo we’re moving backwards from index 22 squared
At this time, it is found that the moved position also has data, so data still cannot be inserted, as shown in the following GIFTherefore, we continue to move backwards from index 23 squared
At this time, it is found that there is a vacant position in the position after the move, so data is directly inserted here, and a secondary detection process is completed, as shown in the following DYNAMIC diagramAnd we can see,Second detectionTo a certain extentLinear detectionCaused by thegatherProblem, but it creates a kind of aggregation in a different way, like1 squared
、2 squared
、3 squared
…N squared
The aggregation of. So it’s still a little bit bad.
3. Rehash
The rehash method is to hash the value we passed in again, get a new step step, then probe according to this step, find the first empty position to insert data. This solves the clustering problem to a large extent.
Since it is necessary to hash again to obtain a probe step number, the processing process of this hash must be different from the processing process of the first hash, so as to confirm an appropriate search step size and improve the search efficiency.
Here, we don’t have to worry about writing a different hash function, but let’s look at a generally accepted hash function: step = constant – (key % constant)
Where constant is a self-definable prime constant that is smaller than the size of the array. Key is the value of the first hash.
And then we’re going to use the step size of this function to search for empty places to insert, so I’m not going to do too much demonstration here.
Expansion and reduction of hash table
Before we look at the expansion of the hash table, let’s look at a concept called the fill factor, which represents the ratio of the number of data in the hash table to the length of the hash table. It determines the amount of time required to access the data in the hash table.
When we use the first method of solution to the conflict, zipper, minimum fill factor is 0, the maximum is infinite, that is because the method is through the insert in a certain position in the array used to store an array of conflicting elements, therefore, whenever possible, the length of the hash table is small, then the data is stored in the built-in array, So the filling factor can be infinite.
That when we use the second kind of solution to the conflict – open address method, minimum fill factor is 0, the biggest only to 1, this is because the opening address the implementation of the principle is to find a hash table hollow position insert element, so that no amount of data in the hash table is greater than the length of the hash table, to fill factor is 1.
Compare a hash table to a classroom, if the classroom is full of people, and then ask you to find your friends, that dense, is not particularly hard to find, may be dazzling; But if the room is only two-thirds full or half full, the crowd doesn’t look so dense, and it might be easier to find your friends.
Because of this situation, we can expand the length of the hash table according to the size of the fill factor at appropriate time, so as to reduce the size of the fill factor and reduce the time consumed by our data access.
Here, we take the padding factor equal to 0.75 as the critical point for expansion of the hash table, and the expansion will be realized when the hash table is encapsulated later.
Of course, if the padding factor is too small, it is not appropriate, so we will add the capacity reduction function where appropriate, that is, the padding factor is 0.25 as the critical point of reducing the capacity of the hash table.
Five, hash table method
As always, before we encapsulate a hash table, let’s take a look at some of the common methods for hash tables
methods | meaning |
---|---|
put() | Inserts or modifies data into a hash table |
get() | Gets some data in the hash table |
del() | Delete data from a hash table |
isEmpty() | Determines whether the hash table is empty |
size() | Returns the number of elements in the hash table |
resize() | Change the hash table capacity and put the data in the right place |
isPrime() | Determine whether a number is prime or not |
toPrime() | Gets the nearest prime number to a number |
Implement hash table with code
Premise:
- In this paper, the chain address method is used to solve the conflict problem
- Where constants are involved, primes are used, such as hashing table capacity, constants of Horner algorithm, etc. Because in number theory, using prime numbers makes it possible to distribute data evenly in a hash table
Create a constructor
Start by creating a large constructor to hold some of the attributes and methods of the hash table.
function HashTable() {
/ / property
// Used to store data
this.storage = []
// Count the number of data in the hash table
this.count = 0
// Set the initial length of the hash table
this.length = 7
}
Copy the code
Because we are through the array to achieve the hash table, so set the property storage to store data; Then the attribute count is defined to count the number of data in the hash table, which can be used to calculate the fill factor later. Finally, the attribute length is defined to set the initial length of the hash table to prime number 7
(2) Encapsulate the hash function
At the beginning of the article, I used Horner algorithm to explain the process of hashing, so when we encapsulate the hash function, we also use the final simplification structure of Horner algorithm to achieve
I’m going to put up a simplification of Horner’s algorithm just for you to see
P(n) = a0a_0a0 + x( a1a_1a1+x( a2a_2a2 +… + x(ana_nan − 1 + x ana_nan)
Let’s look at the code
function HashTable() {
/ / property
// Used to store data
this.storage = []
// Count the number of data in the hash table
this.count = 0
// Set the initial length of the hash table
this.length = 7
// Encapsulate the hash function
HashTable.prototype.hashFunc = function (str, size) {
let hashCode = 0
// Take a large number
for (let i = 0; i < str.length; i ++) {
hashCode = 37 * hashCode + str.charCodeAt(i)
}
/ / to take over
return hashCode % size
}
}
Copy the code
I’m just going to pick a prime number 37 for horner’s algorithm, and you can pick any prime number that’s a little bit bigger
The hash function takes two arguments. The first argument is STR, the key of the data we pass in later. The second argument is size, which is the length of the hash table, so we can call this. Length directly
(3) Implement the PUT () method (without capacity expansion function)
The put() method simply inserts or modifies data into a hash table.
The method takes two arguments, the first of which is key; The second parameter is value. That’s the equivalent of passing in a key-value pair
Implementation idea:
- With the hash function, I’m going to
key
Hash, get an indexindex
- Judge the hash table
storage
Index of an arrayindex
If no, create an empty array at that locationarr
And pack the key-value pairs we passed in into a new arrayarr
中 - If there is data, it iterates through each element of the array on the index, comparing the values of each element
key
Whether with our incomingkey
Equal, if there is an equal value in the query, use the value we passed invalue
Replace the element in the queryvalue
And that’s what happenedModify the dataThe function of the - If no equal value is found, we simply package our key-value pairs into a new array and store them in the hash table
index
Index in the array, at this pointthis.count + 1
In order to facilitate your understanding, I use a GIF to demonstrate the implementation process of this method
The first is the insert data operationNext comes the modify data operation
Let’s look at the code
function HashTable() {
/ / property
// Used to store data
this.storage = []
// Count the number of data in the hash table
this.count = 0
// Set the initial length of the hash table
this.length = 7
// Encapsulate the hash function
HashTable.prototype.hashFunc = function (str, size) {
let hashCode = 0
// Take a large number
for (let i = 0; i < str.length; i ++) {
hashCode = 37 * hashCode + str.charCodeAt(i)
}
/ / to take over
return hashCode % size
}
// Insert or modify data
HashTable.prototype.put = function (key, value) {
// 1. Hash to get the subscript value
let index = this.hashFunc(key, this.length)
let current = this.storage[index]
// 2. Check whether the subscript value has data
// 2.1 No data
if(! current) {this.storage[index] = [[key, value]]
this.count ++
return;
}
// 2.2 Data available
// 3. Iterate through the array on the corresponding index
for (let i = 0; i < current.length; i ++) {
// 3.1 The same data already exists
if(current[i][0] === key) {
current[i][1] = value
return; }}If the same data does not exist, add data directly
current.push([key, value])
this.count ++
}
}
Copy the code
Let’s use that method
let ht = new HashTable()
// Execute the put() method six times
ht.put('abc'.'123')
ht.put('hgf'.'124')
ht.put('wds'.'125')
ht.put('wer'.'126')
ht.put('kgl'.'127')
ht.put('kmg'.'128')
// Check the number of data in the hash table
console.log(ht.count) / / 6
// View the internal structure of the hash table through the storage property
console.log(ht.storage)
/ * storage print results [[[' WDS ', '125'], [' KGL ', '127'], [' KMG ', '128']], [[' wer ', '126']], < 1 empty item >, [ [ 'hgf', '124' ] ], [ [ 'abc', '123' ] ] ] */
Copy the code
The inside of my hash table looks like this
(4) Implement get() method
The get() method is used to query data in a hash table. This method takes a single parameter, the key used for the query
Implementation idea:
- With the hash function, I’m going to
key
Hash, get an indexindex
- Judge the hash table
storage
Index of an arrayindex
If no, returnfalse
- If there is data, it iterates through each element of the array on the index, comparing the values of each element
key
Whether with our incomingkey
Equal. If any value is found equal, the value is returnedvalue
- If no data is available, the system returns
false
The idea and the code are relatively simple, let’s look at the code directly
function HashTable() {
/ / property
// Used to store data
this.storage = []
// Count the number of data in the hash table
this.count = 0
// Set the initial length of the hash table
this.length = 7
// Encapsulate the hash function
HashTable.prototype.hashFunc = function (str, size) {
let hashCode = 0
// Take a large number
for (let i = 0; i < str.length; i ++) {
hashCode = 37 * hashCode + str.charCodeAt(i)
}
/ / to take over
return hashCode % size
}
// Get data
HashTable.prototype.get = function (key) {
// 1. Obtain the corresponding subscript value
let index = this.hashFunc(key, this.length)
let current = this.storage[index]
// 2. Check whether the subscript value has data
// 2.1 If no data exists at this subscript value, the search fails
if(! current) {return false
}
// 2.2 There is data at the subscript value
// 3. Perform traversal search
for(let i in current) {
// 3.1 Find corresponding data and return value
if(current[i][0] === key) {
return current[i][1]}}If no data is found, return false
return false}}Copy the code
Let’s use the following method
let ht = new HashTable()
// Execute the put() method six times
ht.put('abc'.'123')
ht.put('hgf'.'124')
ht.put('wds'.'125')
ht.put('wer'.'126')
ht.put('kgl'.'127')
ht.put('kmg'.'128')
// Execute the get() method to get the value with key 'HGF'
console.log(ht.get('hgf')) / / 124
Copy the code
(5) Implement del() method (without capacity reduction function)
The del() method deletes some data from the hash table. The method takes a parameter key
Implementation idea:
- With the hash function, I’m going to
key
Hash, get an indexindex
- Judge the hash table
storage
Index of an arrayindex
If no, returnfalse
“, indicating that the deletion fails - If there is data, it iterates through each element of the array on the index, comparing the values of each element
key
Whether with our incomingkey
Equal. If an equal value is found, the value is deletedthis.count --
And returns the value of the deleted elementvalue
value - Returns if no equal value is found
false
“, indicating that the deletion fails
Let’s look at the code
function HashTable() {
/ / property
// Used to store data
this.storage = []
// Count the number of data in the hash table
this.count = 0
// Set the initial length of the hash table
this.length = 7
// Encapsulate the hash function
HashTable.prototype.hashFunc = function (str, size) {
let hashCode = 0
// Take a large number
for (let i = 0; i < str.length; i ++) {
hashCode = 37 * hashCode + str.charCodeAt(i)
}
/ / to take over
return hashCode % size
}
// Delete data
HashTable.prototype.del = function (key) {
// 1. Obtain the corresponding subscript value
let index = this.hashFunc(key, this.length)
let current = this.storage[index]
// 2. Check whether there is data in the index position
// 2.1 There is no data at this subscript value, return false, delete failed
if(! current) {return false
}
// 2.2 There is data at the subscript value
// 3. Search for the corresponding data
for (let i in current) {
let inner = current[i]
// 3.1 Delete the corresponding data
if(inner[0] === key) {
current.splice(i, 1)
this.count --
return inner[1]}}// 3.2 If no corresponding data is found, the deletion fails and false is returned
return false}}Copy the code
Let’s use that method
let ht = new HashTable()
// Execute the put() method six times
ht.put('abc'.'123')
ht.put('hgf'.'124')
ht.put('wds'.'125')
ht.put('wer'.'126')
ht.put('kgl'.'127')
ht.put('kmg'.'128')
// Delete the element with key 'HGF'
ht.del('hgf') // Delete successfully, return '124'
// Delete the element whose key is' PPP '
ht.del('ppp') // Delete failed, return false
// check the hash table internal structure
console.log(ht.storage)
/ * storage print results [[[' WDS ', '125'], [' KGL ', '127'], [' KMG ', '128']], [[' wer ', '126']], < 1 empty item >, []. [ [ 'abc', '123' ] ] ] */
Copy the code
So the hash table looks like this
(6) Implement isEmpty()
The isEmpty() method is used to determine whether the hash table isEmpty. This method does not need to pass parameters
The idea of this method is relatively simple, directly determine whether the attribute count is 0
Let’s look at the code
function HashTable() {
/ / property
// Used to store data
this.storage = []
// Count the number of data in the hash table
this.count = 0
// Set the initial length of the hash table
this.length = 7
// Encapsulate the hash function
HashTable.prototype.hashFunc = function (str, size) {
let hashCode = 0
// Take a large number
for (let i = 0; i < str.length; i ++) {
hashCode = 37 * hashCode + str.charCodeAt(i)
}
/ / to take over
return hashCode % size
}
// Determine whether the hash table is empty
HashTable.prototype.isEmpty = function () {
return this.count === 0}}Copy the code
So let’s use that method
let ht = new HashTable()
console.log(ht.isEmpty()) // false, hash table is empty
ht.put('abc'.'123')
console.log(ht.isEmpty()) // true, hash table is not empty
Copy the code
(7) Implement the size() method
The size() method is used to return the number of values in the hash table. This method also does not need to pass parameters
This method is also very simple, directly return the attribute count
Let’s look at the code
function HashTable() {
/ / property
// Used to store data
this.storage = []
// Count the number of data in the hash table
this.count = 0
// Set the initial length of the hash table
this.length = 7
// Encapsulate the hash function
HashTable.prototype.hashFunc = function (str, size) {
let hashCode = 0
// Take a large number
for (let i = 0; i < str.length; i ++) {
hashCode = 37 * hashCode + str.charCodeAt(i)
}
/ / to take over
return hashCode % size
}
// Return the number of elements in the hash table
HashTable.prototype.size = function () {
return this.count
}
}
Copy the code
So let’s use that method
let ht = new HashTable()
console.log(ht.size()) // 0, no data in hash table
ht.put('abc'.'123')
console.log(ht.size()) // 1, hash table contains one data
Copy the code
(8) Implement the resize() method
The resize() method is used to change the size of the hash table. If the padding factor is too large, we expand it. When the fill factor is small, we increase its capacity. The method takes a parameter, newLength, that represents the capacity of the new hash table.
Implementation idea:
- Put the original property
storage
Assign to a new variableoldStorage
And then we create a new empty array and assign it tostorage
And will parameternewLength
Assign a value to a propertylength
- traverse
oldStorage
According to the new size of the hash tablenewLength
Rehash all data in the original hash table and insert data
This method is hard to demonstrate in giFs, so just to make sense of it, let’s just do it in code
function HashTable() {
/ / property
// Used to store data
this.storage = []
// Count the number of data in the hash table
this.count = 0
// Set the initial length of the hash table
this.length = 7
// Encapsulate the hash function
HashTable.prototype.hashFunc = function (str, size) {
let hashCode = 0
// Take a large number
for (let i = 0; i < str.length; i ++) {
hashCode = 37 * hashCode + str.charCodeAt(i)
}
/ / to take over
return hashCode % size
}
// Change the hash table capacity
HashTable.prototype.resize = function(newLength) {
// 1. Assign the old hash table to the new variable
let oldStorage = this.storage
// 2. Create a new empty array as a new hash table container
this.storage = []
// 3. Modify the hash table capacity
this.length = newLength
// 4. Run through the old hash table
for(let i = 0; i < oldStorage.length; i++) {
let box = oldStorage[i]
// 4.1 There is no data in an index position
if(box === null) {
continue;
}
// 4.2 There is data on an index
for(let j = 0; j < box.length; j++) {
let inner_box = box[j]
4.2.1 Insert data into a new hash table
this.put(inner_box[0], inner_box[1])}}}}Copy the code
This method won’t be demonstrated too much here, but will be used later in the rewrite of the put() and del() methods
(9) Implement isPrime() method
The isPrime() method is used to determine whether a number isPrime, so you only need to accept a single number as an argument.
Since we want to achieve automatic expansion and reduction of the hash table, each time the capacity changes, we need to determine whether the new capacity is a prime number, in order to ensure that the data in the hash table is evenly distributed, so we still need to encapsulate this method.
But before we do that, let’s just remind ourselves that prime numbers are only divisible by 1 and itself, so let’s look at the number 16, which is obviously not a prime number, so let’s see what numbers are divisible by
On the left | right | Is equal to the |
---|---|---|
1 | 16 | 16 |
2 | 8 | 16 |
4 | 4 | 16 |
8 | 2 | 16 |
16 | 1 | 16 |
It’s pretty obvious that as long as a number is divisible, it must be greater than or equal to the arithmetic square root of that number; The other number must be less than or equal to the arithmetic square root of that number
Therefore, when we determine whether a number is prime, we only need to judge whether the number can be divisible one by one starting from 2 until the arithmetic square root of the number
So why don’t we take a look at how the code works
function HashTable() {
/ / property
// Used to store data
this.storage = []
// Count the number of data in the hash table
this.count = 0
// Set the initial length of the hash table
this.length = 7
// Encapsulate the hash function
HashTable.prototype.hashFunc = function (str, size) {
let hashCode = 0
// Take a large number
for (let i = 0; i < str.length; i ++) {
hashCode = 37 * hashCode + str.charCodeAt(i)
}
/ / to take over
return hashCode % size
}
// Check whether it is prime
HashTable.prototype.isPrime = function(number) {
// 1. Take the arithmetic square root and round it
let sqrt = Math.floor(Math.sqrt(number))
// 2. Iterate from 2 to square root
for(let i = 2; i <= sqrt; i++) {
// if 2.1 is divisible, return false
if(number % i === 0) return false;
}
// 2.2 cannot be divisible, returns true
return true}}Copy the code
Let’s briefly verify this method
let ht = new HashTable()
console.log(ht.isPrime(3)); // true
console.log(ht.isPrime(4)); // false
console.log(ht.isPrime(16)); // false
console.log(ht.isPrime(11)); // true
Copy the code
(10) Implement toPrime() method
The toPrime() method takes the nearest prime number and returns it, so you only need to take a single number as an argument.
When we realize capacity expansion or reduction, we will simply multiply or divide by 2 at the beginning, which is difficult to ensure that the obtained number is prime. Therefore, we need to encapsulate such a method to transform the transformed value into a prime number before using it.
The idea is very simple, just keep +1 to find prime numbers.
Let’s look at the code
function HashTable() {
/ / property
// Used to store data
this.storage = []
// Count the number of data in the hash table
this.count = 0
// Set the initial length of the hash table
this.length = 7
// Encapsulate the hash function
HashTable.prototype.hashFunc = function (str, size) {
let hashCode = 0
// Take a large number
for (let i = 0; i < str.length; i ++) {
hashCode = 37 * hashCode + str.charCodeAt(i)
}
/ / to take over
return hashCode % size
}
// Retrieves the nearest geological number and returns it
HashTable.prototype.toPrime = function(number) {
while(!this.isPrime(number)) {
number ++
}
return number
}
}
Copy the code
Let’s briefly verify this method
let ht = new HashTable()
console.log(ht.toPrime(3)) / / 4
console.log(ht.toPrime(32)); / / 37
Copy the code
(11) Add capacity expansion to the put() method
When do I need to expand my hash table? Therefore, we will add expansion judgment in the put() method. As mentioned above, the padding factor equal to 0.75 is the critical point for expansion, i.e. when the padding factor is greater than 0.75, we will expand the hash table
Without further ado, we’ll rewrite this directly in the put() method encapsulated above
Implementation idea:
- in
this.count ++
Then, judge the size of the filling factor, i.ethis.count / this.length
If more than0.75
If less than0.75
, do not do any processing - If more than
0.75
, first get a number twice the capacity of the original hash tablenumber
Call again,this.toPrime
Method to obtain a separationnumber
The nearest prime numberprime
- The last call
this.resize
Method, and willprime
Passed in as a parameter to complete capacity expansion
Let’s just look at the code
function HashTable() {
/ / property
// Used to store data
this.storage = []
// Count the number of data in the hash table
this.count = 0
// Set the initial length of the hash table
this.length = 7
// Encapsulate the hash function
HashTable.prototype.hashFunc = function (str, size) {
let hashCode = 0
// Take a large number
for (let i = 0; i < str.length; i ++) {
hashCode = 37 * hashCode + str.charCodeAt(i)
}
/ / to take over
return hashCode % size
}
// Insert or modify data
HashTable.prototype.put = function (key, value) {
// 1. Hash to get the subscript value
let index = this.hashFunc(key, this.length)
let current = this.storage[index]
// 2. Check whether the subscript value has data
// 2.1 No data
if(! current) {this.storage[index] = [[key, value]]
this.count ++
return;
}
// 2.2 Data available
// 3. Iterate through the array on the corresponding index
for (let i = 0; i < current.length; i ++) {
// 3.1 The same data already exists
if(current[i][0] === key) {
current[i][1] = value
return; }}If the same data does not exist, add data directly
current.push([key, value])
this.count ++
// 4. Determine whether to expand the capacity
if(this.count / this.length >= 0.75) {
// 4.1 Double the hash table capacity
let newLength = this.length * 2
// 4.2 Obtaining the prime number capacity
newLength = this.toPrime(newLength)
/ / 4.3 expansion
this.resize(newLength)
}
}
}
Copy the code
(12) Add capacity reduction function to del() method
Similarly, in our del() method, data reduction is involved, that is, this.count –, then we need to consider whether the capacity of the hash table needs to be reduced, that is, whether the fill factor is less than 0.25. If it is less than and the capacity of the hash table is greater than 7, the capacity of the hash table should be reduced. Otherwise, no processing is done
Here just to defend why hash table capacity is more than 7, because the capacity reduction, our capacity to be divided by 2, but the capacity of the hash table is not convenient too small is too small, so I set myself a capacity is the lower limit of 7, which means that when a hash table capacity is less than or equal to 7, even if the fill factor is less than 0.25, also need not to reduce capacity
Implementation idea:
- in
this.count --
Then, judge the size of the filling factor, i.ethis.count / this.length
Is less than0.25
, if more than0.25
, do not do any processing - If less than
0.25
And the hash table capacity is greater than7
, then get a number with half of the original hash table capacitynumber
Call again,this.toPrime
Method to obtain a separationnumber
The nearest prime numberprime
- The last call
this.resize
Method, and willprime
Passed in as a parameter to complete capacity reduction
Let’s go straight to the code
function HashTable() {
/ / property
// Used to store data
this.storage = []
// Count the number of data in the hash table
this.count = 0
// Set the initial length of the hash table
this.length = 7
// Encapsulate the hash function
HashTable.prototype.hashFunc = function (str, size) {
let hashCode = 0
// Take a large number
for (let i = 0; i < str.length; i ++) {
hashCode = 37 * hashCode + str.charCodeAt(i)
}
/ / to take over
return hashCode % size
}
// Delete data
HashTable.prototype.del = function (key) {
// 1. Obtain the corresponding subscript value
let index = this.hashFunc(key, this.length)
let current = this.storage[index]
// 2. Check whether there is data in the index position
// 2.1 There is no data at this subscript value, return false, delete failed
if(! current) {return false
}
// 2.2 There is data at the subscript value
// 3. Search for the corresponding data
for (let i in current) {
let inner = current[i]
// 3.1 Delete the corresponding data
if(inner[0] === key) {
current.splice(i, 1)
this.count --
// Determine whether capacity needs to be reduced
if(this.count / this.length < 0.25 && this.length > 7) {
// Double the size of the hash table
let number = Math.floor(this.length / 2)
// Get the prime number capacity
number = this.toPrime(number)
/ / reduced capacity
this.resize(number)
}
return inner[1]}}// 3.2 If no corresponding data is found, the deletion fails and false is returned
return false}}Copy the code
Vii. Concluding remarks
So that’s it for hash tables, and hopefully you have a better understanding of hash tables. In the next article I’ll look at tree structures.
Then you can pay attention to my wechat official account: Front Impressions. After this column is finished, I will put notes on each data structure and algorithm on my official account, where you can go to get them.
Or you can go to my Github to get a Star
- Github.com/Lpyexplore/…
You’re not gonna give me a “like” when you see this?