This is the hash table

Hash table Hash

Goal:

basis
Design of hash functions
Hash conflict handling
- Chain address method Separate Chaining
Implement your own hash table
Dynamic spatial processing and complexity of hash tables

1.1 Understanding hash tables

Each character corresponds to an index (hash function), O(1) lookup operation
It is difficult to ensure that each “key” is converted by a hash function to a different “index”.
Hash conflict
- The designer of hash functions is very important
- The more evenly distributed the “index” obtained by the “key” hash function, the better
Hash table fully embodies the classical idea of algorithm design: space for time. A hash table is a balance between time and space

1.2 Design of hash function

Every Object class in Java has an implementation of hashCode() by default.

The integer
- Small positive integers are used directly
- Small range of negative integers offset -100 100- > 0 200
- Large integer
  - Modulus: die a prime number (relating to the mathematical theory, but more discussion) reference: planetmath.org/goodhashtab…
  The test code is as follows:
```
int a = 42;
System.out.println(((Integer)a).hashCode());

int b = -42;
System.out.println(((Integer)b).hashCode());
Copy the code
```
floating-point

Both are 32-bit and 64-bit binary representations, but the computer interprets them as floating-point numbers

To an integer.
```
double c = 3.1415926;
System.out.println(((Double)c).hashCode());
Copy the code
```

String, compound type

To an integer

String d = "java";
System.out.println(d.hashCode());
Copy the code

The principle of
- Consistency: Hash (a) == hash(b) if a == b
- High efficiency: the calculation is efficient and simple
- Uniformity: Hash values are evenly distributed

1.3 Handling of hash conflict

Chain address method
- For the entire hash table, I open m Spaces, and for each space, because of the hash conflicts, I’m essentially storing a linked list
  
  For each position, we say save a linked list, which is actually the lookup table. After previous learning, we can also use the balanced tree structure (TreeMap, TreeSet) to achieve the lookup table.
  
  Before Java8, there was a linked list for each location. Starting with Java8, when hash collisions reached a certain level, each location changed from a linked list to a red-black tree.

TreeMap implementation

The code is as follows:

/ / add
public void add(K key, V value) {
    / / reuse TreeMap
    TreeMap<K, V> map = hashtable[hash(key)];
    if (map.containsKey(key)) {
       map.put(key, value);
    } else{ map.put(key, value); size++; }}/ / delete
public V remove(K key){
    TreeMap<K, V> map = hashtable[hash(key)];
    V ret = null;
    if (map.containsKey(key)){
        ret = map.remove(key);
        size--;
    }
    return ret;
}

/ / change
public void set(K key, V value){
    TreeMap<K,V> map = hashtable[hash(key)];
    if(! map.containsKey(key)){throw new IllegalArgumentException(key + "doesn't exist");
    }
    map.put(key, value);
}

public boolean contains(K key){
    return hashtable[hash(key)].containsKey(key);
}

public V get(K key){
    return hashtable[hash(key)].get(key);
}
Copy the code

Complexity analysis

Suppose there are a total of M addresses. If you put them in a hash table, you get N elements, and when you implement them in a linked list, the time complexity of each address is O(N/M), and when you implement them in a balanced tree, the time complexity of each address is O(logN/M).

How do I get to order one? Dynamic space allocation.

On average, each address carries more elements than a certain level, that is, capacity expansion. N/M >= upperTol
The average capacity of each address is reduced to a certain extent. N/M < lowerTol

The specific implementation code is as follows:

private void resize(int newM){
    TreeMap<K, V>[] newHashTable = new TreeMap[newM];
    for (int i = 0; i<newM; i++){ newHashTable[i] =new TreeMap<>();
    }

    int oldM = M;
    this.M = newM;
    for (int i = 0; i<oldM; i++){ TreeMap<K, V> map = hashtable[i];for(K key:map.keySet()){ newHashTable[hash(key)].put(key, map.get(key)); }}this.hashtable = newHashTable;
}

// Modify the added element
public void add(K key, V value) {
    / / reuse TreeMap
    TreeMap<K, V> map = hashtable[hash(key)];
    if (map.containsKey(key)) {
        map.put(key, value);
    } else {
        map.put(key, value);
        size++;

        if (size >= upperTol * M) {
            resize(2* M); }}}/ / delete
public V remove(K key) {
    TreeMap<K, V> map = hashtable[hash(key)];
    V ret = null;
    if (map.containsKey(key)) {
        ret = map.remove(key);
        size--;

        if (size < lowerTol * M && M/2>=intitCapacity) {
            resize(M / 2); }}return ret;
}
Copy the code

For hash tables, the number of elements increases from N to upperTol*N, the address space doubles, and the average complexity is O(1). Each operation ranges from O(lowerTol) to O(upperTol)–>O(1).

Expand by 2 by M, not prime. Solution: according to planetmath.org/goodhashtab…

/ / prime tables
private final int[] capacity
    = {53.97.193.389.769.1543.3079.6151.12289.24593.49157.98317.196613.393241.786433.1572869.3145739.6291469.12582917.25165843.50331653.100663319.201326611.402653189.805306457.1610612741};

/ / add
public void add(K key, V value) {
    / / reuse TreeMap
    TreeMap<K, V> map = hashtable[hash(key)];
    if (map.containsKey(key)) {
        map.put(key, value);
    } else {
        map.put(key, value);
        size++;

        if (size >= upperTol * M && capacityIndex + 1< capacity.length) { capacityIndex++; resize(capacity[capacityIndex]); }}}/ / delete
public V remove(K key) {
    TreeMap<K, V> map = hashtable[hash(key)];
    V ret = null;
    if (map.containsKey(key)) {
        ret = map.remove(key);
        size--;

        if (size < lowerTol * M && capacityIndex - 1> =0) { capacityIndex--; resize(capacity[capacityIndex]); }}return ret;
}
Copy the code

Conclusion:
- The amortized complexity of hash tables is O(1), sacrificing orderality. The implementation of sets and maps can be linked lists, trees, or hash tables.
- Ordered sets, ordered maps: balanced trees
- Unordered set without mapping: hash table

Hash table Hash

1.1 Understanding hash tables

1.2 Design of hash function

1.3 Handling of hash conflict

Related Posts

Prepare to migrate your Vue application to Vue 3

IE Security series: Script Pioneer (III) – Shellcode in nethorse

Vue A demo of front-end SMS verification