The hash function

Hash function (y = f(x))

  1. The input field is infinite; for example, a string can be infinitely long; the output field is infinite
  2. Same input, same output, no random component
  3. Different inputs can also have the same output, called a hash collision, but the probability is so low that it can equal the probability of lightning strikes
  4. The output result of hash function is uniform and discrete. For example, even if the input is regular, the hash function can also make the output be discrete and irregular, and the output result is uniformly filled in the output field. If you print each value of the field %m, the results will be evenly distributed from 0 to m-1 after %

Related applications – Storage:

Suppose you only have 1 gigabyte of memory, and you have to figure out which of the 0 to 2^32 -1 numbers, which is about 4 billion, occurs the most

Idea 1

Use HashMap to record the value of each number and the corresponding occurrence times

  • Each key(int) and its corresponding value(int) is at least 4 * 2 = 8 bytes, so in the worst case, 3.2G space is required, so it does not meet the requirements.
  • 1 Gigabyte (G)=1073741824 bytes (B)– Approximately 1 billion bytes
Idea 2
  • Using the hash function, you can get 4 billion hash values, and then for each hash value %100, then due to the nature of the hash function, the number of different numbers in files 0 to 99 is about the same, so each small file memory is 0.32 GB, each file extract the largest value after release, Then compare which of the 100 is the biggest.
  • The process is as follows:
    1. So we loop through each of these numbers, and we plug it into the hash function to get the hash value
    2. We hash the value %100 with the value 0. Key is the number and value is the number of occurrences
    3. The loop is complete, get the value of the maximum number of file 0, release the space of file 0
    4. Go back to step 1 and start again. At step 2, save the value 0 + 1 = 1 after %100 into the hash table. (In turn)

Hash table implementation

  • The value of a hash table is O(1), but it is not small. In theory, it is O(logN) : Because at worst you can only string two strings on each chain, and then you can only string two strings on each chain, and that’s because all the N’s have to be recalculated, so it’s O(N logN) time, where N is the number of numbers, and logN is the number of increments. For all N, the time complexity of a single operation is O(logN), because in practice, a chain will have longer strings, so O(log base k length N) is close to O(1). For Java, it can also be expanded when the online time is not used, which saves more time.
  • When you store a number in a chain, you compute the hash value by the hash function, and the hash value %(the length of the initial array) tells you where to put it. The same goes for finding out if a value exists. So much like addressing, the time complexity depends on the length of the chain and the number of expansions.
  • We’ll start with a brief introduction to the internal storage of a HashMap. As we know, Map is used to store key-value data. A pair is defined as Entry in the interface definition of Map, and Entry interface is implemented in HashMap. A HashMap maintains an array of entries internally. When a new element is put, the array index is calculated based on the hash value of the key. Each element of the array is a head pointer to a linked list used to store entries with the same subscript.

Hash table related operations

package class01; import java.util.ArrayList; import java.util.HashMap; import java.util.List; import java.util.Map.Entry; public class Code01_HashMap { public static void main(String[] args) { HashMap<String, String> map = new HashMap<>(); map.put("zuo", "31"); System.out.println(map.containsKey("zuo")); //true System.out.println(map.containsKey("chengyun")); //false System.out.println("========================="); System.out.println(map.get("zuo")); //31 System.out.println(map.get("chengyun")); //null System.out.println("========================="); System.out.println(map.isEmpty()); //false System.out.println(map.size()); //1 System.out.println("========================="); System.out.println(map.remove("zuo")); System.out.println(map.containskey ("zuo")); //false System.out.println(map.get("zuo")); //null System.out.println(map.isEmpty()); //true System.out.println(map.size()); //0 System.out.println("========================="); map.put("zuo", "31"); System.out.println(map.get("zuo")); //31 map.put("zuo", "32"); System.out.println(map.get("zuo")); //32 System.out.println("========================="); map.put("zuo", "31"); map.put("cheng", "32"); map.put("yun", "33"); map.put("yun2", "34"); map.put("yun3", "35"); For (String key: map.keyset ()) {// Get the set of keys system.out.println (key); / / yun zuo yun2 yun3 cheng} System. No order out. The println (" = = = = = = = = = = = = = = = = = = = = = = = = = "); For (String values: map.values()) {system.out.println (values); / / 33 and 34 35 32 and corresponding to the above, the principle of reason and its relevant, / / seems no order, is through the hash function to calculate the hash value stored} System. Out. The println (" = = = = = = = = = = = = = = = = = = = = = = = = = "); map.clear(); map.put("zuo", "31"); map.put("cheng", "32"); map.put("yun", "33"); map.put("yun2", "34"); map.put("yun3", "35"); //Entry is used to store a key-value pair in a Map. A Map is actually a collection of multiple entries. // Entry<key,value> = Map<key,value> Map.entryset ()) {String key = entry.getKey(); String value = entry.getValue(); System.out.println(key + "," + value); 31 yun2 / / yun, zuo, 33, 34 yun3, 35 cheng, 32 disorderly} System. Out. The println (" = = = = = = = = = = = = = = = = = = = = = = = = = "); map.clear(); map.put("A", "1"); map.put("B", "2"); map.put("C", "1"); map.put("D", "3"); map.put("E", "1"); map.put("F", "4"); map.put("G", "2"); //Entry is used to store a key-value pair in a Map. A Map is actually a collection of multiple entries. // Entry<key,value> = Map<key,value> Map.entryset ()) {String key = entry.getKey(); String value = entry.getValue(); System.out.println(key + "," + value); // A,1 B,2 C,1 D,3 E,1 F,4 G,2 Only with hash function} System. Out. The println (" = = = = = = = = = = = = = = = = = = = = = = = = = "); // You can not remove item in map when you use the iterator of map // for(Entry<String,String> Entry:  map.entrySet()){ // if(! entry.getValue().equals("1")){ // map.remove(entry.getKey()); // if you want to remove items, collect them first, then remove them by // this way. List<String> removeKeys = new ArrayList<String>(); for (Entry<String, String> entry : map.entrySet()) { if (! entry.getValue().equals("1")) { removeKeys.add(entry.getKey()); } } for (String removeKey : removeKeys) { map.remove(removeKey); } for (Entry<String, String> entry : map.entrySet()) { String key = entry.getKey(); String value = entry.getValue(); System.out.println(key + "," + value); //A,1 C,1 E,1 } System.out.println("========================="); }}Copy the code

Related topics – Implementation structures

Insert (key): adds a key to the structure without adding a delete(key): removes a key from the structure. GetRandom (): randomly returns any key in the structure with equal probability.

Insert, delete, and getRandom methods are O(1)

Public class test1{public static class Pool<K>{// Create two maps. Private HashMap<K, Integer> keyIndexMap; private HashMap<Integer, K> indexKeyMap; private int size; public Pool(){ this.keyIndexMap = new HashMap<K, Integer>(); this.indexKeyMap = new HashMap<Integer, K>(); this.size = 0; } public void insert(K key){ if (! this.keyIndexMap.containsKey(key)){ this.keyIndexMap.put(key, this.size); this.indexKeyMap.put(this.size++, key); } } public void delete(K key){ if (this.keyIndexMap.containsKey(key)); Int deleteIndex = this.keyIndexmap.get (key); int lastIndex = --this.size; LastKey = this.indexKeymap. get(lastIndex); Keyindexmap. put(lastKey, deleteIndex); this.keyIndexMap.put(lastKey, deleteIndex); this.indexKeyMap.put(deleteIndex, lastKey); This.keyindexmap. remove(key); // Remove the deleted key this.keyIndexmap. remove(key); // remove the lastIndex pair in another Map this.indexkeymap.remove (lastIndex); } public K getRandom(){ if (this.size == 0){ return null; } int randomIndex = (int) (Math.random() * this.size); return this.indexKeyMap.get(randomIndex); } } public static void main(String[] args){ Pool<String> pool = new Pool<>(); pool.insert("zuo"); pool.insert("cheng"); pool.insert("yun"); System.out.println(pool.getRandom()); System.out.println(pool.getRandom()); System.out.println(pool.getRandom()); System.out.println(pool.getRandom()); }}Copy the code