Like attention, no more lost, your support means a lot to me!
🔥 Hi, I’m Chouchou. GitHub · Android-Notebook has been included in this article. Welcome to grow up with Chouchou Peng. (Contact information at GitHub)
preface
- In computer science, there are many application scenarios of hashing algorithms, such as: hash function of hash table, message digest algorithm of message integrity verification, what other application scenarios do you know?
- In this article, I will take you through the basic requirements of hashing algorithms & application scenarios. Please be sure to like and follow if you can help, it really means a lot to me.
series
- The cryptography | truth! What do you think is the Base64 encryption algorithm?”
- Say the cryptography | ready! What is the hash algorithm?”
- The cryptography | sense! From the global to understand the message digest, encryption, signature and digital certificate”
Related articles
- “Computer network | graphic DNS & HTTPDNS principle”
- The Android | mountain, can offend jade! Read an article/v1 / v2 v3 signature mechanism”
directory
1. An overview of the
1.1 Definition of hash function
A Hash algorithm is an algorithm that converts an input of arbitrary length into an output of fixed length. The output is a Hash value.
The hash algorithm must be a compression mapping, i.e., the range will be much smaller than the input range. For example, the output hash value of MD5 is 128 bits, and that of SHA256 is 256 bits.
1.2 Properties of the hash algorithm
There are many hashing algorithms, but they all have the following properties & requirements:
The nature of the | describe |
---|---|
unipolarity | Input data cannot be extrapolated from hashed values |
consistency | For the same input data, the computed hash value is always the same |
High efficiency | The hashing process is as fast and efficient as possible |
randomness | The distribution of hash values in the output range is as random as possible |
Input sensitivity | The calculated hashes of similar data vary greatly |
2. Hash conflicts
2.1 What is hash conflict?
As mentioned in the previous section, the Hash algorithm is a compressed mapping (the output range is much smaller than the input range), so there will definitely be two or more input data mapped to the same Hash value, which is called Hash Collision.
It should be noted that hashing conflicts cannot be completely avoided, which can be well understood by using the pigeon nest principle (also known as: drawer principle). If there are 10 pigeon nests and there are 11 pigeons, no matter how evenly distributed, there must be one pigeon nest with two or more pigeons.
For example, the hash value of the Java string “Aa” conflicts with that of “BB” :
String str1 = "Aa"; String str2 = "BB"; System.out.println(str1.hashCode()); 2112 System.out.println(str2.hashCode()); 2112 Hash conflictCopy the code
Since the hash conflict cannot be completely avoided, there are two main measures to deal with it: reduce the probability & deal with the conflict.
2.2 Reducing the hash conflict probability
The main ideas to reduce the probability of hash conflict are as follows:
- 1. Optimize the hashing algorithm
The randomness of the hash algorithm was mentioned earlier: the distribution of the hash values in the output range should be as random as possible. This is to avoid the phenomenon of “piling up”, where the hashes are concentrated in a region of the output range, which undoubtedly increases the probability of collisions.
- 2. Expand the output range
When the input range is relatively stable, the conflict probability can be reduced by expanding the output range. SHA, for example, has a longer hash value than MD5, resulting in a lower collision probability. HashMap expands when the threshold is reached, essentially expanding the output range.
2.3 Handling Hash Conflicts
In the data structure | hash table is how to avoid conflict of hash?” In, we will discuss solutions in hash tables.
3. Application scenarios of the hash algorithm
Editting…
4. To summarize
- Hash algorithm is an algorithm that converts input of arbitrary length to output of fixed length.
The resources
- The Art of Java Encryption and Decryption (chapters 6, 7, 8, 9). By Dong Liang
- The Beauty of Data Structures and Algorithms (chapters 21 and 22). By Wang Zhengzao, geek Time
- Hashing algorithms – Wikipedia
Recommended reading
- Algorithm | list problem summary
- | back algorithm framework to solve problems
- Operating system | interruption & system call is analysed
- Graphics | new-confucianism. What else do you know about PNG besides lossless compression?
- Data structure | weibo Top 10 hot search is how calculated? (Binary heap)
- Computer network | graphic DNS & HTTPDNS principle
- Android | so-accurately weighed various! Talk about the whole process of loading images
- Android | food tasteless! App Startup may be easier than you think
- Android | enough is enough! How does Glide arrange the life cycle clearly