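The earth shakes the high ridge; one range of streams and mountains, splendid through the ages;

The gate faces the great sea; three rivers merge and flow for ten thousand years;

Yes, that is the secret contact code of the Heaven and Earth Society in The Deer and the Cauldron.

Why would the Heaven and Earth Society need a contact code?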

Suppose the incense master of the Red Fire Hall of the Heaven and Earth Society needs to send a very important secret letter to Trinket, the incense master of the Aomutang Hall, from Yangzhou, the former capital. We can abstract the situation as the following figure:

At its heart, gang member A is handing an important secret letter to gang member B. Assuming A and B do not know each other and have never met, how can gang member A make sure the letter goes to gang member B and not to the wrong person, another gang member, Ding?

In practice, such mix-ups must have happened and caused losses, so the Heaven and Earth Society adopted a contact code to confirm that A and B were members of the same gang. The code is:

The earth shakes the high ridge; one range of streams and mountains, splendid through the ages;

The gate faces the great sea; three rivers merge and flow for ten thousand years;

The code is known only to gang members and is never divulged. When A and B meet, gang member A says "The earth shakes the high ridge; one range of streams and mountains, splendid through the ages", and gang member B must answer with "The gate faces the great sea; three rivers merge and flow for ten thousand years". If gang member B does not know the next line, or answers with nonsense, gang member A can conclude that he is not the contact but an impostor.

In the same way, gang member B expects to hear gang member A say the first line. Otherwise gang member A is an impostor, who would probably hand a fake secret letter to Trinket of the Aomutang Hall.

It looks like this in the abstract:

So the question is, is there a need for a code between Client and Server?

The answer is yes!

The Client is like gang member A, the Server is like gang member B, and their secret letters may well be intercepted or forged by another gang member, Ding. The Heaven and Earth Society has its contact code; what do the Client and Server use to ensure that a message arrives as it was sent rather than being intercepted and forged?

That’s right, signature verification!

Signature verification is one of the data protection methods widely used in IT. It effectively prevents the receiver from treating tampered or forged messages as genuine ones.

⚠️ Note that signature verification prevents the receiver from treating a tampered or forged message as a genuine one; it does not prevent the receiver from receiving a fake message. In fact, at the moment an interface receives a message it cannot tell whether the message is genuine; that is established only by the verification that follows. It is important not to confuse the two.

Suppose the Client wants to send the Server an important secret letter saying that Oboi will be assassinated on the 5th of next month. The abstract picture is as follows:

What would be the impact if an impersonation occurred at this point?

Ding, another gang member, intercepts the message from the Client and changes the assassination date from the 5th to the 6th, so the date the Server receives is the 6th. The teams inside and outside would then strike a day apart, and a long-planned assassination would very likely fail, with considerable losses.

We use signature verification to make message delivery verifiable. Put simply, signature verification means computing a value from the original message according to agreed rules and encrypting it (in practice, usually hashing it), then sending the result along with the message. After receiving the message, the receiver applies the same rules to the received content and compares the value it computed with the value that was sent. If the two are the same, the message has not been tampered with or forged; otherwise, the receiver can conclude that it has been.
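The flow described above can be sketched in a few lines of Python, using the Oboi letter as the message. This is only an illustration: the rule (joining the fields with a shared salt and taking an MD5 digest), the salt value, and the function names are all assumptions made for the example, not something the article specifies.

```python
import hashlib

# Hypothetical rule shared by sender and receiver (an assumption for illustration).
SHARED_SALT = "heaven-and-earth"


def make_signature(target: str, day: int) -> str:
    """Apply the agreed rule to the message fields and hash the result."""
    raw = f"{target}|{day}|{SHARED_SALT}"
    return hashlib.md5(raw.encode("utf-8")).hexdigest()


def send(target: str, day: int) -> dict:
    """Sender: attach the signature to the message."""
    return {"target": target, "day": day, "sign": make_signature(target, day)}


def is_untampered(message: dict) -> bool:
    """Receiver: recompute the signature and compare it with the one received."""
    expected = make_signature(message["target"], message["day"])
    return expected == message["sign"]


genuine = send("Oboi", 5)
forged = dict(genuine, day=6)  # Ding changes the date but cannot recompute sign

print(is_untampered(genuine))  # True
print(is_untampered(forged))   # False
```

Note that the forged message is still received; it is only rejected after the receiver recomputes the signature.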

To deepen our understanding, let's look at a real example from a website. Open www.porters.vip/verify/sign… and the page looks like this:

Click the yellow button in the picture, "Click to view the details", and the content of the page changes, as shown in the picture below:

To observe and analyze what happens, open the browser's built-in developer tools (F12 or Command+Option+I) and switch to the Network panel. Refresh the page and click the yellow button again. Several network requests will appear in the Network panel, including one with a long Name. Click on it.

The Network panel then splits into left and right columns. The left side still lists the request records, and the right side shows the details of the selected request. The selected network request is shown below:

From the General and Query String Parameters sections we learn that the request sent to the www.porters.vip/verify/sign… interface carries four parameters: actions, tim, randstr and sign. The sign field and its value are the signature used in signature verification, that is, the value obtained after applying certain rules and encryption.
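The same four parameters can be pulled out of a request URL in code. The sketch below uses only the standard library; the host, path, and every parameter value in the URL are placeholders invented for the example, since the real request is not reproduced here.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical request URL: host, path and all values are placeholders.
url = (
    "https://example.com/verify/sign"
    "?actions=1&tim=1565000000&randstr=ABCDE"
    "&sign=0123456789abcdef0123456789abcdef"
)

# parse_qs returns a list of values per name; keep the first value of each.
query = parse_qs(urlsplit(url).query)
params = {name: values[0] for name, values in query.items()}

print(sorted(params))  # ['actions', 'randstr', 'sign', 'tim']
```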

Without sign, the Client would send only actions, tim and randstr to the Server, and gang member Ding could easily forge a message to send to the Server, for example:

After receiving the message, the Server would treat it as a normal message; it could not determine who sent it or whether it had been intercepted or tampered with. Which raises the question: what role does sign play here?

Assume that the operation rule of sign is:

sign = MD5(str(actions + 10086) + str(tim) + randstr * 3)

The rule configured in the Client is equivalent to the first half of the contact code. The Server takes the received actions, tim, randstr and sign, computes new_sign from the first three with the same rule, and finally checks whether the sign sent by the Client matches the new_sign it computed itself. It looks like this:

new_sign = MD5(str(actions + 10086) + str(tim) + randstr * 3)
if new_sign == sign:
    print("The message is genuine.")
else:
    print("The message is fake.")

Of course, this interaction is one-way, so the Server does not have to reply to the Client with the second half of the couplet (it could, but network communication does not require it). Even if another gang member intercepts the message and tampers with it, the Server is not fooled: the interceptor does not know how sign is calculated, so the new_sign the Server computes from the tampered data differs from the sign it received. The Server can therefore tell genuine messages from fake ones and discard the fake ones.

With this contact code in place, the Server that receives the message no longer has to worry about acting on a fake secret letter.
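The assumed rule and the tampering case can be written as runnable Python. The sketch below takes the rule exactly as assumed above, but several details are guesses made for the example: that actions and tim are integers, that the string is UTF-8 encoded before hashing, that the signature is the hexadecimal MD5 digest, and all of the sample values.

```python
import hashlib


def compute_sign(actions: int, tim: int, randstr: str) -> str:
    """MD5(str(actions + 10086) + str(tim) + randstr * 3), as a hex digest."""
    raw = str(actions + 10086) + str(tim) + randstr * 3
    return hashlib.md5(raw.encode("utf-8")).hexdigest()


def verify(params: dict) -> bool:
    """Server side: recompute the signature from the received fields."""
    new_sign = compute_sign(
        int(params["actions"]), int(params["tim"]), params["randstr"]
    )
    return new_sign == params["sign"]


# Client side: hypothetical values for actions, tim and randstr.
request = {"actions": "1", "tim": "1565000000", "randstr": "ABCDE"}
request["sign"] = compute_sign(1, 1565000000, "ABCDE")

# An interceptor changes tim but cannot produce a matching sign.
tampered = dict(request, tim="1565000001")

print(verify(request))   # True
print(verify(tampered))  # False
```

The Server still receives the tampered request; it simply fails verification and can be discarded.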

Signature verification is widely used. For example, when you download an operating system image, the official website publishes the file's MD5 value so that you can check the download; likewise, some of the open interfaces provided by companies such as Alibaba, Tencent, and Huawei require a sign value for authentication.
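Checking a downloaded image against a published MD5 value follows the same compare-two-values pattern. The helper names below are hypothetical, and the sketch assumes the website publishes the checksum as a hexadecimal string.

```python
import hashlib


def file_md5(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_md5(path: str, published_md5: str) -> bool:
    """Compare the local file's digest with the value published by the website."""
    return file_md5(path) == published_md5.strip().lower()
```

A call such as `matches_published_md5("some-image.iso", value_from_website)` returns False if the file was corrupted or altered in transit; both arguments here are placeholders.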

This article is adapted from section 3 of chapter 4 of "Python3 Anti-crawler Principle and Bypass Actual Combat", the first book in the crawler field devoted specifically to anti-crawler techniques. It describes the contest between crawler and anti-crawler technology from the two perspectives of "attack" and "defense", and explains the principles and concrete implementations in detail. From this book you will learn how anti-crawler techniques such as signature verification, text obfuscation, dynamic rendering, encryption and decryption, code obfuscation, and behavioral captchas work and how to get around them. The material in the book covers more than 90% of the anti-crawler techniques on the market, which makes it very hardcore.

Once you have mastered this knowledge, your theoretical foundation will be solid, and you will be able to handle the theory and reasoning questions in senior crawler engineer interviews at top-tier companies. For hands-on practice, in addition to the 21 online exercises that accompany the book, you should also practice against the combined anti-crawler measures you encounter at work, so as to steadily improve your skills.

⏰ The example used in this article, www.porters.vip/verify/sign…, is one of the 21 online exercises in the book.

More than 1,000 copies were sold during pre-sale. The first printing was bought out by engineers and a second printing was requested immediately, which shows how popular the book is.

The book is part of JD's 50%-off book promotion; the current price is only 44.5 yuan, so grab a copy!