Captchas are one of the more troublesome obstacles in web crawling. Today, you'll learn how to handle the slider captcha!

Anyone who writes crawlers will run into all kinds of anti-crawling restrictions. The first line of defense is often the login page: to block automated logins, sites go to great lengths, and the result is an arms race. As the saying goes, "virtue rises one foot, the devil rises ten."

Today I will share a simple example of how to handle a slider captcha.

Login verification that asks the user to drag a slider until it lines up with a gap in an image is common on many websites and apps, because it is friendly to real users and easy for them to complete. It also stops most basic crawlers.

As a Python crawler developer, how do you automate this verification step?

First, the core issue is how to find the location of the target gap. Once we know the location, we can use tools like Selenium to complete the dragging operation.
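The dragging part could be sketched roughly as follows. This is only an illustration, not the original author's code: `build_track` and `drag_slider` are hypothetical helper names, the step sizes are arbitrary, and `driver` and `slider` are assumed to be a Selenium WebDriver and the slider's WebElement that you have already located.

```python
import random


def build_track(distance, step_min=5, step_max=15):
    """Split a horizontal distance into small positive steps that sum to it."""
    track = []
    remaining = int(distance)
    while remaining > 0:
        # Never step past the target: the last step is clipped to what is left.
        step = min(random.randint(step_min, step_max), remaining)
        track.append(step)
        remaining -= step
    return track


def drag_slider(driver, slider, distance):
    """Drag `slider` (a Selenium WebElement) `distance` pixels to the right."""
    # Imported here so build_track can be used without Selenium installed.
    from selenium.webdriver import ActionChains

    ActionChains(driver).click_and_hold(slider).perform()
    for step in build_track(distance):
        ActionChains(driver).move_by_offset(step, 0).perform()
    ActionChains(driver).release().perform()
```

Moving in several small steps rather than one jump is a design choice of this sketch. Note also that if the page displays the captcha image at a different size than the file you analyzed, the distance has to be scaled accordingly.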

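OpenCV can locate the gap for us. The main steps are: Gaussian blur to reduce noise, Canny edge detection, contour detection, and filtering the contours by size to find the target.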

What is OpenCV?

OpenCV (Open Source Computer Vision Library) is an open-source library whose algorithms cover image processing, computer vision, and machine learning. It can be used to build real-time image processing, computer vision, and pattern recognition programs.

It can be installed directly with pip; the Python package is named opencv-python.

First, the image is processed with a Gaussian blur. Its main purpose is to reduce image noise, and it is used here as a pre-processing step.

[Figure: the image after Gaussian blur]

Then Canny edge detection is used to obtain a binary image containing thin ("narrow") edges. A binary image contains only two pixel values: black and white.

Contour detection

Find all the contours and outline each one with a red box. There are dozens of them, large and small.

The rest is easy. We only need to restrict the area or perimeter of the contours to filter out the target, provided that the size of the target contour is known in advance. In this example, the area of the target contour is between 6000 and 8000 square pixels and its perimeter is between 300 and 500 pixels. Finally, the bounding rectangle of the contour gives its coordinates, width, and height. Now that we have found the target position, all that remains is to move the slider to it.
