Programming dog programming bull technology sharing platform

Today we bring you a Selenium-based method for cracking the GeeTest slider CAPTCHA (the "polar" verification code). A little excited? I know you can't wait, so let's get right to the topic.

Registering on the Tiger X site

This time we take Tiger X as the starting point. When registering an account, you have to slide a puzzle piece into the gap in a picture. This kind of CAPTCHA is everywhere these days, so it needs no detailed introduction.

For this CAPTCHA, we decided to simulate the slide with Selenium. Moving the mouse, clicking, and dragging are all relatively simple in Selenium, so the real problem is how far to drag. The eye sees the distance at a glance, but how does a program obtain it? Image recognition... well, I can only dream about that. So let's look at the page source or the request data to see whether there is any useful information.

Viewing the page information

Right-click the image and inspect the element.

Capturing the picture at that instant took quick hands; luckily my 20-plus years of training a kylin arm were not wasted. Let's see what the element inspector shows.

The markup looks a little strange, but it contains a link to the image and a lot of position information. First, copy the image link into the browser and open it.

 

WTF, what is this? Notice the 6 that looks like a pig's tail, and the little arrow? Comparing with the full picture above, if you moved the arrow next to the 6, the pig's tail would come together. Look closely and other things, such as the text, match up in the same way. So we can confirm that this image has been scrambled, and if we can put it back together, we are one step closer to calculating the gap. The element view also carries a lot of position information, which presumably has something to do with the scrambling. Let's verify that. Since the positions should relate to the puzzle, I'll take the screenshot captured by my kylin arm above and mark it up again.

When you inspect an element, the browser highlights it. I clicked on the image, so shouldn't the whole image be highlighted? It isn't; only a narrow strip is. Combine that with the element information: the width, height, and class name (see figure 3), and the position coordinates, which should be x first and then y. The y values are only ever 58 and 0, which, looking at figure 2 again, means the image is divided into an upper and a lower half. Counting the divs gives 26 pieces per row, each 10 x 58. By this calculation the whole picture is 260 wide (26 x 10) and 116 high (2 x 58). Measuring the picture with a screenshot tool gives basically the same size.

The next step is to work out how the pieces fit together. I'm sorry to say the pig is gone: when I checked the page again while writing this, the picture had been refreshed, so the following screenshots may not match the earlier ones. It doesn't matter, since I'm only looking for feature points, and every picture has them. Pick a feature point, inspect it, see which div it falls in, and then look at that div's position value. Since we want to relate parts of the image to positions, it is best to pick something obvious, such as the middle or the edges.

 

This one is roughly in the middle; it doesn't matter if it is slightly off.

Whoa, this... is different from what I expected. Let's look at two more, picking representative ones.

 

 

So that nobody accuses me of padding the article, I won't screenshot the other two corners. At this point you may be wondering: the picture is 260 wide, so why does a coordinate like 289 appear? Isn't that out of range? I was confused too at first. Maybe the image we see is displayed smaller than it really is, or maybe there is a border around it; that was my first thought. But these are coordinates into the image at the URL we saw earlier, so I went back and looked at figure 4.

 

The scrambled picture is indeed larger, which explains the coordinates. But how does that relate to 260? The scrambled picture is bigger and the assembled picture is smaller, so how are the pieces laid out? Fortunately, we have one more useful bit of information.

See this -1px? It caught my attention. In my mind, if you cut pieces out of the scrambled image to assemble the complete picture, the leftmost pieces would start at (0, 0) and (0, 58), but what we see is (1, 0) and (1, 58). The y values are what we expect: the first row starts at 0 and is 58 high, and the second row starts at 58. The x values are off, though. If the first piece starts at 1, the second should start at 11, because the width is definitely 10. Let's see.

It's 13. Is there an extra pixel in front of each piece? Then it would have been 12. We kept checking the rest and found that adding 12 to each piece's start gives the start of the next piece. So each piece occupies 12 pixels, and dropping one pixel on the left and one on the right leaves a width of 10. With 12 pixels per piece, 26 pieces make 312, which is about the width of the scrambled image we saw, so the analysis holds. In short: start at the coordinates given in the element and take a width of 10. Now let's analyze what these coordinates mean.
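The spacing arithmetic can be checked in a few lines. This is only a sketch of the layout as inferred in this article (12-pixel pitch, 1-pixel margin, 26 slices per row); the name `slice_origins` is made up for illustration.

```python
SLICE_PITCH = 12     # pixels each slice occupies in the scrambled image (inferred above)
SLICE_MARGIN = 1     # extra pixel in front of each slice (inferred above)
SLICES_PER_ROW = 26  # slices counted in one row


def slice_origins(count=SLICES_PER_ROW, pitch=SLICE_PITCH, margin=SLICE_MARGIN):
    """Return the x coordinate at which each 10-pixel slice starts."""
    return [margin + i * pitch for i in range(count)]


origins = slice_origins()
scrambled_width = SLICES_PER_ROW * SLICE_PITCH
print(origins[:3], scrambled_width)
```

The x values seen in the page, such as 157, 145, 205 and 289, all fall on this grid of 1 + 12k.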

Coordinate analysis

Look at the screenshots in figures 9 through 12, starting with figure 9. I expected x and y to be 0, or if not 0 then at least some other small numbers, but y is 58, which is the lower half of the image area, and x is 157, right in midfield. For figure 11, x should be around 300 and y above 100, yet y is 0, the upper half, and x is 205, just past midfield and far from the goalkeeper. What is going on? But notice that figure 9 is the first element and figure 11 is the last one. The elements at the front all have y = 58 and the ones at the back all have y = 0, the reverse of what we first assumed. Then compare the elements next to figure 9 (on its right) and figure 11 (on its left) with the order of the elements inside the div, and the scheme becomes clear.

To sum up: the final picture is assembled from the scrambled image (figure 4). Cut out the region x=157, y=58, w=10, h=58 and place it in the first position of the upper row; cut out x=145, y=58, w=10, h=58 and place it in the second position of the upper row, next to the first; and so on until the whole picture is formed.
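The cut-and-place rule can be sketched as a small planning function. It is an illustration, not the author's `mosaic_image` (which is not shown in this article): `slice_plan` is a hypothetical helper, and it assumes the positions arrive as CSS background-position offsets, which are negative (like the -1px seen earlier), in page order.

```python
SLICE_W, SLICE_H, PER_ROW = 10, 58, 26


def slice_plan(locations):
    """Map each slice to (crop_box, paste_xy).

    locations: (x, y) offsets into the scrambled image, in page order.
    abs() is applied because the offsets are assumed to be negative CSS values.
    """
    plan = []
    for index, (x, y) in enumerate(locations):
        left, top = abs(x), abs(y)
        crop_box = (left, top, left + SLICE_W, top + SLICE_H)
        row, col = divmod(index, PER_ROW)  # first 26 slices fill the upper row
        plan.append((crop_box, (col * SLICE_W, row * SLICE_H)))
    return plan


# The first two slices from the summary above.
for crop_box, paste_xy in slice_plan([(-157, -58), (-145, -58)]):
    print(crop_box, "->", paste_xy)
```

With Pillow, each pair maps directly onto `piece = scrambled.crop(crop_box)` followed by `result.paste(piece, paste_xy)`, where `result` is a new 260 x 116 image.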

 

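That's what I put together. Not bad, young man. But something still seems off. Take a close look at the page elements.

The class names contain both fullbg and cutbg. There are two sets of slices, which, judging by the names, are the full background and the background with the gap cut out, so both sets need to be stitched together in the same way.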

 

This time it works. Now the question becomes how to calculate the position of the gap.

Gap position

I figured there must be a way to find where two images differ. A Baidu search for comparing two images in Python turned up the ImageChops.difference interface. But look carefully at the two pictures: there are differences other than the gap. See the shadow behind the gap in picture 16? It cast a shadow on my heart too. The other pictures all have a similar shadow. If the shadow is behind the gap it is easy to deal with, but if it is in front, the comparison stops at the shadow. If only the comparison had a tolerance. The Button Sprite macro tool I used to use seems to have one; this interface is not that smart. Since it comes down to comparing pixels, I can compare them myself, and instead of == I give a range: if the color difference is within the range, the pixels count as the same. That is a tolerance, isn't it? The gap is usually very distinct while the shadow and background are blurry, so this should work. The idea is to take the width and height of the image and walk through it pixel by pixel.
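The pixel-by-pixel comparison with a tolerance might look like the sketch below. It is not the author's `get_offset_distance` (which is not shown); the function names and the default tolerance of 50 are assumptions for illustration. The pixel sources only need to support `px[x, y]`, which is how the object returned by Pillow's `Image.load()` is indexed.

```python
def is_same_color(p1, p2, tolerance):
    """True when every RGB channel differs by no more than tolerance."""
    return all(abs(a - b) <= tolerance for a, b in zip(p1[:3], p2[:3]))


def find_gap_x(full_px, cut_px, width, height, tolerance=50):
    """Return the x of the first column where the two pictures really differ.

    full_px / cut_px: anything indexable as px[x, y] -> (r, g, b).
    Returns None when no pixel differs by more than the tolerance.
    """
    for x in range(width):
        for y in range(height):
            if not is_same_color(full_px[x, y], cut_px[x, y], tolerance):
                return x
    return None


# Tiny synthetic example: 20 x 4 picture, faint shadow at x=3, gap at x=12..15.
width, height = 20, 4
full = {(x, y): (100, 100, 100) for x in range(width) for y in range(height)}
cut = dict(full)
for y in range(height):
    cut[3, y] = (120, 120, 120)        # shadow: within tolerance, ignored
    for x in range(12, 16):
        cut[x, y] = (230, 230, 230)    # gap: far outside tolerance
print(find_gap_x(full, cut, width, height))
```

With `tolerance=0`, which behaves like a plain `==` comparison, the same data would stop at the shadow column instead of the gap.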

Color difference

How do we determine this color-difference threshold? One way is trial and error while debugging, which is tedious. Another is to collect several pairs of pictures, complete and with the gap, and use a color-picker tool to read the color values at corresponding positions to settle on an approximate range. With the distance determined, it's time to move.

Selenium simulated movement

There is plenty of material about Selenium online; here we only need to confirm which interfaces are required. The ActionChains methods:

  • move_to_element(to_element) – Moves the mouse over an element
  • click_and_hold(on_element=None) – Presses the left mouse button without releasing it
  • move_by_offset(xoffset, yoffset) – Moves the mouse from its current position by the given offset
  • release(on_element=None) – Releases the left mouse button on an element
  • perform() – Performs the queued actions. Remember that after calling the methods above, nothing happens until perform() is called
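Put together, the calls above form a press, move, release sequence. The sketch below is an assumption about how they combine, not the author's `start_move` (which is not shown). `drag_knob` is a hypothetical helper, and `RecordingChain` is a stand-in that only records calls so the example runs without a browser.

```python
def drag_knob(chain, knob, offsets):
    """Queue a press, a series of horizontal moves, and a release, then run them.

    chain: an object with the ActionChains methods listed above
           (in real use, ActionChains(driver)).
    knob: the slider element to grab.
    offsets: x distance for each move_by_offset call.
    """
    chain.click_and_hold(knob)
    for dx in offsets:
        chain.move_by_offset(dx, 0)
    chain.release()
    chain.perform()  # nothing happens until perform() is called


class RecordingChain:
    """Hypothetical stand-in for ActionChains that records calls (demo only)."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        def method(*args):
            self.calls.append((name, args))
            return self
        return method


demo = RecordingChain()
drag_knob(demo, "knob", [10, 20, 5])
print(demo.calls)
```

In a real session the first argument would be `ActionChains(self.driver)` and `knob` the element found with the `gt_slider_knob` XPath.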

I won't describe Selenium's operations in detail; only the simpler ones are used here.

That's the end of the analysis. This time the code has to be posted, otherwise many people may not be able to finish on their own; it also helps understanding.

(Due to limited space, only part of the code is shown below. For the complete code, follow the public account "programming dog" and reply "0419".)

    # -*- coding: utf-8 -*-
    import random
    import time, re
    from selenium import webdriver
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.wait import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.common.action_chains import ActionChains
    from PIL import Image
    import requests
    from io import BytesIO


    class HuXiu(object):
        def __init__(self):
            chrome_option = webdriver.ChromeOptions()
            # chrome_option.set_headless()

            self.driver = webdriver.Chrome(executable_path=r"/usr1/webdrivers/chromedriver", chrome_options=chrome_option)
            self.driver.set_window_size(1440, 900)

        def visit_index(self):
            self.driver.get("https://www.huxiu.com/")

            # Wait for the register button, then click it
            WebDriverWait(self.driver, 10, 0.5).until(EC.element_to_be_clickable((By.XPATH, '//*[@class="js-register"]')))
            reg_element = self.driver.find_element_by_xpath('//*[@class="js-register"]')
            reg_element.click()

            # Wait for the slider knob to appear
            WebDriverWait(self.driver, 10, 0.5).until(EC.element_to_be_clickable((By.XPATH, '//div[@class="gt_slider_knob gt_show"]')))

            # Enter the simulated drag process
            self.analog_drag()

        def analog_drag(self):
            # Move the mouse over the drag button to display the drag picture
            element = self.driver.find_element_by_xpath('//div[@class="gt_slider_knob gt_show"]')
            ActionChains(self.driver).move_to_element(element).perform()
            time.sleep(3)

            # Refresh the GeeTest image
            element = self.driver.find_element_by_xpath('//a[@class="gt_refresh_button"]')
            element.click()
            time.sleep(1)

            # Get the image address and the list of position coordinates
            cut_image_url, cut_location = self.get_image_url('//div[@class="gt_cut_bg_slice"]')
            full_image_url, full_location = self.get_image_url('//div[@class="gt_cut_fullbg_slice"]')

            # Stitch the images together according to the coordinates
            cut_image = self.mosaic_image(cut_image_url, cut_location)
            full_image = self.mosaic_image(full_image_url, full_location)

            # Save the pictures for easy viewing
            cut_image.save("cut.jpg")
            full_image.save("full.jpg")

            # Calculate the distance based on the two images
            distance = self.get_offset_distance(cut_image, full_image)

            # Start moving
            self.start_move(distance)

            # Check whether an error tip appears
            try:
                WebDriverWait(self.driver, 5, 0.5).until(EC.element_to_be_clickable((By.XPATH, '//div[@class="gt_ajax_tip gt_error"]')))
                print("validation failed")
                return
            except TimeoutException as e:
                pass

            # Check whether the verification succeeded
            try:
                WebDriverWait(self.driver, 10, 0.5).until(EC.element_to_be_clickable((By.XPATH, '//div[@class="gt_ajax_tip gt_success"]')))
            except TimeoutException:
                print("again times")
                time.sleep(5)
                # Drag again recursively after a failure
                self.analog_drag()
            else:
                # Enter the mobile phone number and send the verification code
                self.register()

        # Get the image URL and the list of slice positions
        def get_image_url(self, xpath):
            link = re.compile(r'background-image: url\("(.*?)"\); background-position: (.*?)px (.*?)px;')
            elements = self.driver.find_elements_by_xpath(xpath)
            image_url = None
            location = list()
            for element in elements:
                style = element.get_attribute("style")
                groups = link.search(style)
                url = groups[1]
                x_pos = groups[2]
                y_pos = groups[3]
                location.append((int(x_pos), int(y_pos)))
                image_url = url
            return image_url, location

    # (Part of code; the remaining methods are omitted in the original)
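To see what the regular expression in get_image_url has to match, it can be tried on a single style string. The style value below is an assumed example of what a slice div might carry (the URL is a placeholder), and `parse_slice_style` is a hypothetical helper, not part of the author's class.

```python
import re

# Same idea as the pattern in get_image_url, with the offsets restricted to integers.
STYLE_RE = re.compile(
    r'background-image: url\("(.*?)"\); background-position: (-?\d+)px (-?\d+)px;'
)


def parse_slice_style(style):
    """Return (url, x, y) from a slice's inline style, or None if it does not match."""
    match = STYLE_RE.search(style)
    if match is None:
        return None
    return match.group(1), int(match.group(2)), int(match.group(3))


# Assumed sample value; the real page supplies its own URL and offsets.
sample = 'background-image: url("https://example.com/bg.jpg"); background-position: -157px -58px;'
print(parse_slice_style(sample))
```

Returning None on a mismatch avoids the error that indexing a failed match would raise if the page's style format ever changes.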

A note on move_by_offset: I originally made the y offset random in [-5, 5], thinking that jittering up and down would make the simulation more realistic. Precisely because that was too "human", the pass rate was very low. I tried many ranges, larger and smaller, and in the end used no y offset at all, which unexpectedly gave a high pass rate. Too human to be recognized as human; I'm speechless. Finally, here is the result in action.
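A track with no vertical jitter can be sketched as follows. The author's `start_move` is not shown, so the random step sizes and the name `build_track` are assumptions; the only point taken from the text is that the y offset stays at 0.

```python
import random


def build_track(distance, max_step=5):
    """Split distance into small positive x steps; the y offset is always 0."""
    steps = []
    remaining = distance
    while remaining > 0:
        step = min(remaining, random.randint(1, max_step))
        steps.append((step, 0))
        remaining -= step
    return steps


track = build_track(87)
print(len(track), sum(dx for dx, _ in track))
```

Each `(dx, 0)` pair would be passed to `move_by_offset(dx, 0)`, so the total horizontal movement equals the computed gap distance.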


Author: Xingxing Online, an ape life from girl to reptile lover

www.jianshu.com/u/680e0e38d…
