preface
After a period of testing, I basically got familiar with the requirements, business, and other requirements, and then conducted various tests and joint investigations. In my spare time, I also learned Selenium, which was my tool as a crawler before. The Selenium library in Python is used for Coding
1. What is Selenium
Selenium is a tool for Web application testing. Selenium tests run directly in the browser, just as real users do. Supported browsers include Internet Explorer (7, 8, 9, 10, 11), Mozilla Firefox, Safari, Google, Chrome, Opera, and more. The main functions of this tool include: testing browser compatibility – testing your application to see if it works well on different browsers and operating systems. Test system functionality – Create regression tests to verify software functionality and user requirements. Supports automatic recording of actions and automatic generation of test scripts in.net, Java, Perl and other languages. (From Baidu, in short, can simulate the user to operate the browser, can support many browser drivers.)
2. Preparation
The Python libraries we need this time are Selenium and PyAutoGUI. Image source: Wallhaven. Cc /
pip install selenium
pip install pyautogui
Copy the code
Ps: Installing Selenium requires checking whether the browser version matches the driver version
3. Import related libraries
from selenium import webdriver
import time
import re
from selenium.webdriver.common.action_chains import ActionChains
import pyautogui
Copy the code
4, declare and call the browser, open the web page, automatic search
browser = webdriver.Chrome()
url = 'https://wallhaven.cc/'
browser.get(url)
Copy the code
find_element_by_name
find_element_by_id
find_element_by_xpath
find_element_by_link_text
find_element_by_partial_link_text
find_element_by_tag_name
find_element_by_class_name
find_element_by_css_selector
Copy the code
We first place the cursor over the element, then right-click to check (Chrome) to locate the appropriate code, then right-click to copy-copy XPATH (you can COPY other ways).
input_ = browser.find_element_by_id('search-text') Get the input box element
input_.send_keys('Makise Kurisu') Enter a name to search for
time.sleep(2) # sleep two seconds
button_ = browser.find_element_by_xpath('//*[@id="startpage-search"]/div/button') Get the search button element
button_.click() # click
Copy the code
The browser then automatically enters and searches for ~
5, to get a preview of each picture click the link, and automatically enter each picture to right-click – save
text = browser.page_source Get page information
pattern = re.compile(r'')
res = re.findall(pattern,text) # regular expression matching
Copy the code
The first step is to get a preview link for each image, which I’m going to do using regex, xpath, etc.
for i in res:
browser.get(i) # Enter the link
time.sleep(3)
pic = browser.find_element_by_xpath('//*[@id="wallpaper"]') # fetch element
action = ActionChains(browser).move_to_element(pic) # Move to the element
action.context_click(pic) # Right-click on the element
action.perform() # to perform
pyautogui.typewrite(['v']) # Press V to save
# Click save image and wait 1s to hit Enter
time.sleep(1)
pyautogui.typewrite(['enter'])
Copy the code
After that is to traverse each URL, right-click – save operation, specific can see the code ~ comment is very detailed.
PyAutoGUI is a pure Python GUI automation tool that automatically controls mouse and keyboard operations programmatically. I have tried many times with Selenium to automatically save by right-clicking, and it has worked well with this approach. (Please share if you have Selenium that can automatically right-click and enter.)
6, summary
So that’s officially the end, so let’s put all the code here
from selenium import webdriver
import time
import re
from selenium.webdriver.common.action_chains import ActionChains
import pyautogui
browser = webdriver.Chrome()
url = 'https://wallhaven.cc/'
browser.get(url)
input_ = browser.find_element_by_id('search-text') Get the input box element
input_.send_keys('Makise Kurisu') Enter a name to search for
time.sleep(2) # sleep two seconds
button_ = browser.find_element_by_xpath('//*[@id="startpage-search"]/div/button') Get the search button element
button_.click() # click
text = browser.page_source Get page information
pattern = re.compile(r'')
res = re.findall(pattern,text) # regular expression matching
for i in res:
browser.get(i)
time.sleep(3)
pic = browser.find_element_by_xpath('//*[@id="wallpaper"]')
action = ActionChains(browser).move_to_element(pic) # Move to the element
action.context_click(pic) # Right-click on the element
action.perform() # to perform
pyautogui.typewrite(['v']) # Press V to save
# Click save image and wait 1s to hit Enter
time.sleep(1)
pyautogui.typewrite(['enter'])
time.sleep(10)
browser.close()
Copy the code
Secretly, I may not have much contact with things during the internship, and sometimes I even feel a little boring, but I still need to constantly improve myself in the spare time to fulfill the needs, so that I continue to grow ~