Previous articles in this series
Automatically monitoring GitHub projects and opening their web pages with Python
Automatic file categorization with Python
Picking double-color ball lottery numbers with Python
Setting the daily Bing image as the desktop wallpaper with Python
Batch watermarking with Python
Decoding zip archives with Python
Preface
Today we will write a Python script that batch-downloads images from Baidu Images. Let's get straight to it ~
Result preview
Approach:
1. Get the URLs of the images
First, open the Baidu Images home page and note the page index in the URL.
Next, switch the page to the traditional paginated (flip) view, because it is easier to crawl!
Then right-click, view the page source, and search for objURL directly (Ctrl+F).
This gives us the URLs of the images we need.
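To make the role of the page index concrete, here is a minimal sketch of how the paginated URLs could be built. The base URL, the query parameters, and the step of 20 per page come from the script later in this article; the helper name build_page_url and the use of urlencode are our own assumptions, not part of the original script.

```python
from urllib.parse import urlencode

# Base URL and query parameters as used in the article's script.
FLIP_URL = 'https://image.baidu.com/search/flip'
# The script advances pn by 20 for each page (i * 20).
PAGE_STEP = 20


def build_page_url(keyword, page_index):
    """Return the flip-view URL for a zero-based page index (hypothetical helper)."""
    params = {
        'tn': 'baiduimage',
        'ie': 'utf-8',
        'word': keyword,
        'pn': page_index * PAGE_STEP,
    }
    # urlencode percent-encodes the keyword, so non-ASCII search terms are safe.
    return FLIP_URL + '?' + urlencode(params)


if __name__ == '__main__':
    for page in range(3):
        print(build_page_url('cat', page))
```

Encoding the keyword this way is optional for plain ASCII searches, but it avoids malformed URLs when the keyword contains spaces or Chinese characters.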
2. Save the image links locally
Now all we have to do is crawl that information out.
Note: the page source contains several kinds of links (objURL, hoverURL, …). We use objURL because it points to the original image.
A regular expression extracts the objURL values:
results = re.findall('"objURL":"(.*?)",', html)
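As a quick check of what this pattern matches, here is a small self-contained sketch that runs it on a made-up fragment of page source. Only the first objURL is taken from this article; the other URLs, the fromURL field, and the helper name extract_obj_urls are assumptions for illustration, and the real page source may be laid out differently.

```python
import re

# Hypothetical fragment of page source. The first objURL is the one quoted
# in the article; everything else is invented for the demonstration.
sample_html = (
    '{"hoverURL":"http://example.com/thumb1.jpg",'
    '"objURL":"http://n.sinaimg.cn/sports/transform/20170406/dHEk-fycxmks5842687.jpg",'
    '"fromURL":"http://example.com/page1"},'
    '{"objURL":"http://example.com/full2.png","fromURL":"http://example.com/page2"}'
)

# Non-greedy group: capture everything up to the first '",' after "objURL":"
OBJ_URL_PATTERN = re.compile(r'"objURL":"(.*?)",')


def extract_obj_urls(html):
    """Return every objURL value found in the given page source."""
    return OBJ_URL_PATTERN.findall(html)


if __name__ == '__main__':
    for link in extract_obj_urls(sample_html):
        print(link)
```

The non-greedy (.*?) matters here: a greedy (.*) would run to the last '",' on the line and swallow several fields at once.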
Source code:
1. Code for getting the image URLs:
# Imports used by the whole script
import os
import re
import time

import requests


# Get the image URLs
def get_parse_page(pn, name):
    for i in range(int(pn)):
        # 1. Get the web page
        print('Getting page {}'.format(i + 1))
        # Baidu Images flip-view URL
        # name is the keyword you want to search for
        # pn is the number of pages you want to download
        url = 'https://image.baidu.com/search/flip?tn=baiduimage&ie=utf-8&word=%s&pn=%d' % (name, i * 20)
        headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.104 Safari/537.36 Core/1.53.4843.400 QQBrowser/9.7.13021.400'}
        # Send the request and get the response
        response = requests.get(url, headers=headers)
        html = response.content.decode()
        # print(html)
        # 2. Parse the page with a regular expression
        # "objURL":"http://n.sinaimg.cn/sports/transform/20170406/dHEk-fycxmks5842687.jpg"
        results = re.findall('"objURL":"(.*?)",', html)  # returns a list
        # Save the images locally using the links obtained
        save_to_txt(results, name, i)
2. Code for saving the images locally:
# Save the images locally
def save_to_txt(results, name, i):
    j = 0
    # Create a folder under the current directory
    if not os.path.exists('./' + name):
        os.makedirs('./' + name)
    # Download the images
    for result in results:
        print('Saving image {}'.format(j))
        try:
            pic = requests.get(result, timeout=10)
            time.sleep(1)
        except requests.exceptions.RequestException:
            print('Current image cannot be downloaded')
            j += 1
            continue
        # The following can be ignored; this code is buggy
        # file_name = result.split('/')
        # file_name = file_name[len(file_name) - 1]
        # print(file_name)
        #
        # end = re.search('(.png|.jpg|.jpeg|.gif)$', file_name)
        # if end == None:
        #     file_name = file_name + '.jpg'
        # Save the picture to the folder
        file_full_name = './' + name + '/' + str(i) + '-' + str(j) + '.jpg'
        with open(file_full_name, 'wb') as f:
            f.write(pic.content)
        j += 1
Core code:
pic = requests.get(result, timeout=10)
f.write(pic.content)
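The commented-out snippet in the saving code tries to keep each image's original extension instead of always writing .jpg. One possible way to do that is sketched below. It swaps the regular expression for os.path.splitext and urlparse from the standard library; the helper name build_file_name and the set of accepted extensions are our assumptions, not the author's code.

```python
import os
from urllib.parse import urlparse

# Extensions we accept as-is; mirrors the list in the commented-out code.
KNOWN_EXTENSIONS = {'.png', '.jpg', '.jpeg', '.gif'}


def build_file_name(url, page, index, default_ext='.jpg'):
    """Return '<page>-<index><ext>' for an image URL (hypothetical helper)."""
    # Look only at the path so query strings like '?x=1' do not end up in the name.
    path = urlparse(url).path
    ext = os.path.splitext(path)[1].lower()
    if ext not in KNOWN_EXTENSIONS:
        # Unknown or missing extension: fall back to .jpg, as the script does.
        ext = default_ext
    return '{}-{}{}'.format(page, index, ext)


if __name__ == '__main__':
    print(build_file_name('http://example.com/a/b.PNG?x=1', 0, 3))
    print(build_file_name('http://example.com/img', 1, 2))
```

The result can be joined with the folder name in place of the hard-coded '.jpg' suffix when building file_full_name.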
3. Main function code:
# Main function
if __name__ == '__main__':
    name = input('Please enter the keyword you want to download: ')
    pn = input('How many pages would you like to download (60 images per page): ')
    get_parse_page(pn, name)
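Because the script passes the page count straight to int(), typing anything other than a whole number stops it with a ValueError. A minimal sketch of a guard is shown below; the helper name parse_page_count and the fallback of one page are our assumptions.

```python
def parse_page_count(raw, default=1):
    """Convert the user's answer to a positive page count (hypothetical helper)."""
    try:
        pages = int(raw.strip())
    except ValueError:
        # Not a whole number: fall back to the assumed default.
        return default
    # Treat zero or negative answers as the minimum of one page.
    return max(pages, 1)


if __name__ == '__main__':
    for answer in ['3', ' 2 ', 'abc', '0']:
        print(repr(answer), '->', parse_page_count(answer))
```

In the main block, the value returned by input() could be passed through this helper before calling get_parse_page.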
That concludes this article. Thank you for reading. Next in this series of Python utility scripts, we will share a weather query application.