I. Foreword

Hello everyone, I’m Charlie, an office worker who faces a computer every day.

Every day my wallpaper is the Windows built-in sky blue, which looks really boring, dull, boring ~

So, as we all know, I’m a blogger who loves quality wallpaper, and of course I need a whole bunch of high-quality wallpapers. No other meaning intended.

All right, no more rambling, let’s start today’s quality journey ~

II. Preparation

Get all of this set up:

Python 3.6, PyCharm, requests, parsel

III. Crawler process

1) Finding the data source:

1. Determine the target: crawl HD wallpaper images from the "Other Shore" wallpaper site (netbian.com)

Use the developer tools (F12, or right-click and choose Inspect) to find where the image URL comes from.

Request a wallpaper's details page and read its page source to get the image URL (one per details page).

Request the list page to get the details page URL and title of each wallpaper.
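The two-hop flow described above (list page, then details page, then image URL) can be sketched offline. This is only an illustration: `FAKE_SITE`, `fetch`, and `crawl` are hypothetical names, the markup is made up, and a dictionary lookup stands in for real HTTP requests.

```python
import re

# Hypothetical pages standing in for the real site: one list page and two
# details pages. The markup is an assumption made for this sketch.
FAKE_SITE = {
    "/list": '<a href="/desk/1.htm"><b>Sunset</b></a>'
             '<a href="/desk/2.htm"><b>Forest</b></a>',
    "/desk/1.htm": '<div class="pic"><img src="http://img.example/1.jpg"></div>',
    "/desk/2.htm": '<div class="pic"><img src="http://img.example/2.jpg"></div>',
}


def fetch(path):
    """Stand-in for an HTTP request: return the page source for a path."""
    return FAKE_SITE[path]


def crawl(list_path):
    """Follow list page -> details page -> image URL, yielding (title, image URL)."""
    list_html = fetch(list_path)
    # Step 1: the list page gives each wallpaper's details URL and title.
    for href, title in re.findall(r'<a href="([^"]+)"><b>([^<]+)</b></a>', list_html):
        # Step 2: each details page is expected to hold one image URL.
        detail_html = fetch(href)
        match = re.search(r'<img src="([^"]+)"', detail_html)
        if match:
            yield title, match.group(1)


results = list(crawl("/list"))
print(results)
```

In a real run, `fetch` would be replaced by an actual request, and the parsing by whichever method you prefer.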

2) Code implementation:

1. Send the request

Wallpaper list page URL: www.netbian.com/1920×1080/i…
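As a small sketch of how the list page URLs can be generated, the snippet below uses the `index_{page}.htm` pattern and the `range(2, 12)` loop from the full script in section IV. The helper name `list_page_url` is hypothetical, and the sketch only covers pages 2 and up.

```python
def list_page_url(page):
    """Build the list-page URL for a given page number (2 and up)."""
    return f"http://www.netbian.com/1920x1080/index_{page}.htm"


# Pages 2 through 11, matching range(2, 12) in the full script.
urls = [list_page_url(page) for page in range(2, 12)]
print(urls[0])
print(len(urls))
```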

2. Get data

Page source: response.text gives the page's text data.
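The text in response.text is only readable if the bytes are decoded with the right encoding, which is why the full script sets `response.encoding = response.apparent_encoding` (or decodes the content as GBK). A minimal standard-library sketch of the problem, using a made-up sample string rather than a real response:

```python
# Bytes as a server might send them for a GBK-encoded page (made-up sample).
raw = "高清壁纸".encode("gbk")

# Decoding with the wrong codec produces garbled text instead of failing loudly.
wrong = raw.decode("latin-1")

# Decoding with the page's real encoding recovers the text.
right = raw.decode("gbk")

print(repr(wrong))
print(right)
```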

3. Parse the data

Parsing options: CSS selectors, XPath, BS4, re. Two things need to be extracted:

1. The wallpaper details page URL, e.g. /desk/23397.htm

2. The wallpaper title
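Here is a rough sketch of this parsing step using re, one of the options listed above, so that it runs without extra installs. The sample markup is an assumption (the real page may differ), and `parse_list` is a hypothetical helper. Regular expressions are fragile on HTML, so the CSS selectors used in the full script are usually the sturdier choice.

```python
import re
from urllib.parse import urljoin

BASE_URL = "http://www.netbian.com"

# Made-up list markup for illustration; the real page may differ.
SAMPLE_LIST_HTML = """
<div class="list"><ul>
<li><a href="/desk/23397.htm"><img src="thumb.jpg"><b>Mountain lake</b></a></li>
<li><a href="/other.htm"><img src="other.jpg"></a></li>
</ul></div>
"""


def parse_list(html):
    """Return (title, absolute details URL) for every <li> that has a <b> title."""
    items = []
    for li in re.findall(r"<li>(.*?)</li>", html, flags=re.S):
        href = re.search(r'<a href="([^"]+)"', li)
        title = re.search(r"<b>([^<]+)</b>", li)
        # Entries without a title are skipped, like the `if title:` check
        # in the full script.
        if href and title:
            # urljoin turns the relative /desk/... path into a full URL.
            items.append((title.group(1), urljoin(BASE_URL, href.group(1))))
    return items


wallpapers = parse_list(SAMPLE_LIST_HTML)
print(wallpapers)
```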

4. Save data

Images are saved as binary data.
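A minimal sketch of the save step, assuming the image bytes have already been downloaded. `safe_filename` and `save_image` are hypothetical helpers. Cleaning the title is an extra precaution not in the original script, since a title may contain characters that Windows does not allow in file names.

```python
import re
import tempfile
from pathlib import Path


def safe_filename(title):
    """Replace characters that Windows forbids in file names with underscores."""
    return re.sub(r'[\\/:*?"<>|]', "_", title).strip()


def save_image(folder, title, content):
    """Write raw image bytes to <folder>/<title>.jpg and return the path."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)  # create the folder if missing
    path = folder / f"{safe_filename(title)}.jpg"
    path.write_bytes(content)  # binary write, same effect as open(..., 'wb')
    return path


# Usage with placeholder bytes instead of a real download.
with tempfile.TemporaryDirectory() as tmp:
    saved = save_image(Path(tmp) / "img", "Lake: dawn?", b"\xff\xd8\xff\xe0fake")
    saved_name = saved.name
    saved_bytes = saved.read_bytes()

print(saved_name)
```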

Grandpa: Is that it? Where's the code? What do you mean, you're not showing the code?

Don’t panic. It’s coming, it’s coming.

IV. Code display

I won't break the code down piece by piece. With the comments and the steps in section III, I trust that smart readers like you can follow it. If not, I'll put up a video explanation at the end.

    # pip install requests
    # pip install parsel
    import os
    import time

    import parsel
    import requests

    # Make sure the output folder exists before saving anything
    os.makedirs('img', exist_ok=True)

    time_1 = time.time()
    for page in range(2, 12):
        print(f'==================== Crawling page {page} ====================')
        url = f'http://www.netbian.com/1920x1080/index_{page}.htm'
        # Request headers: disguise the Python code as a browser when sending the request
        headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36'
        }
        response = requests.get(url=url, headers=headers)
        # Fix garbled text: either use response.content.decode('gbk'),
        # or let requests detect the encoding automatically
        response.encoding = response.apparent_encoding
        # Page source / page text data: response.text
        # print(response.text)
        # Parse the data
        selector = parsel.Selector(response.text)
        # CSS selector: every item in the wallpaper list
        lis = selector.css('.list li')
        for li in lis:
            # Details page, e.g. http://www.netbian.com/desk/23397.htm
            title = li.css('b::text').get()
            if title:
                href = 'http://www.netbian.com' + li.css('a::attr(href)').get()
                response_1 = requests.get(url=href, headers=headers)
                selector_1 = parsel.Selector(response_1.text)
                img_url = selector_1.css('.pic img::attr(src)').get()
                # Image data is binary, so use .content and write in 'wb' mode
                img_content = requests.get(url=img_url, headers=headers).content
                with open(os.path.join('img', title + '.jpg'), mode='wb') as f:
                    f.write(img_content)
                print('Saving: ', title)

    time_2 = time.time()
    use_time = int(time_2) - int(time_1)
    print(f'Total time: {use_time} seconds')

You can run it yourself. Remember to give it a "triple" of support ~

