This article is participating in Python Theme Month. See the link to the event for more details

Hello everyone, I am Brother Chen ~

I believe that everyone has contacted the short video platform, such as a sound, a hand and other platforms, unexpectedly everyone is familiar with, so today Chen brother share technology is: search video in a hand, and achieve download!

01 Obtain the search link

Anyone who has written an interface or developed a website knows that when a request is made for a resource on a server, the server returns data in response to the request by accessing the link (interface).

1. Search for request links

Therefore, the first step is to obtain the search request link. Here, Chen Ge obtains the link by capturing the data packet.

If you are not familiar with this technical operation, you can refer to my previous article (taking [a Trip] as an example to describe small program crawler technology), which explains how to use MitmProxy to collect small programs from 0 to 1.

For example, search: ballad, check the packet on the packet capture page, and find the following packet

Click on the packet

You can see that the request for the search link is in POST mode, as well as the request headers and request parameters. The keyword in the request parameter is the search keyword, and you can get different content by modifying the keyword.

2. Analyze data packets

By looking at the returned data, you can see that all the video content is in the field feeds

Extract fields: video address, user name, cover image, video name

mp4_url = i['mainMvUrls'][0]['url']
userName = i['userName']
pic_url = i['coverUrls'][0]['url']
caption = i['caption']

Copy the code

02 Requesting Data

Now that we know the packet’s request mode and parameters, and the data returned, we can begin to construct the request and process the response data in Python.

Request header and request parameters

headers = {
    'content-type':'application/json'.'cookie':'Own cookie'
}
s = json.dumps({
    "keyword": "Folk songs"."pcursor": ""."ussid": ""
})

Copy the code

Request address:

url = 'https://wxmini-api.uyouqu.com/rest/wd/search/feed'
r = requests.post(url, data=s,headers=headers).json()

Copy the code

Printout result:

03 Saving Data

We’re going to save the video and the corresponding cover image download locally, and we’re going to create two new functions here, one to download the video and one to download the cover image.

Download the video

Def download_mp4(mp4_name, mp4_URL): STR (time.strftime('%y%m%d', time.localtime())) dir_path = "/"+dir Os.mkdir (dir_path) headers = {' user-agent ': 'Mozilla/5.0 (Windows NT 10.0; Win64; X64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.105 Safari/537.36'} r = requesting. Get (mp4_URL, Headers =headers, stream=True) if r.stutus_code == 200: # with open(dir_path+"/"+mp4_name+".mp4", 'wb') f.write(r.content)Copy the code

Download cover Art

Def def img(img_name,img_url): STR (time.strftime('%y%m%d', time.localtime())) dir_path = "/"+dir Os.mkdir (dir_path) headers = {' user-agent ': 'Mozilla/5.0 (Windows NT 10.0; Win64; X64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.105 Safari/537.36'} r = requesting. Get (img_url,) Headers =headers, stream=True) if r.status_code == 200: f.write(r.content)Copy the code

Both the video and cover art are saved to a folder named the date of the day, or automatically created if there is no such folder.

Call these two functions

Start with caption pic_URL and download the video with caption mp4_URL.Copy the code

After the execution, save the result:

You can see the cover image and video are saved successfully! The name is named after the video.

04 summary

This article explains a hand search video download technology, for beginners to learn or a good practice crawler, want to learn partners, must try ****! Be sure to try ****! Do try it!

This article source address: gitee.com/lyc96/kuais…