1. Find the page of the artist whose songs we want to crawl

2. Get songmid
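As a rough sketch of this step: the code in step 6 cuts the songmid out of each song link with the slice [22:-5], which suggests the links end in "<songmid>.html". Assuming that shape, the same value can be read from the last path segment without counting characters. The helper name extract_songmid and the sample link are made up for illustration.

```python
from urllib.parse import urlparse
import posixpath


def extract_songmid(href):
    """Return the last path segment of a song link, without its extension.

    Assumption: song links look like '//y.qq.com/n/yqq/song/<songmid>.html'.
    """
    path = urlparse(href).path              # '/n/yqq/song/<songmid>.html'
    filename = posixpath.basename(path)     # '<songmid>.html'
    songmid, _extension = posixpath.splitext(filename)
    return songmid


# Hypothetical link; the songmid here is a placeholder, not a real song.
sample_href = "//y.qq.com/n/yqq/song/000abcDEF.html"
print(extract_songmid(sample_href))
```

If the link layout changes, this version keeps working as long as the songmid is still the file name, whereas a fixed slice silently returns the wrong substring.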

3. Go to the song's play page and find the audio address

4. View the parameters for the audio address
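One way to look at these parameters outside the browser is to split the audio address with the standard library. This is only a sketch: the address below is a placeholder, not a real audio URL, and the parameter names in it (guid, vkey, uin) are borrowed from the names used elsewhere in this post.

```python
from urllib.parse import urlparse, parse_qs


def audio_url_params(audio_url):
    """Return the query parameters of a URL as a flat {name: value} dict."""
    query = urlparse(audio_url).query
    parsed = parse_qs(query, keep_blank_values=True)
    # parse_qs returns a list per key; keep the first value of each.
    return {name: values[0] for name, values in parsed.items()}


# Placeholder address for illustration only.
example = "http://example.invalid/song.m4a?guid=5831199011&vkey=PLACEHOLDER_VKEY&uin=0"
params = audio_url_params(example)
for name, value in params.items():
    print(name, "=", value)
```

Comparing the output for two different songs makes it easy to spot which parameters stay fixed and which ones, like vkey, change per request.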

5. How to find the vkey parameter

Which parameters does the request that returns the download address and the vkey need?
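The request body used in step 6 can be sketched as a small builder, which makes it easier to see which fields vary. The field names and values are copied from that code; the function names build_vkey_payload and encode_payload are my own, and treating songmid as the only field that has to change per song is an assumption.

```python
import json
from urllib.parse import quote, unquote


def build_vkey_payload(songmid, guid="5831199011", uin="2325794997"):
    """Build the JSON structure that is sent to request the vkey."""
    return {
        "req": {"module": "CDN.SrfCdnDispatchServer", "method": "GetCdnDispatch",
                "param": {"guid": guid, "calltype": 0, "userip": ""}},
        "req_0": {"module": "vkey.GetVkeyServer", "method": "CgiGetVkey",
                  "param": {"guid": guid, "songmid": [songmid], "songtype": [0],
                            "uin": uin, "loginflag": 1, "platform": "20"}},
        "comm": {"uin": int(uin), "format": "json", "ct": 24, "cv": 0},
    }


def encode_payload(payload):
    """Serialize the payload compactly and percent-encode it for use in a URL."""
    return quote(json.dumps(payload, separators=(",", ":")))


payload = build_vkey_payload("PLACEHOLDER_SONGMID")  # placeholder songmid
encoded = encode_payload(payload)
# Decoding the string again gives back the same structure.
assert json.loads(unquote(encoded)) == payload
```

Percent-encoding the JSON before appending it to the address avoids problems with the quotes and braces it contains.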

6. Now that we have worked out the encrypted parameters, we can start writing code. Note: the web version of QQ Music only offers a small selection of songs; everything else can only be listened to in the client.

import json
import random

import requests
from lxml import etree

# Pool of request headers; one entry is picked at random for each request
headers = [
    {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.111 Safari/537.36'},
    {'User-Agent': 'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Win64; x64; Trident/5.0; .NET CLR 3.5.30729; .NET CLR 3.0.30729; .NET CLR 2.0.50727; Media Center PC 6.0)'},
    {'User-Agent': 'Mozilla/5.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; .NET CLR 1.0.3705; .NET CLR 1.1.4322)'},
    {'User-Agent': 'Mozilla/4.0 (compatible; MSIE 7.0b; Windows NT 5.2; .NET CLR 1.1.4322; .NET CLR 2.0.50727; InfoPath.2; .NET CLR 3.0.04506.30)'},
    {'User-Agent': 'Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN) AppleWebKit/523.15 (KHTML, like Gecko, Safari/419.3) Arora/0.3 (Change: 287 c9dfb30)'},
    {'User-Agent': 'Mozilla/5.0 (X11; U; Linux; en-US) AppleWebKit/527+ (KHTML, like Gecko, Safari/419.3) Arora/0.6'},
    {'User-Agent': 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.2pre) Gecko/20070215 K-Ninja/2.1.1'},
    {'User-Agent': 'Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.9) Gecko/20080705 Firefox/3.0 Kapiko/3.0'},
    {'User-Agent': 'Mozilla/5.0 (X11; Linux i686; U;) Gecko/20070322 Kazehakase/0.4.5'},
]

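def get_songmid(url):
    '''
    Get the songmid and title of every song on the artist page, plus the singer's name.
    :param url: address of the artist's page
    :return: None
    '''
    response = requests.get(url=url, headers=random.choice(headers), timeout=5).text
    page_html = etree.HTML(response)
    # Singer's name, taken from the page heading
    author = page_html.xpath('/html/body/div[2]/div[1]/div/div[1]/h1[2]/@title')[0]
    # One <a> element per song in the song list
    a_list = page_html.xpath('/html/body/div[2]/div[2]/div[1]/div[2]/ul[2]/li/div/div[3]/span/a')
    for a in a_list:
        # Keep only the songmid from the link: drop the leading path and the trailing extension
        songmid = a.xpath('./@href')[0][22:-5]
        name = a.xpath('./@title')[0]
        get_vkey(songmid, name, author)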

def get_vkey(songmid, name, author):
    '''
    Request the vkey and the download path for one song.
    :param songmid: ID of the song
    :param name: song title
    :param author: singer's name
    :return: None
    '''
    data = {"req": {"module": "CDN.SrfCdnDispatchServer", "method": "GetCdnDispatch",
                    "param": {"guid": "5831199011", "calltype": 0, "userip": ""}},
            "req_0": {"module": "vkey.GetVkeyServer", "method": "CgiGetVkey",
                      "param": {"guid": "5831199011", "songmid": [songmid], "songtype": [0],
                                "uin": "2325794997", "loginflag": 1, "platform": "20"}},
            "comm": {"uin": 2325794997, "format": "json", "ct": 24, "cv": 0}}
    # Request address found in step 5; the JSON parameters are appended to it
    url_vkey_get = "u.y.qq.com/cgi-bin/mus…"
    ret = requests.get(url=url_vkey_get + json.dumps(data), headers=random.choice(headers)).json()
    # purl is the path of the audio file, including the vkey; it is empty when the song is not available
    purl = ret['req_0']['data']['midurlinfo'][0]['purl']
    if purl:
        download_music(purl, name, author)

def download_music(purl, name, author):
    # Download the audio data and save it as "<song title>-<singer>.mp3"
    ret = requests.get('http://106.120.158.153/amobile.music.tc.qq.com/' + purl, headers=random.choice(headers)).content
    with open(f'{name}-{author}.mp3', 'wb') as f:
        f.write(ret)
    print(f'{name}-{author}', 'download done')

if __name__ == '__main__':
    # Address of the artist's page, here Zhao Lei's page on QQ Music
    url = '<artist page URL>'
    get_songmid(url)
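The lookup ret['req_0']['data']['midurlinfo'][0]['purl'] in get_vkey raises an exception as soon as one level is missing from the response. A more defensive variant might look like the sketch below. The helper name extract_purl is made up, and the sample responses are hand-written dictionaries that only imitate the nesting used in the code above; they are not captured from the real service.

```python
def extract_purl(ret):
    """Return the purl from a vkey response, or None if it is missing or empty."""
    try:
        infos = ret["req_0"]["data"]["midurlinfo"]
    except (KeyError, TypeError):
        return None
    if not infos or not isinstance(infos[0], dict):
        return None
    # An empty string is treated like a missing value.
    return infos[0].get("purl") or None


# Hand-written sample responses (hypothetical values).
playable = {"req_0": {"data": {"midurlinfo": [{"purl": "PLACEHOLDER.m4a?vkey=PLACEHOLDER"}]}}}
unavailable = {"req_0": {"data": {"midurlinfo": [{"purl": ""}]}}}
malformed = {"req_0": {}}

for sample in (playable, unavailable, malformed):
    print(extract_purl(sample))
```

With a helper like this, get_vkey can skip songs that have no download path instead of stopping the whole crawl on the first unexpected response.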

7. Results

Next time I’ll teach you to crawl paid music!
