Preface

The text and images in this article were collected from the internet and are shared for learning and discussion only, with no commercial purpose. Copyright belongs to the original authors; if you have any concerns, please contact us and we will address them.

Goal

Scrape the song charts from the Kugou Music site.

![](https://p9-tt-ipv6.byteimg.com/origin/pgc-image/618a0b71127d4b7b8a815b36ac3adefe)
![](https://p6-tt-ipv6.byteimg.com/origin/pgc-image/6dd1b24cd5324544b52147a05399f783)

Target URL

https://www.kugou.com/yy/html/rank.html?from=homepage

Environment

Python 3.6.5

PyCharm

Crawler code

Import the tools

import requests
import re
import parsel

Request the website

# Headers copied from a browser session. The cookie is session-specific,
# so swap in the value from your own browser if requests are rejected.
headers = {
    'authority': 'wwwapi.kugou.com',
    'cookie': (
        'kg_mid=ac3836df72c523f46a85d8a5fd90fe59; '
        'kg_dfid=3ve7aQ2XyGmN0yE3uv3WcaHs; '
        'Hm_lvt_aedee6983d4cfc62f509129360d6bb3d = 1600260110160312, 707; '
        'kg_dfid_collect=d41d8cd98f00b204e9800998ecf8427e; '
        'kg_mid_temp=ac3836df72c523f46a85d8a5fd90fe59; '
        'Hm_lpvt_aedee6983d4cfc62f509129360d6bb3d=1602312738'
    ),
    'referer': 'https://www.kugou.com/song/',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36',
}
url = 'https://www.kugou.com/yy/html/rank.html'
response = requests.get(url=url, headers=headers)
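
The crawl loop later in the article iterates over `html_data` as pairs of chart URL and chart name, but the code shown never builds it. Below is a minimal sketch of one way to collect those pairs with `re`. The sample markup, the `CHART_LINK` pattern, the `example.com` URLs, and the helper name `extract_charts` are all assumptions for illustration; inspect the real page source and adjust the pattern before relying on it.

```python
import re

# Hypothetical sample of chart links; the real page markup may differ.
SAMPLE_HTML = '''
<ul>
  <li><a title="Chart A" href="https://example.com/rank/1.html">Chart A</a></li>
  <li><a title="Chart B" href="https://example.com/rank/2.html">Chart B</a></li>
</ul>
'''

# Assumed pattern: an <a> tag whose title attribute comes right before href.
CHART_LINK = re.compile(r'<a\s+title="([^"]*)"\s+href="([^"]*)"')


def extract_charts(html):
    """Return (page_url, name) pairs, the order the crawl loop expects."""
    return [(href, title) for title, href in CHART_LINK.findall(html)]


html_data = extract_charts(SAMPLE_HTML)
for page_url, name in html_data:
    print(name, page_url)
```

With the live site, `response.text` from the request above would take the place of `SAMPLE_HTML`.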

Parsing website data

def func(url):
    response = requests.get(url=url, headers=headers)
    # Let requests guess the encoding from the body (the attribute is `encoding`).
    response.encoding = response.apparent_encoding
    hashs = re.findall('"Hash":"(.*?)"', response.text, re.S)
    album_ids = re.findall('"album_id":(.*?),"', response.text, re.S)
    FileNames = re.findall('"FileName":"(.*?)"', response.text, re.S)
    data = zip(hashs, album_ids, FileNames)
    for i in data:
        file_hash = i[0]
        album_id = i[1]
        # Names are stored with \uXXXX escapes; decode them to readable text.
        FileName = i[2].encode('utf-8').decode('unicode_escape')
        # print(file_hash, album_id, FileName)
        download_url = 'https://wwwapi.kugou.com/yy/index.php'
        params = {
            'r': 'play/getdata',
            'callback': 'jQuery19107150201841602037_1602314563329',
            'hash': file_hash,
            'album_id': album_id,
            'dfid': '3ve7aQ2XyGmN0yE3uv3WcaHs',
            'mid': 'ac3836df72c523f46a85d8a5fd90fe59',
            'platid': '4',
            '_': '1602312793005',
        }
        # The request to download_url and the call to download() are not shown here.


# html_data is not defined in the code shown; the loop expects
# (page_url, name) pairs, one per chart on the rank page.
for i in html_data:
    page_url = i[0]
    name = i[1]
    print(page_url)
    func(page_url)
    print('==========================Scraping {}=========================='.format(name))
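
`func` builds `params` for the `play/getdata` endpoint, but the code shown stops before the request is sent. Because `params` carries a `callback`, the reply would normally be JSONP, that is, JSON wrapped in a call to that function. A rough sketch of unwrapping such a reply follows. The helper name `unwrap_jsonp`, the sample reply, and the `data` / `play_url` field names are assumptions, not taken from the article; check a real response before depending on them.

```python
import json
import re


def unwrap_jsonp(text):
    """Strip a `callback(...)` wrapper and parse the JSON inside it."""
    match = re.match(r'^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$', text, re.S)
    if match is None:
        raise ValueError('not a JSONP response')
    return json.loads(match.group(1))


# Hypothetical reply. A real one would come from something like
# requests.get(download_url, params=params, headers=headers).text
sample = (
    'jQuery19107150201841602037_1602314563329('
    '{"status": 1, "data": {"play_url": "https://example.com/song.mp3"}});'
)

payload = unwrap_jsonp(sample)
play_url = payload['data']['play_url']  # assumed field names
print(play_url)
```

The resulting `play_url` and the decoded `FileName` are what a call such as `download(play_url, FileName)` would need.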

Save the data

def download(url, title):
    # Replace 'save address' with the directory where the files should go.
    filename = 'save address' + title + '.mp3'
    response = requests.get(url=url, headers=headers)
    with open(filename, mode='wb') as f:
        f.write(response.content)
    print(title)
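
Song titles scraped from the page can contain characters that are not allowed in file names, such as `/` or `:`, which would make `open` fail or write to an unexpected path. One possible safeguard is to clean the title before building the path. The helper names `safe_filename` and `build_path` and the replacement rules are assumptions for illustration, not part of the original crawler.

```python
import os
import re


def safe_filename(title, replacement='_'):
    """Replace characters that are invalid in Windows or POSIX file names."""
    cleaned = re.sub(r'[\\/:*?"<>|\x00-\x1f]', replacement, title)
    # Leading or trailing spaces and dots cause trouble on Windows.
    cleaned = cleaned.strip(' .')
    return cleaned or 'untitled'


def build_path(save_dir, title):
    """Join the save directory and the cleaned title into an .mp3 path."""
    return os.path.join(save_dir, safe_filename(title) + '.mp3')


print(safe_filename('AC/DC - Back: In Black?'))  # AC_DC - Back_ In Black_
```

Inside `download`, `build_path('save address', title)` could stand in for the string concatenation.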

Running the code produces the results shown below.

![](https://p9-tt-ipv6.byteimg.com/origin/pgc-image/04ac93fc45174b86b17673320930b514)
![](https://p9-tt-ipv6.byteimg.com/origin/pgc-image/7c4dc2ea868b4adabde75b1b0fdcbcc2)

Did you follow along?