preface
The text and pictures in this article come from the network, only for learning, exchange, do not have any commercial purposes, copyright belongs to the original author, if you have any questions, please contact us to deal with
PS: If you need Python learning materials, please click on the link below to obtain them
Free Python learning materials and group communication solutions click to join
The target
Climb the cool dog music station song chart
! [](https://p9-tt-ipv6.byteimg.com/origin/pgc-image/618a0b71127d4b7b8a815b36ac3adefe)
! [](https://p6-tt-ipv6.byteimg.com/origin/pgc-image/6dd1b24cd5324544b52147a05399f783)
The target address
https://www.kugou.com/yy/html/rank.html?from=homepage
Copy the code
The environment
Python3.6.5
pycharm
The crawler code
Transferred to tool
import requests
import re
import parsel
Copy the code
Request the website
headers = { 'authority': 'wwwapi.kugou.com', 'cookie': 'kg_mid=ac3836df72c523f46a85d8a5fd90fe59; kg_dfid=3ve7aQ2XyGmN0yE3uv3WcaHs; Hm_lvt_aedee6983d4cfc62f509129360d6bb3d = 1600260110160312, 707; kg_dfid_collect=d41d8cd98f00b204e9800998ecf8427e; kg_mid_temp=ac3836df72c523f46a85d8a5fd90fe59; Hm_lpvt_aedee6983d4cfc62f509129360d6bb3d=1602312738', 'referer': 'https://www.kugou.com/song/', 'user-agent': 'the Mozilla / 5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36', } url = 'https://www.kugou.com/yy/html/rank.html' response = requests.get(url=url, headers=headers)Copy the code
Parsing website data
def func(url): response = requests.get(url=url, headers=headers) response.encode = response.apparent_encoding hashs = re.findall('"Hash":"(.*?) "', response.text, re.S) album_ids = re.findall('"album_id":(.*?) ,"', response.text, re.S) FileNames = re.findall('"FileName":"(.*?) "', response.text, re.S) data = zip(hashs, album_ids, FileNames) for i in data: hash = i[0] album_ids = i[1] FileName = i[2].encode('utf-8').decode('unicode_escape') # print(hash, album_ids, FileName) download_url = 'https://wwwapi.kugou.com/yy/index.php' params = { 'r': 'play/getdata', 'callback': 'jQuery19107150201841602037_1602314563329', 'hash': '{}'.format(hash), 'album_id': '{}'.format(album_ids), 'dfid': '3ve7aQ2XyGmN0yE3uv3WcaHs', 'mid': 'ac3836df72c523f46a85d8a5fd90fe59', 'platid': '4', '_': '1602312793005', } for i in html_data: page_url = i[0] name = i[1] print(page_url) func(page_url) Print (" = = = = = = = = = = = = = = = = = = = = = = = = = = is climbing a song take {} = = = = = = = = = = = = = = = = = = = = = = = = '. The format (name))Copy the code
Save the data
def download(url, title): Filename = 'save address' + title + '.mp3' response = requests. Get (url=url, headers=headers) mode='wb') as f: f.write(response.content) print(title)Copy the code
Run the code and the result is shown below
! [](https://p9-tt-ipv6.byteimg.com/origin/pgc-image/04ac93fc45174b86b17673320930b514)
! [](https://p9-tt-ipv6.byteimg.com/origin/pgc-image/7c4dc2ea868b4adabde75b1b0fdcbcc2)
Did you learn?