Author: so-and-so white rice
Source: Python Technology [official ID: PYTHonAll]
Downloading Bilibili videos with Python
Bilibili (known as "B site"), with roughly 172 million monthly active users, is a popular video site among Pythonistas. For some unknown reason, videos that you have added to your favorites sometimes become unavailable, which is a good reason to keep a local copy.
Analyzing the page
First, open a video on Bilibili (www.bilibili.com/video/BV1Vh…) and press F12 to inspect it. As the figure below shows, there are several links ending in .m4s, and their response type is video/mp4.
Switch to the Elements panel and look for a JavaScript variable named window.__playinfo__. Its content matches the URLs shown in the figure above: it holds the .m4s links, so this is our target.
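To make the idea concrete, here is a minimal sketch of pulling the JSON out of window.__playinfo__ with a regular expression. The HTML string and the example.com URLs are made-up stand-ins for a real page, and the data → dash → video → baseUrl layout is taken from the code later in this article, so treat it as an illustration rather than a description of the live site.

```python
import json
import re

# Hypothetical, heavily shortened stand-in for a video page; a real page is far larger.
SAMPLE_HTML = (
    '<html><head><script>window.__playinfo__='
    '{"data": {"dash": {"video": ['
    '{"id": 80, "baseUrl": "https://example.com/video-80.m4s"}, '
    '{"id": 64, "baseUrl": "https://example.com/video-64.m4s"}]}}}'
    '</script></head></html>'
)

# Capture everything between the assignment and the closing script tag.
PLAYINFO_RE = re.compile(r'window\.__playinfo__=(.*?)</script>', re.DOTALL)


def first_video_url(html):
    """Return the first baseUrl found in window.__playinfo__, or None."""
    match = PLAYINFO_RE.search(html)
    if match is None:
        return None
    playinfo = json.loads(match.group(1))
    for item in playinfo['data']['dash']['video']:
        if 'baseUrl' in item:
            return item['baseUrl']
    return None


print(first_video_url(SAMPLE_HTML))
```

Returning None when the variable is missing keeps the caller from crashing on pages that do not embed the play information.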
Get the title and link
Fetch the video page with the requests module and parse it with the BeautifulSoup module to get the video title and link (www.bilibili.com/video/BV17K…)
import json
import re
import time

import requests
from bs4 import BeautifulSoup


# The original omits the class statement; the name BilibiliDownloader is assumed.
class BilibiliDownloader:

    def __init__(self, bv):
        # Video page address
        self.url = 'https://www.bilibili.com/video/' + bv
        # Download start time
        self.start_time = time.time()

    def get_vedio_info(self):
        try:
            headers = {
                'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36'
            }
            response = requests.get(url=self.url, headers=headers)
            if response.status_code == 200:
                bs = BeautifulSoup(response.text, 'html.parser')
                # Get the video title
                video_title = bs.find('span', class_='tit').get_text()
                # Get the video link from the window.__playinfo__ script
                pattern = re.compile(r"window\.__playinfo__=(.*?)$", re.MULTILINE | re.DOTALL)
                script = bs.find("script", string=pattern)
                result = pattern.search(script.string).group(1)
                temp = json.loads(result)
                # Take the first video link (None if no entry has a baseUrl)
                video_url = None
                for item in temp['data']['dash']['video']:
                    if 'baseUrl' in item.keys():
                        video_url = item['baseUrl']
                        break
                return {
                    'title': video_title,
                    'url': video_url
                }
        except requests.RequestException:
            print('Video link error, please replace it')
Example results:
{
    'title': "\"Jay Chou's Love Song 2.0\": quietly recalling the 20 years with Jay's company",
    'url': 'http://cn-jszj-dx-v-06.bilivideo.com/upgcxcode/34/57/214635734/214635734_nb2-1-30080.m4s?expires=1595538100&platform=pc&ssig=Q5uom_rGdPasJhHBvna8tw&oi=3027480765&trid=347f5dc41e9647e2a6dce48286d0b478u&nfc=1&nfb=maPYqpoel5MI3qOUX6YpRA==&cdnid=2725&mid=0&cip=222.186.35.71&orderid=0,3&logo=80000000'
}
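The link in the example result carries a long query string. As a small sketch, the standard urllib.parse module can split it apart for inspection. Reading the expires parameter as a Unix timestamp in seconds is my assumption, not something the article states; if it holds, the link is only valid for a limited time and should be downloaded soon after it is fetched.

```python
from datetime import datetime, timezone
from urllib.parse import urlsplit, parse_qs

# Shortened copy of the example URL (most query parameters dropped).
url = ('http://cn-jszj-dx-v-06.bilivideo.com/upgcxcode/34/57/214635734/'
       '214635734_nb2-1-30080.m4s?expires=1595538100&platform=pc')

parts = urlsplit(url)
query = parse_qs(parts.query)

# Assumption: "expires" is a Unix timestamp in seconds.
expires_at = datetime.fromtimestamp(int(query['expires'][0]), tz=timezone.utc)

print(parts.hostname)
print(expires_at.isoformat())
```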
Download the video
Download the video with the urllib.request module's urlretrieve(url, filename=None, reporthook=None) function, which saves remote data directly to a local file
    # Needs `import urllib.request` alongside the other imports.
    def download_video(self, video):
        # Replace characters that are not allowed in file names
        title = re.sub(r'[\\/:*?"<>|]', '-', video['title'])
        url = video['url']
        filename = title + '.mp4'
        # Send browser-like headers with the download request
        opener = urllib.request.build_opener()
        opener.addheaders = [('Origin', 'https://www.bilibili.com'),
                             ('Referer', self.url),
                             ('User-Agent', 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36')]
        urllib.request.install_opener(opener)
        urllib.request.urlretrieve(url=url, filename=filename)
Example results:
A video download is complete
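If you want to try the two building blocks of this step without touching the network, the following sketch may help. It cleans a title into a file name and then calls urlretrieve on a local file:// URL. The helper name safe_filename, the sample title, and the temporary file are all invented for the demonstration; a real run would pass the video link instead.

```python
import re
import tempfile
import urllib.request
from pathlib import Path


def safe_filename(title, extension='.mp4'):
    """Replace characters that Windows forbids in file names with '-'."""
    return re.sub(r'[\\/:*?"<>|]', '-', title) + extension


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the remote video: a small local file addressed via file://.
    source = Path(tmp) / 'source.bin'
    source.write_bytes(b'x' * 20000)

    target = Path(tmp) / safe_filename('Demo: part 1/2?')
    saved_path, _headers = urllib.request.urlretrieve(source.as_uri(), filename=str(target))
    saved_size = target.stat().st_size

    print(Path(saved_path).name)
    print(saved_size)
```

urlretrieve returns the local path together with the response headers, so the caller can confirm where the data ended up.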
The progress bar
A progress bar is still missing, and a download without a progress bar is a soulless download
    def schedule(self, blocknum, blocksize, totalsize):
        """
        Callback function for urllib.request.urlretrieve (needs `import sys`)
        :param blocknum: number of data blocks downloaded so far
        :param blocksize: size of each data block
        :param totalsize: size of the remote file
        :return:
        """
        percent = 100.0 * blocknum * blocksize / totalsize
        if percent > 100:
            percent = 100
        s = ('#' * round(percent)).ljust(100, '-')
        sys.stdout.write("%.2f%%" % percent + '[' + s + ']' + '\r')
        sys.stdout.flush()
Example results:
Finally, update the video download code to pass the reporthook parameter:
urllib.request.urlretrieve(url = url, filename = filename, reporthook = self.schedule)
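One case the callback does not cover: urlretrieve passes -1 as the total size when the server does not report one, which would make the percentage meaningless. Here is one possible variant, offered as a sketch rather than as the article's method. It keeps the bar rendering in a plain function so it can be checked without downloading anything; the names render_progress and report, and the width parameter, are my own.

```python
import sys


def render_progress(blocknum, blocksize, totalsize, width=50):
    """Return one line of progress text for a urlretrieve-style callback."""
    if totalsize <= 0:
        # Size unknown: report bytes transferred instead of a percentage.
        return '%d bytes downloaded' % (blocknum * blocksize)
    percent = min(100.0, 100.0 * blocknum * blocksize / totalsize)
    filled = round(width * percent / 100)
    bar = ('#' * filled).ljust(width, '-')
    return '%6.2f%% [%s]' % (percent, bar)


def report(blocknum, blocksize, totalsize):
    """reporthook-compatible wrapper that redraws the same console line."""
    sys.stdout.write('\r' + render_progress(blocknum, blocksize, totalsize))
    sys.stdout.flush()


# Simulate four 8 KiB blocks of a 20,000-byte file.
for n in range(4):
    print(render_progress(n, 8192, 20000, width=20))
```

Passing reporthook=report to urlretrieve would then redraw the bar on a single console line as blocks arrive.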
Conclusion
That completes a simple Bilibili video download tool. If you are interested, try downloading Bilibili bangumi (anime series) next; they seem to work differently from ordinary videos.