Preface

The text and images in this article come from the internet and are for learning and discussion only, with no commercial purpose. If you have any concerns, please contact us and we will address them.

Station B (Bilibili) is a well-known Chinese video site with bullet-screen (danmaku) comments. It is known for timely releases of new anime, an ACG community atmosphere, and creative uploaders (UP hosts). Each video on the site is stored as two separate streams: the picture and the audio.

Today I’m going to show you how to download and merge the videos from Station B.

Introduction to Python data analysis

https://www.bilibili.com/video/BV1LX4y1u7VA

Environment:

  • Python 3.6
  • PyCharm
  • requests (third-party)
  • re (standard library)
  • json (standard library)
  • subprocess (standard library)

Parse web pages

Target Page Analysis

Station B serves the video picture and the audio as separate streams, and the audio URL and video URL are both in ****

Extract the data

1. Match the embedded data in the page source with a regular expression

2. re.findall returns a list, so take the matched string out of the list by index

3. Convert the string to JSON data

4. Look up the video URL and the audio URL by dictionary keys
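The four steps above can be sketched on a small in-memory string. The sample HTML and the example.com URLs below are invented for illustration; only the `window.__playinfo__` pattern and the `data` / `dash` / `backupUrl` keys come from the article's own code, and the real page may be structured differently.

```python
import json
import re

# Hypothetical page fragment, invented for illustration only.
SAMPLE_HTML = (
    '<script>window.__playinfo__={"data": {"dash": {'
    '"video": [{"backupUrl": ["https://example.com/v.m4s"]}], '
    '"audio": [{"backupUrl": ["https://example.com/a.m4s"]}]}}}</script>'
)


def extract_urls(html):
    """Return (audio_url, video_url) parsed from the embedded JSON."""
    # Steps 1-2: findall returns a list; check it and take the first match.
    matches = re.findall(r'<script>window\.__playinfo__=(.*?)</script>', html)
    if not matches:
        raise ValueError('playinfo script not found')
    # Step 3: convert the JSON string into Python dicts and lists.
    play_info = json.loads(matches[0])
    # Step 4: walk the dictionary down to the two URLs.
    dash = play_info['data']['dash']
    return dash['audio'][0]['backupUrl'][0], dash['video'][0]['backupUrl'][0]


audio_url, video_url = extract_urls(SAMPLE_HTML)
print(audio_url)
print(video_url)
```

Checking the match list before indexing it gives a readable error instead of an IndexError if the page layout changes.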

The crawler code

Import the modules

import requests
import re  # regular expressions
import pprint
import json
import subprocess

Request header

headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.105 Safari/537.36'
}

Request the data

def send_request(url):
    response = requests.get(url=url, headers=headers)
    return response

Parsing video data

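def get_video_data(html_data):
    """Parse the video data."""

    # Extract the video title (the closing </span> is assumed; it was lost in the source)
    title = re.findall(r'<span class="tit">(.*?)</span>', html_data)[0]
    # print(title)

    # Extract the JSON data embedded in the page
    json_data = re.findall(r'<script>window\.__playinfo__=(.*?)</script>', html_data)[0]
    # print(json_data)  # json_data is still a string here
    json_data = json.loads(json_data)
    pprint.pprint(json_data)

    # Extract the audio URL
    audio_url = json_data['data']['dash']['audio'][0]['backupUrl'][0]
    print('Parsed audio URL:', audio_url)

    # Extract the video URL
    video_url = json_data['data']['dash']['video'][0]['backupUrl'][0]
    print('Parsed video URL:', video_url)

    video_data = [title, audio_url, video_url]
    return video_data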
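If the parsed title is later used as a file name, it may contain characters that Windows rejects in file names. The helper below is a hypothetical addition, not part of the original script: a minimal sketch that replaces those characters before the title is passed on.

```python
import re


def safe_filename(title, replacement='_'):
    """Replace characters that Windows does not allow in file names."""
    cleaned = re.sub(r'[\\/:*?"<>|]', replacement, title).strip()
    # Fall back to a fixed name if nothing usable is left.
    return cleaned or 'video'


print(safe_filename('Python: data analysis / part 1?'))
```

The fallback name 'video' is an arbitrary choice for this sketch.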

Save the data

def save_data(file_name, audio_url, video_url):
    # Request the audio and video data
    audio_data = send_request(audio_url).content
    print('Requesting video data')
    video_data = send_request(video_url).content

    with open(file_name + '.mp3', mode='wb') as f:
        f.write(audio_data)
        print('Saving audio data')

    with open(file_name + '.mp4', mode='wb') as f:
        f.write(video_data)
        print('Saving video data')
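Reading `.content` loads a whole file into memory, which can be heavy for long videos. As an alternative, the sketch below writes the data chunk by chunk. `write_chunks` is a hypothetical helper, and the demonstration uses in-memory byte strings rather than a live response so that it runs without network access.

```python
import os
import tempfile


def write_chunks(chunks, path):
    """Write an iterable of byte chunks to path and return the byte count."""
    total = 0
    with open(path, mode='wb') as f:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                f.write(chunk)
                total += len(chunk)
    return total


# Demonstration with in-memory chunks instead of a live response.
demo_path = os.path.join(tempfile.gettempdir(), 'chunk_demo.bin')
print(write_chunks([b'abc', b'', b'de'], demo_path))
```

With requests, the chunks could come from `requests.get(url, headers=headers, stream=True).iter_content(chunk_size=1024 * 1024)`; the chunk size here is only an example value.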

Merge the data

def merge_data(video_name):
    print('Start merging video:', video_name)
    # ffmpeg -i video.mp4 -i audio.wav -c:v copy -c:a aac -strict experimental output.mp4
    COMMAND = f'ffmpeg -i {video_name}.mp4 -i {video_name}.mp3 -c:v copy -c:a aac -strict experimental output.mp4'
    # wait() blocks until ffmpeg exits, so the message below is printed only after merging ends
    subprocess.Popen(COMMAND, shell=True).wait()
    print('Finished merging:', video_name)
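A command built as a single string breaks when the file name contains spaces or shell metacharacters. One possible alternative, sketched here with the hypothetical helpers `build_merge_command` and `merge`, passes the same ffmpeg options as an argument list so that no shell quoting is needed.

```python
import subprocess


def build_merge_command(video_path, audio_path, output_path):
    """Build the ffmpeg argument list for merging one video and one audio file."""
    return [
        'ffmpeg',
        '-i', video_path,
        '-i', audio_path,
        '-c:v', 'copy',            # keep the video stream as-is
        '-c:a', 'aac',             # encode the audio as AAC
        '-strict', 'experimental',
        output_path,
    ]


def merge(video_path, audio_path, output_path):
    """Run ffmpeg and wait for it; check=True raises on a non-zero exit code."""
    subprocess.run(build_merge_command(video_path, audio_path, output_path), check=True)


print(build_merge_command('my video.mp4', 'my video.mp3', 'output.mp4'))
```

Calling `merge(...)` requires ffmpeg to be installed and on PATH; the print line only shows the argument list and does not run ffmpeg.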

Merge video and audio

The tool used here is FFmpeg, a set of open-source programs for recording, converting, and streaming digital audio and video.

Download and unpack it, then add its location to the system environment variables so that the ffmpeg command can be run from any directory. On Windows:

1. Right-click My Computer and choose Properties

2. Select Advanced System Settings

3. Select Environment Variables

4. Edit the Path variable, choose New, and paste the path of the folder that contains ffmpeg.exe
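To confirm that the environment variable took effect, a quick check from Python can help. This is a minimal sketch using the standard library's `shutil.which`; the function name `ffmpeg_available` is made up for this example. A terminal or IDE opened before the change may need to be restarted to pick up the new PATH.

```python
import shutil


def ffmpeg_available():
    """Return True if an ffmpeg executable can be found on PATH."""
    return shutil.which('ffmpeg') is not None


if ffmpeg_available():
    print('ffmpeg found at', shutil.which('ffmpeg'))
else:
    print('ffmpeg is not on PATH; check the environment variable settings')
```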