Using Node to download AcFun videos: parsing the AcFun video API, multi-threaded download with aria2c, and merging with FFmpeg
This article shows how to download AcFun videos with Node. The process has three main steps: parsing the AcFun (A station) video API, calling aria2c to download the m3u8 segments in multiple threads, and calling FFmpeg to merge them into one video.
Usage
aria2c (the deb package is named aria2) and FFmpeg must be installed on the system first:
sudo apt install aria2 ffmpeg
Install the project's npm package acfun-video-cli globally, or clone the czzonet/acfun-video-cli project, then build and run it:
yarn global add acfun-video-cli
Download by passing the video URL. The content is saved to a new download folder in the current directory, including the segments and the merged video file:
acfun-video-cli https://www.acfun.cn/v/ac4621380
Result (omitted parts are shown as ...):
------
Parse url
ok
------
Parse m3u8
1080p: https://tx-safety-video.acfun.cn/mediacloud/acfun/acfun_video/hls/3HlXWWGOvsJ3D9Vhsn2QzbWPzp9OwtD40Yk9bk8v9t7Khv6leh44hGnw-Qqx9_KP.m3u8?pkey=AALA88Sf3Prmclff8_Ki5E0wlxj0Gam0_NN5bLvhUbCS2_88ypokmdH2Kf1wvzojL4pZJVjDn2m_iRkcrw-4hhRYEn5x01YOyfxYlJ9oOmeMtw4QA_UMZFq5MHQMp7BQZOkIFPPc7oBI0ABtWSSihiKp9WkKUklJibYCStx4Ego_u8MlOMaHONKAAivGjrCsrZap0sO3nuqV5-pThp_LE_WyXImXmfUSFbBkT3vLCWujKw&safety_id=AALdip3SIjwDZfuuv1y8iHA4
ok
------
Download ts videos
09/09 15:53:10 [NOTICE] Downloading 29 item(s)
...
Status Legend:
(OK):download completed.
ok
------
Parse url
ffmpeg version 4.2.4-1ubuntu0.1 Copyright (c) 2000-2020 the FFmpeg developers
...
Stream mapping:
  Stream #0:0 -> #0:0 (copy)
  Stream #0:1 -> #0:1 (copy)
Press [q] to stop, [?] for help
frame= 3000 fps=0.0 q=-1.0 size=   28160kB time=00:02:01.35 bitrate=1901.0kbits/s speed= 238x
frame= 3520 fps=0.0 q=-1.0 Lsize=   33639kB time=00:02:22.36 bitrate=1935.7kbits/s speed= 243x
video:31340kB audio:2223kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.227712%
ok
Development
Initialization
Create a new TS project and add an entry file named index.ts.
Create index.ts:
async function main() {}
main().then();
Open a terminal and install the type declarations for Node's built-in modules; the Node API will be used for network requests, file operations and so on:
yarn add -D @types/node
Explanation:
- -D: the type declarations are only needed in the development environment.
Parsing the AcFun video API
Parse the AcFun video API to get the m3u8 file addresses for the video's different qualities. The request URL looks like this:
https://www.acfun.cn/v/ac4621380?quickViewId=videoInfo_new&ajaxpipe=1
Explanation:
- https://www.acfun.cn/v/ac4621380: the original video address.
- ?quickViewId=videoInfo_new&ajaxpipe=1: fixed request parameters.
The response is a messy HTML payload; extract the JSON part and parse it into a videoInfo object.
{"html":"<script class=\"videoInfo\">\n window.pageInfo = window.videoInfo = {\"currentVideoId\":6291551,...,\"priority\":0}</script><script class=\"videoResource\">\n window.videoResource = {}</script><div class=\"left-column\">\n\n...
Explanation:
- The JSON inside the script tag holds the video details; cut it out and parse it into a videoInfo object.
- The JSON is escaped with \, and a nested layer of JSON inside it is escaped a second time, so it has to be unescaped accordingly: replace \\" with \", then replace \" with ".
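As a small illustration of the two replacement passes, the sketch below applies them to a made-up fragment. The sample string and the function name unescapeNestedJson are assumptions for illustration and are not taken from the real response.

```typescript
// Hypothetical helper: undo one level of backslash escaping on double quotes.
// Pass 1 turns \\" into \" and pass 2 turns \" into ".
function unescapeNestedJson(str: string): string {
  return str.replace(/\\\\"/g, '\\"').replace(/\\"/g, '"');
}

// Made-up sample in the same shape as the response: the outer quotes are
// escaped once (\"), the quotes of the nested JSON string are escaped twice (\\\").
const sample = String.raw`{\"id\":1,\"ksPlayJson\":\"{\\\"url\\\":\\\"a.m3u8\\\"}\"}`;

// After unescaping, the outer object is plain JSON and the nested
// ksPlayJson value is still a JSON string that can be parsed separately.
const sampleVideoInfo = JSON.parse(unescapeNestedJson(sample));
const sampleKsPlay = JSON.parse(sampleVideoInfo.ksPlayJson);
console.log(sampleVideoInfo.id, sampleKsPlay.url);
```

The order of the passes matters: running only the second pass would leave a stray backslash in front of the closing quotes of the nested string.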
Partial structure of videoInfo:
- videoInfo
  - currentVideoInfo
    - ksPlayJson: a JSON string with playback information such as addresses; parse it to get the ksPlay object.
Partial structure of ksPlay:
- ksPlay
  - adaptationSet[0]: possibly reserved for different versions; originally an object, now an array with one element.
    - representation: an array of objects for the different qualities, in descending order of quality.
      - url: the m3u8 link
      - qualityType: the quality, such as 1080p or 720p
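The structure above could be written down as TypeScript types, which also makes it easy to pick a link by quality. This is only a sketch: the field names come from the description above, while the interface names, the pickRepresentation helper and the sample data are hypothetical.

```typescript
// Field names follow the structure described above; the interface names
// and pickRepresentation are hypothetical.
interface Representation {
  url: string; // m3u8 link
  qualityType: string; // e.g. "1080p", "720p"
}

interface KsPlay {
  adaptationSet: { representation: Representation[] }[];
}

// Return the entry for the wanted quality, or fall back to the first entry,
// which is the highest quality if the array is sorted in descending order.
function pickRepresentation(ksPlay: KsPlay, wanted: string): Representation {
  const list = ksPlay.adaptationSet[0].representation;
  if (list.length === 0) {
    throw new Error("[E] No representation found.");
  }
  return list.find((r) => r.qualityType === wanted) ?? list[0];
}

// Made-up data for illustration only.
const sampleKsPlayData: KsPlay = {
  adaptationSet: [
    {
      representation: [
        { url: "https://example.com/1080p.m3u8", qualityType: "1080p" },
        { url: "https://example.com/720p.m3u8", qualityType: "720p" },
      ],
    },
  ],
};

console.log(pickRepresentation(sampleKsPlayData, "720p").url);
console.log(pickRepresentation(sampleKsPlayData, "4k").qualityType);
```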
Create api.ts:
import * as https from "https";

/** Fetch a URL with a GET request and resolve with the response body as text. */
export async function getUrlData(url: string): Promise<string> {
  return new Promise((resolve, reject) => {
    https
      .get(url, (res) => {
        if (res === null) {
          reject(new Error("[E] No Response."));
          return;
        }
        const { statusCode } = res;
        const contentType = res.headers["content-type"];
        const allowTypes = [
          "application/json; charset=utf-8",
          "application/octet-stream",
          "application/vnd.apple.mpegurl",
        ];
        let error;
        if (statusCode !== 200) {
          error = new Error("[E] Response code: " + statusCode);
        } else if (
          !(contentType !== undefined && allowTypes.includes(contentType))
        ) {
          error = new Error(
            "[E] Invalid content-type.\n" +
              `Expected one of ${allowTypes} but received ${contentType}`
          );
        }
        if (error) {
          // Consume the response data to free memory, then stop here.
          res.resume();
          reject(error);
          return;
        }
        res.setEncoding("utf8");
        let rawData = "";
        res.on("data", (chunk) => (rawData += chunk));
        res.on("close", () => {
          resolve(rawData);
        });
      })
      .on("error", (error) => {
        reject(new Error("[E] Https.Get error: " + error));
      });
  });
}
Explanation:
A simple asynchronous wrapper around Node's native request for fetching network data.
- Import the https package.
- Accept the url parameter.
- Wrap the callback-style API in a Promise.
- Make a GET request.
- Check that a response exists, for error handling.
- Check the response status code and content type, for error handling. On error, call res.resume() to consume the response data and free the memory it occupies.
- Set the response encoding to utf8.
- Listen for incoming data and concatenate it.
- Listen for the close event and return the data.
- Handle GET request errors.
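The content-type check in getUrlData compares the whole header value, so it only passes when the server sends exactly one of the listed strings. A more tolerant variation, sketched here under the assumption that ignoring header parameters is acceptable for this use, compares only the media type. The helper name isAllowedContentType is hypothetical.

```typescript
// Media types accepted by the download helper (taken from allowTypes above,
// without the charset parameter).
const allowedMediaTypes = [
  "application/json",
  "application/octet-stream",
  "application/vnd.apple.mpegurl",
];

// Hypothetical helper: compare only the media type, ignoring parameters such
// as "; charset=utf-8" and differences in letter case.
function isAllowedContentType(header: string | undefined): boolean {
  if (header === undefined) {
    return false;
  }
  const mediaType = header.split(";")[0].trim().toLowerCase();
  return allowedMediaTypes.includes(mediaType);
}

console.log(isAllowedContentType("application/json; charset=utf-8")); // true
console.log(isAllowedContentType("Application/JSON;charset=UTF-8")); // true
console.log(isAllowedContentType("text/html")); // false
console.log(isAllowedContentType(undefined)); // false
```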
Create parser.ts:
import { getUrlData } from "./api";

export async function parseUrl(videoUrlAddress: string) {
  // e.g. https://www.acfun.cn/v/ac4621380?quickViewId=videoInfo_new&ajaxpipe=1
  const urlSuffix = "?quickViewId=videoInfo_new&ajaxpipe=1";
  const url = videoUrlAddress + urlSuffix;
  const raw: string = await getUrlData(url);
  // Split: cut off everything before and after the videoInfo JSON
  const strsRemoveHeader = raw.split("window.pageInfo = window.videoInfo =");
  const strsRemoveTail = strsRemoveHeader[1].split("</script>");
  const strJson = strsRemoveTail[0];
  const strJsonEscaped = escapeSpecialChars(strJson);
  /** Object videoInfo */
  const videoInfo = JSON.parse(strJsonEscaped);
  const ksPlayJson = videoInfo.currentVideoInfo.ksPlayJson;
  /** Object ksPlay */
  const ksPlay = JSON.parse(ksPlayJson);
  const representations: any[] = ksPlay.adaptationSet[0].representation;
  const urlM3u8s: string[] = representations.map((d) => d.url);
  return urlM3u8s;
}

/**
 * Remove some JSON escapes: \\" -> \" -> "
 * @param str
 */
function escapeSpecialChars(str: string) {
  return str.replace(/\\\\"/g, '\\"').replace(/\\"/g, '"');
}
Explanation:
- Import the getUrlData function to fetch network data.
- Accept the video address URL as a parameter.
- Append the request URL suffix.
- Request the raw data.
- Cut off the head and the tail to get the videoInfo JSON string.
- Unescape the JSON string.
- Parse the JSON to get the videoInfo object.
- Extract the ksPlayJson property of the videoInfo object and parse it to get the ksPlay object.
- Extract the representation property of the ksPlay object to get the array of different qualities.
- Return the array of links for the different qualities.
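In parseUrl, strsRemoveHeader[1] is undefined when the marker is missing from the response (for example if the page layout changes), which surfaces as an unrelated TypeError. A more defensive way to cut out the JSON might look like the sketch below; the helper name extractBetween and the sample input are hypothetical.

```typescript
// Hypothetical helper: return the text between the first occurrence of
// `start` and the next occurrence of `end`, or throw a descriptive error.
function extractBetween(raw: string, start: string, end: string): string {
  const from = raw.indexOf(start);
  if (from === -1) {
    throw new Error(`[E] Start marker not found: ${start}`);
  }
  const contentStart = from + start.length;
  const to = raw.indexOf(end, contentStart);
  if (to === -1) {
    throw new Error(`[E] End marker not found: ${end}`);
  }
  return raw.slice(contentStart, to);
}

// Made-up input in the same shape as the (already unescaped) response.
const rawSample =
  '<script>window.pageInfo = window.videoInfo = {"id":1}</script><div>';
const jsonPart = extractBetween(
  rawSample,
  "window.pageInfo = window.videoInfo =",
  "</script>"
);
console.log(JSON.parse(jsonPart).id);
```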
Modify index.ts and add to main:
const url = `https://www.acfun.cn/v/ac4621380`;
console.log("\n------\nParse url");
const m3u8Urls = await parseUrl(url);
console.log("ok");
Explanation:
- An example video address.
- Call the parseUrl function to parse the m3u8 playback addresses for all qualities.
Downloading the m3u8 segments with multi-threaded aria2c
First choose the m3u8 link of one quality, such as 1080p; then download and parse the m3u8 file to extract the array of download links; finally call aria2c to download them in multiple threads.
Parsing the m3u8 file
Modify parser.ts and add:
export async function parseM3u8(m3u8Url: string) {
  const m3u8File = await getUrlData(m3u8Url);
  /** Separate the ts file links */
  const rawPieces = m3u8File.split(/\n#EXTINF:.{8},\n/);
  /** Drop the header */
  const m3u8RelativeLinks = rawPieces.slice(1);
  /** Patch the tail: remove the redundant end marker */
  const patchedTail = m3u8RelativeLinks[m3u8RelativeLinks.length - 1].split(
    "\n"
  )[0];
  m3u8RelativeLinks[m3u8RelativeLinks.length - 1] = patchedTail;
  /** Full links: prepend the prefix shared with m3u8Url */
  const m3u8Prefix = m3u8Url.split("/").slice(0, -1).join("/");
  // The prefix has no trailing slash, so join with "/"
  const m3u8FullUrls = m3u8RelativeLinks.map((d) => m3u8Prefix + "/" + d);
  /** File names as saved by aria2c: the last part of the URL, without the URL parameters (everything after ?) */
  const tsNames = m3u8RelativeLinks.map((d) => d.split("?")[0]);
  /** Folder name: the file name without the segment number at the end */
  const outputFolderName = tsNames[0].slice(0, -9);
  /** Name of the final merged file, with the common mp4 suffix */
  const outputFileName = outputFolderName + ".mp4";
  return {
    m3u8FullUrls,
    tsNames,
    outputFolderName,
    outputFileName,
  };
}
Explanation:
- Accept the m3u8 file address URL as a parameter.
- Download the m3u8 file.
- Separate the ts file links.
- Remove the redundant header.
- Patch the tail to remove the redundant end marker.
- This yields the array of relative addresses.
- Generate the array of full addresses by adding the prefix shared with m3u8Url.
- Generate the file names that aria2c saves to: the last part of the URL, without the trailing URL parameters (everything after ?).
- Generate the folder name by removing the segment number at the end of the file name.
- Generate the name of the final merged file, with the common mp4 suffix.
- Return the generated information.
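The .{8} in the split pattern assumes that the duration text after #EXTINF: is always exactly eight characters long. An alternative that does not depend on this, sketched here as a different technique from the one in parseM3u8, reads the playlist line by line and keeps every line that is not a tag. The helper name extractSegmentUris and the sample playlist are made up for illustration.

```typescript
// Hypothetical helper: collect the segment URIs from the text of a media
// playlist. In an m3u8 file, lines starting with "#" are tags or comments,
// blank lines are ignored, and every other line is a URI.
function extractSegmentUris(m3u8Text: string): string[] {
  return m3u8Text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line !== "" && !line.startsWith("#"));
}

// Made-up playlist for illustration only.
const samplePlaylist = [
  "#EXTM3U",
  "#EXT-X-VERSION:3",
  "#EXT-X-TARGETDURATION:5",
  "#EXTINF:4.966667,",
  "video.00000.ts?pkey=abc",
  "#EXTINF:5.000000,",
  "video.00001.ts?pkey=abc",
  "#EXT-X-ENDLIST",
  "",
].join("\n");

const segmentUris = extractSegmentUris(samplePlaylist);
// Same idea as tsNames above: drop the URL parameters after "?".
const segmentNames = segmentUris.map((uri) => uri.split("?")[0]);
console.log(segmentUris.length, segmentNames);
```

With this approach no separate tail patch is needed, because #EXT-X-ENDLIST is itself a tag line and is filtered out.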
Multi-threaded download with aria2c
Install aria2c:
sudo apt install aria2
Create runShell.ts:
import { spawn } from "child_process";

/**
 * Run a shell command
 * @param command The shell command to execute
 * @param args Shell arguments
 * @param options Shell options
 * @description Example: ```ts readUpdateOutputFromShell("sar", ["-n", "DEV", "1"]) ```
 */
export const runShell = async (
  command: string,
  args: readonly string[],
  options: ShellOption
) =>
  new Promise<void>((resolve, reject) => {
    const runProcess = spawn(command, args, {
      stdio: "inherit",
      cwd: options.cwd ? options.cwd : process.cwd(),
      env: process.env,
      detached: true,
    });
    /** Resolve once the child process has closed */
    runProcess.on("close", (code) => {
      resolve();
    });
  });

type ShellOption = {
  cwd?: string;
};
Explanation:
A simple asynchronous wrapper for running system commands in a child process from Node.
- Import the spawn function.
- stdio: "inherit": use the parent process's stdin, stdout and stderr directly.
- cwd: options.cwd: the directory in which the command runs; if it is not set, the current working directory is used.
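The wrapper above resolves regardless of how the command ended. If later steps should stop when aria2c or ffmpeg fails, one possible variation is to reject on a non-zero exit code and on spawn errors. This is a sketch rather than the project's implementation; the name runCommand is hypothetical, and it assumes @types/node is installed as described earlier.

```typescript
import { spawn } from "child_process";

// Hypothetical variation of runShell: reject when the command cannot be
// started or exits with a non-zero code, so that callers can stop early.
function runCommand(
  command: string,
  args: readonly string[],
  cwd: string = process.cwd()
): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    const child = spawn(command, args, { stdio: "inherit", cwd });
    // "error" fires when the process could not be spawned (e.g. command not found).
    child.on("error", (err) => reject(err));
    // "close" fires after the process has ended and its stdio streams are closed.
    child.on("close", (code) => {
      if (code === 0) {
        resolve();
      } else {
        reject(new Error(`[E] ${command} exited with code ${code}`));
      }
    });
  });
}

// Usage: run the current Node binary with a script that exits with code 3.
runCommand(process.execPath, ["-e", "process.exit(3)"]).catch((err) => {
  console.log(String(err));
});
```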
Create video.ts:
import * as fs from "fs";
import * as path from "path";
import { runShell } from "./runShell";

export async function downloadM3u8Videos(
  m3u8FullUrls: string[],
  outputFolderName: string
) {
  // Check the download folder name
  if (outputFolderName == "") {
    throw new Error("[E] Download folder name is empty.");
  }
  /** If a folder with the same name already exists, append _ to avoid conflicts */
  while (fs.existsSync(path.resolve(process.cwd(), outputFolderName))) {
    outputFolderName += "_";
    if (outputFolderName.length > 100) {
      throw new Error(
        "[E] Download folder exists and try to rename too many times."
      );
    }
  }
  /** Create the download folder in the current working directory */
  const outPath = path.resolve(process.cwd(), outputFolderName);
  fs.mkdirSync(outPath);
  /** Write the download link list file */
  fs.writeFileSync(path.resolve(outPath, "urls.txt"), m3u8FullUrls.join("\n"));
  /** aria2c multi-threaded download */
  await runShell("aria2c", ["-i", "./urls.txt"], {
    cwd: path.resolve(outPath),
  });
}
Explanation:
- Check the download folder name; throw an error if it is empty.
- If a folder with the same name already exists, append the suffix _ to avoid conflicts.
- Create the download folder in the current working directory.
- Write the download link list file urls.txt.
- aria2c reads urls.txt and downloads in multiple threads into the download folder; the number of concurrent downloads can be set with the -j option (default 5).
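To make the -j option configurable, the argument list could be built by a small function instead of being written inline. The sketch below only uses the aria2c options -i (--input-file) and -j (--max-concurrent-downloads); the helper name buildAria2Args and the value 16 are assumptions for illustration.

```typescript
// Hypothetical helper: build the aria2c argument list for a URL list file.
// -i (--input-file) names the file with one URL per line;
// -j (--max-concurrent-downloads) sets how many items download in parallel.
function buildAria2Args(listFile: string, concurrent?: number): string[] {
  const args = ["-i", listFile];
  if (concurrent !== undefined) {
    if (!Number.isInteger(concurrent) || concurrent < 1) {
      throw new Error("[E] Concurrent downloads must be a positive integer.");
    }
    args.push("-j", String(concurrent));
  }
  return args;
}

// Without a second argument aria2c keeps its own default.
console.log(buildAria2Args("./urls.txt"));
console.log(buildAria2Args("./urls.txt", 16));
```

The result could then be passed as the second argument of runShell, for example runShell("aria2c", buildAria2Args("./urls.txt", 16), { cwd: outPath }).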
Edit index.ts and add to main:
console.log("\n------\nParse m3u8");
const m3u8Url1080p = m3u8Urls[0];
const info = await parseM3u8(m3u8Url1080p);
console.log("ok");
console.log("\n------\nDownload ts videos");
const { m3u8FullUrls, tsNames, outputFolderName, outputFileName } = info;
await downloadM3u8Videos(m3u8FullUrls, outputFolderName);
console.log("ok");
Explanation:
- Pick the first link (the highest quality, 1080p in this example) and parse the information needed for the download.
- Download all ts video segments from the links into the corresponding download folder.
Merging the videos with FFmpeg
Install ffmpeg:
sudo apt install ffmpeg
FFmpeg merges the videos with the following command (see the official wiki page Concatenate - FFmpeg):
ffmpeg -f concat -safe 0 -i ./files.txt -c copy outputFileName
Explanation:
- ffmpeg: use FFmpeg to process the video.
- -f concat: force the input format to the concat demuxer (a virtual concatenation script), which reads files of the same format and joins them into one input.
- -safe 0: skip the file path safety check.
- -i ./files.txt: the input file, which lists the videos and parameters in the format described below.
- -c copy: copy the video and audio streams without re-encoding.
- outputFileName: the name of the output file.
The format of files.txt is as follows:
file /path/xxx1
file /path/xxx2
...
Explanation:
- Each line specifies the path of one file to read; multiple files go on separate lines.
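Paths in the list file are usually wrapped in single quotes, so a path that itself contains a single quote needs escaping. A minimal sketch of building the list content follows; it assumes FFmpeg's quoting rule that a single quote inside a quoted string is written as '\'', and the helper name buildConcatList and the sample paths are hypothetical.

```typescript
// Hypothetical helper: build the content of the concat list file.
// Each path is wrapped in single quotes; a single quote inside a path is
// written as '\'' (close the quote, add an escaped quote, reopen the quote).
function buildConcatList(filePaths: string[]): string {
  return filePaths
    .map((p) => `file '${p.replace(/'/g, "'\\''")}'`)
    .join("\n");
}

// Made-up paths for illustration only.
const concatListSample = buildConcatList([
  "/path/a.00000.ts",
  "/path/it's.00001.ts",
]);
console.log(concatListSample);
```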
Edit video.ts and add:
export async function mergeVideo(
  tsNames: string[],
  outputFileName: string,
  outputFolderName: string
) {
  const outPath = path.resolve(process.cwd(), outputFolderName);
  /** Merge list entries in the format: file 'path' */
  const concatStrs = tsNames.map((d) => `file '${outPath}/${d}'`);
  /** Write the merge list file */
  fs.writeFileSync(path.resolve(outPath, "files.txt"), concatStrs.join("\n"));
  /** ffmpeg merge */
  await runShell(
    "ffmpeg",
    ["-f", "concat", "-safe", "0", "-i", "./files.txt", "-c", "copy", outputFileName],
    { cwd: path.resolve(outPath) }
  );
}
Explanation:
- Parameters: the array of ts video file names, the name of the merged output file, and the name of the download folder.
- Get the path to the download folder.
- Build the merge list entries in the format file 'path'.
- Write the merge list file.
- Merge with ffmpeg.
Edit index.ts and add to main:
console.log("\n------\nMerge video");
await mergeVideo(tsNames, outputFileName, outputFolderName);
console.log("ok");
Run.
Reading the URL from user input
Modify index.ts:
const url = process.argv[2];
console.log("Your input: ", url);
if (typeof url !== "string") {
  console.log("[E] Url input required.");
  return;
}
if (url.match(/^https:\/\/www\.acfun\.cn\/v\/ac\d+$/) === null) {
  console.log(
    "[E] Url input invalid.Valid input example: https://www.acfun.cn/v/ac4621380"
  );
  return;
}
Explanation:
- Read the user input parameter URL
- If there is no input, an error is returned
- If the input format does not match, an error is returned
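The same checks could also live in a small function that either returns a usable URL or throws, which keeps main short and makes the validation easy to test on its own. This is a sketch; the helper name parseVideoUrl and the trimming of surrounding whitespace are assumptions beyond the checks above.

```typescript
// Hypothetical helper: validate the input and return the normalized video
// URL, or throw an error that the caller can print.
function parseVideoUrl(input: string | undefined): string {
  if (typeof input !== "string" || input === "") {
    throw new Error("[E] Url input required.");
  }
  const match = input.trim().match(/^https:\/\/www\.acfun\.cn\/v\/(ac\d+)$/);
  if (match === null) {
    throw new Error(
      "[E] Url input invalid. Valid input example: https://www.acfun.cn/v/ac4621380"
    );
  }
  return `https://www.acfun.cn/v/${match[1]}`;
}

console.log(parseVideoUrl(" https://www.acfun.cn/v/ac4621380 "));
```

In main this might be called as parseVideoUrl(process.argv[2]) inside a try/catch that prints the error message.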
Final index.ts:
import { parseUrl, parseM3u8 } from "./parser";
import { downloadM3u8Videos, mergeVideo } from "./video";

async function main() {
  const url = process.argv[2];
  console.log("Your input: ", url);
  if (typeof url !== "string") {
    console.log("[E] Url input required.");
    return;
  }
  if (url.match(/^https:\/\/www\.acfun\.cn\/v\/ac\d+$/) === null) {
    console.log(
      "[E] Url input invalid.Valid input example: https://www.acfun.cn/v/ac4621380"
    );
    return;
  }
  console.log("\n------\nParse url");
  const m3u8Urls = await parseUrl(url);
  console.log("ok");
  console.log("\n------\nParse m3u8");
  const m3u8Url1080p = m3u8Urls[0];
  console.log("[1080p] ", m3u8Url1080p);
  const info = await parseM3u8(m3u8Url1080p);
  console.log("ok");
  console.log("\n------\nDownload ts videos");
  const { m3u8FullUrls, tsNames, outputFolderName, outputFileName } = info;
  await downloadM3u8Videos(m3u8FullUrls, outputFolderName);
  console.log("ok");
  console.log("\n------\nMerge video");
  await mergeVideo(tsNames, outputFileName, outputFolderName);
  console.log("ok");
}

main().then();
At this point, the process is complete.
Conclusion
This article walked step by step through parsing the AcFun video API, calling aria2c to download the m3u8 segments in multiple threads, and calling FFmpeg to merge the video, which together implement downloading AcFun videos with Node.
Alternatively, FFmpeg itself can download an m3u8 link directly (single-threaded) and merge it, using the command:
ffmpeg -i 'https://xxx.m3u8' -c copy output.mp4
This is convenient for readers who just want a simple solution.
If you don't like this article, click the close button in the upper right corner. If you liked it and found it useful, feel free to like it and share your thoughts in the comments. The project source code is at czzonet/acfun-video-cli.
References
- ffmpeg Documentation
- FFmpeg Formats Documentation
- Concatenate – FFmpeg
- aria2c(1) - aria2 1.35.0 documentation
- HTTP | Node.js v14.9.0 Documentation
Legal issues
The author assumes no responsibility if your use of the Software constitutes the basis for copyright infringement or if you use the Software for any other illegal purposes.
This software is distributed under the Apache-2.0 license.
In particular, please be aware that
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Translated to human words:
In case your use of the software forms the basis of copyright infringement, or you use the software for any other illegal purposes, the authors cannot take any responsibility for you.
We only ship the code here, and how you are going to use it is left to your own discretion.