Downloading AcFun videos with Node: parsing the AcFun video API, parallel download with aria2c, and merging with FFmpeg

This article shows how to download AcFun videos with Node. The process has three main steps: parsing the AcFun video API, calling aria2c to download the m3u8 segments in parallel, and calling FFmpeg to merge them into one video.

Usage

The system needs aria2c (the deb package is named aria2) and FFmpeg installed first:

sudo apt install aria2 ffmpeg

Install the project's npm package acfun-video-cli globally, or clone the project czzonet/acfun-video-cli, then compile and run it:

yarn global add acfun-video-cli

Pass the video URL to download. The content is saved to a new download folder in the current directory, including the segment files and the merged video file.

acfun-video-cli https://www.acfun.cn/v/ac4621380

Output (omitted parts are shown as ...):

------
Parse url
ok

------
Parse m3u8
1080p: https://tx-safety-video.acfun.cn/mediacloud/acfun/acfun_video/hls/3HlXWWGOvsJ3D9Vhsn2QzbWPzp9OwtD40Yk9bk8v9t7Khv6leh44hGnw-Qqx9_KP.m3u8?pkey=AALA88Sf3Prmclff8_Ki5E0wlxj0Gam0_NN5bLvhUbCS2_88ypokmdH2Kf1wvzojL4pZJVjDn2m_iRkcrw-4hhRYEn5x01YOyfxYlJ9oOmeMtw4QA_UMZFq5MHQMp7BQZOkIFPPc7oBI0ABtWSSihiKp9WkKUklJibYCStx4Ego_u8MlOMaHONKAAivGjrCsrZap0sO3nuqV5-pThp_LE_WyXImXmfUSFbBkT3vLCWujKw&safety_id=AALdip3SIjwDZfuuv1y8iHA4
ok

------
Download ts videos

09/09 15:53:10 [NOTICE] Downloading 29 item(s)
...
Status Legend:
(OK):download completed.
ok

------
Parse URL
ffmpeg version 4.2.4-1ubuntu0.1 Copyright (c) 2000-2020 the FFmpeg developers
...
Stream mapping:
  Stream #0:0 -> #0:0 (copy)
  Stream #0:1 -> #0:1 (copy)
Press [q] to stop, [?] for help
frame= 3000 fps=0.0 q=-1.0 size=   28160kB time=00:02:01.35 bitrate=1901.0kbits/s speed= 238x
frame= 3520 fps=0.0 q=-1.0 Lsize=   33639kB time=00:02:22.36 bitrate=1935.7kbits/s speed= 243x
video:31340kB audio:2223kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.227712%
ok

Development

Initialization

Create a new TypeScript project and add an entry file named index.ts.

Create index.ts:

async function main() {}

main().then();

Open a terminal and install the type declarations for Node's built-in modules. The project uses the Node API for network requests, file operations and so on.

yarn add -D @types/node

Explanation:

  • -D: installs the package as a development dependency, because type declarations are only used at development time

Parsing the AcFun video API

Parse the AcFun video API to obtain the m3u8 file addresses for the different qualities of the video. The request URL looks like this:

https://www.acfun.cn/v/ac4621380?quickViewId=videoInfo_new&ajaxpipe=1

Explanation:

  • https://www.acfun.cn/v/ac4621380: Original video address
  • ?quickViewId=videoInfo_new&ajaxpipe=1: fixed request parameters

The response is a messy HTML payload. Extract the JSON part and parse it into a videoInfo object.

{"html":"<script class=\"videoInfo\">\n window.pageInfo = window.videoInfo = {\"currentVideoId\":6291551,...,\"priority\":0} </script><script class=\"videoResource\">\n window.videoResource = {}</script><div class=\"left-column\">\n\n...

Explanation:

  • The JSON inside the script tag holds the video details. Cut it out and parse it into a videoInfo object.
  • The JSON is escaped with \, and it contains a nested layer of JSON that is escaped a second time, so it has to be unescaped accordingly: replace \\" with \", then \" with ".
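The two-step unescaping can be tried on a small hand-made string. The sketch below is illustrative only: the sample payload and its field names (a, nested, b) are made up rather than taken from the real response, and unescapeJson is a hypothetical helper name. The sample is built with JSON.stringify so that it carries the same two levels of escaping as described above.

```typescript
// Build a made-up sample with two levels of escaping.
const nestedJson = JSON.stringify({ b: 2 }); // {"b":2}
const outerJson = JSON.stringify({ a: 1, nested: nestedJson });
// Embedding outerJson inside another JSON string escapes it once more:
// outer quotes become \" and the nested quotes become \\\"
const sample = JSON.stringify(outerJson).slice(1, -1);

// Step 1: \\" -> \"   Step 2: \" -> "
function unescapeJson(str: string): string {
  return str.replace(/\\\\"/g, '\\"').replace(/\\"/g, '"');
}

const outer = JSON.parse(unescapeJson(sample)); // outer.nested is still a JSON string
const inner = JSON.parse(outer.nested); // a second parse yields the nested object
console.log(outer.a, inner.b);
```

After the first parse the nested value is still a string, which is why a second JSON.parse is needed for it.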

Partial structure of videoInfo

  • videoInfo
    • currentVideoInfo
      • ksPlayJson: a JSON string with playback information such as addresses; parsing it yields the ksPlay object

Partial structure of ksPlay

  • ksPlay
    • adaptationSet[0]: possibly reserved for different versions; originally an object, now an array.
      • representation: an array of objects for the different qualities, in descending order of quality.
        • url: m3u8 link
        • qualityType: quality, such as 1080p or 720p
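The structure above can be written down as TypeScript types. This is a sketch that covers only the fields listed here; the type names Representation, KsPlay and VideoInfo and the helper pickQuality are hypothetical, the demo data is made up, and the real objects contain many more fields.

```typescript
// Only the fields described in this article; everything else is omitted.
interface Representation {
  url: string; // m3u8 link
  qualityType: string; // e.g. "1080p", "720p"
}

interface KsPlay {
  adaptationSet: { representation: Representation[] }[];
}

interface VideoInfo {
  currentVideoInfo: { ksPlayJson: string }; // JSON string, parse it to get KsPlay
}

// Return the m3u8 URL for the wanted quality, falling back to the first
// entry (the highest quality, since the array is sorted in descending order).
function pickQuality(videoInfo: VideoInfo, wanted: string): string {
  const ksPlay: KsPlay = JSON.parse(videoInfo.currentVideoInfo.ksPlayJson);
  const reps = ksPlay.adaptationSet[0].representation;
  const match = reps.filter((r) => r.qualityType === wanted)[0];
  return (match || reps[0]).url;
}

// Made-up data for illustration.
const demoInfo: VideoInfo = {
  currentVideoInfo: {
    ksPlayJson: JSON.stringify({
      adaptationSet: [
        {
          representation: [
            { url: "https://example.com/1080.m3u8", qualityType: "1080p" },
            { url: "https://example.com/720.m3u8", qualityType: "720p" },
          ],
        },
      ],
    }),
  },
};
console.log(pickQuality(demoInfo, "720p"));
```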

Create api.ts:

import * as https from "https";

/** GET a URL with Node's built-in https module and resolve with the body as text. */
export async function getUrlData(url: string): Promise<string> {
  return new Promise((resolve, reject) => {
    https
      .get(url, (res) => {
        if (res === null) {
          reject(new Error("[E] No Response."));
          return;
        }

        const { statusCode } = res;
        const contentType = res.headers["content-type"];
        const allowTypes = [
          "application/json; charset=utf-8",
          "application/octet-stream",
          "application/vnd.apple.mpegurl",
        ];

        let error: Error | undefined;
        if (statusCode !== 200) {
          error = new Error("[E] Response code: " + statusCode);
        } else if (
          !(contentType !== undefined && allowTypes.includes(contentType))
        ) {
          error = new Error(
            "[E] Invalid content-type.\n" +
              `Expected one of ${allowTypes} but received ${contentType}`
          );
        }

        if (error) {
          // Consume the response data to free memory, then stop here.
          res.resume();
          reject(error);
          return;
        }

        res.setEncoding("utf8");
        let rawData = "";
        res.on("data", (chunk) => (rawData += chunk));

        res.on("close", () => {
          resolve(rawData);
        });
      })
      .on("error", (error) => {
        reject(new Error("[E] Https.Get error: " + error));
      });
  });
}

Explanation:

A simple asynchronous wrapper around Node's native request, used to retrieve network data.

  • Import the https package
  • Accept the url parameter
  • Wrap the callback-based API in a Promise
  • Make a GET request
  • Check that there is a response, and handle the error if not
  • Check the response status code and content type, and handle errors. On error, call res.resume() to consume the response data and free the memory.
  • Set the response encoding to utf8
  • Listen for data events and concatenate the chunks
  • Listen for the close event and return the data
  • Handle errors of the GET request
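One caveat of the exact string comparison on the content type is that a server may send the same media type with different casing or parameters, for example without charset. A possible variation, which is not taken from the original project, compares only the media type. The name isAllowedContentType is hypothetical.

```typescript
const allowedMediaTypes = [
  "application/json",
  "application/octet-stream",
  "application/vnd.apple.mpegurl",
];

// Compare only the media type: drop parameters such as "; charset=utf-8"
// and ignore case and surrounding whitespace.
function isAllowedContentType(header: string | undefined): boolean {
  if (header === undefined) {
    return false;
  }
  const mediaType = header.split(";")[0].trim().toLowerCase();
  return allowedMediaTypes.indexOf(mediaType) !== -1;
}

console.log(isAllowedContentType("application/json; charset=utf-8"));
console.log(isAllowedContentType("Application/JSON"));
console.log(isAllowedContentType("text/html"));
```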

Create parser.ts:

import { getUrlData } from "./api";

export async function parseUrl(videoUrlAddress: string) {
  // e.g. https://www.acfun.cn/v/ac4621380?quickViewId=videoInfo_new&ajaxpipe=1
  const urlSuffix = "?quickViewId=videoInfo_new&ajaxpipe=1";
  const url = videoUrlAddress + urlSuffix;

  const raw: string = await getUrlData(url);

  // Cut off the head and the tail around the videoInfo JSON
  const strsRemoveHeader = raw.split("window.pageInfo = window.videoInfo =");
  const strsRemoveTail = strsRemoveHeader[1].split("</script>");
  const strJson = strsRemoveTail[0];

  const strJsonEscaped = escapeSpecialChars(strJson);
  /** Object videoInfo */
  const videoInfo = JSON.parse(strJsonEscaped);

  const ksPlayJson = videoInfo.currentVideoInfo.ksPlayJson;
  /** Object ksPlay */
  const ksPlay = JSON.parse(ksPlayJson);

  const representations: any[] = ksPlay.adaptationSet[0].representation;
  const urlM3u8s: string[] = representations.map((d) => d.url);

  return urlM3u8s;
}

/**
 * Undo the JSON escaping: \\" becomes \" and \" becomes "
 * @param str
 */
function escapeSpecialChars(str: string) {
  return str.replace(/\\\\"/g, '\\"').replace(/\\"/g, '"');
}

Explanation:

  • Import the getUrlData function to fetch network data
  • Accept the video address URL as a parameter
  • Append the request URL suffix
  • Request the raw data
  • Cut off the head and the tail to get the videoInfo JSON
  • Unescape the JSON string
  • Parse the JSON to get the videoInfo object
  • Extract the ksPlayJson property of the videoInfo object and parse it to get the ksPlay object
  • Extract the representation property of the ksPlay object to get the array of different qualities
  • Return the array of links for the different qualities
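The head-and-tail cut assumes that both markers are present in the response. A slightly more defensive sketch is shown below; extractVideoInfoJson is a hypothetical helper that is not part of the original project, and the sample input is made up.

```typescript
const HEAD = "window.pageInfo = window.videoInfo =";
const TAIL = "</script>";

// Return the text between HEAD and the next TAIL, or throw a readable error
// instead of failing later on an undefined array element.
function extractVideoInfoJson(raw: string): string {
  const start = raw.indexOf(HEAD);
  if (start === -1) {
    throw new Error("[E] videoInfo marker not found in response.");
  }
  const from = start + HEAD.length;
  const end = raw.indexOf(TAIL, from);
  if (end === -1) {
    throw new Error("[E] Closing </script> not found after videoInfo.");
  }
  return raw.slice(from, end).trim();
}

// Made-up input for illustration.
const demoRaw =
  '<script>\n window.pageInfo = window.videoInfo = {"id":1} </script><div>';
console.log(extractVideoInfoJson(demoRaw));
```

If the page layout changes, this version reports which marker is missing, which makes the failure easier to diagnose.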

Modify index.ts and add to main:

  const url = `https://www.acfun.cn/v/ac4621380`;
  console.log("\n------\nParse url");
  const m3u8Urls = await parseUrl(url);
  console.log("ok");

Explanation:

  • Example video address
  • Call the parseUrl function to parse the m3u8 playback addresses for all qualities.

Calling aria2c to download the m3u8 segments in parallel

First pick the m3u8 link of one quality, such as 1080p. Then download and parse the m3u8 file to extract the array of download links, and finally call aria2c to download them in parallel.

Parsing the m3u8 file

Modify parser.ts and add:

export async function parseM3u8(m3u8Url: string) {
  const m3u8File = await getUrlData(m3u8Url);
  /** Separate the ts file links */
  const rawPieces = m3u8File.split(/\n#EXTINF:.{8},\n/);
  /** Drop the header */
  const m3u8RelativeLinks = rawPieces.slice(1);
  /** Fix the tail by removing the redundant terminator */
  const patchedTail = m3u8RelativeLinks[m3u8RelativeLinks.length - 1].split(
    "\n"
  )[0];
  m3u8RelativeLinks[m3u8RelativeLinks.length - 1] = patchedTail;

  /** Full links: prepend the prefix shared with m3u8Url (the prefix has no trailing slash, so add one) */
  const m3u8Prefix = m3u8Url.split("/").slice(0, -1).join("/");
  const m3u8FullUrls = m3u8RelativeLinks.map((d) => m3u8Prefix + "/" + d);
  /** File names as saved by aria2c: the last part of the URL with the query string (everything from ? on) removed */
  const tsNames = m3u8RelativeLinks.map((d) => d.split("?")[0]);
  /** Folder name: the file name without the segment number at the end */
  const outputFolderName = tsNames[0].slice(0, -9);
  /** Name of the final merged file, with the common mp4 extension */
  const outputFileName = outputFolderName + ".mp4";

  return {
    m3u8FullUrls,
    tsNames,
    outputFolderName,
    outputFileName,
  };
}

Explanation:

  • Accept the m3u8 file address URL as a parameter
  • Download the m3u8 file
  • Separate the ts file links
  • Remove the redundant header
  • Fix the tail by removing the redundant terminator
  • Get the array of relative addresses
  • Generate the array of full addresses by adding the same prefix as m3u8Url
  • Generate the file names that aria2c saves: the last part of the URL with the query string (everything from ? on) removed
  • Generate the folder name by removing the segment number at the end of the file name
  • Generate the name of the final merged file, with the common mp4 extension
  • Return the generated information
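Splitting on a fixed-width #EXTINF pattern depends on the duration always being eight characters long. As an alternative sketch, not the project's implementation, the playlist can be read line by line, keeping every line that is not a tag and resolving it against the playlist URL with Node's URL class. The helper name listSegmentUrls and the sample playlist are made up for illustration.

```typescript
import { URL } from "url";

// Keep every non-empty line that is not a tag or comment (those start with "#"),
// then resolve it against the playlist URL.
function listSegmentUrls(m3u8Text: string, m3u8Url: string): string[] {
  return m3u8Text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line !== "" && line.charAt(0) !== "#")
    .map((line) => new URL(line, m3u8Url).toString());
}

// Made-up playlist with durations of different widths.
const demoPlaylist = [
  "#EXTM3U",
  "#EXT-X-TARGETDURATION:5",
  "#EXTINF:5.000000,",
  "demo.00000.ts?pkey=abc",
  "#EXTINF:4.5,",
  "demo.00001.ts?pkey=abc",
  "#EXT-X-ENDLIST",
].join("\n");

const segmentUrls = listSegmentUrls(
  demoPlaylist,
  "https://example.com/hls/demo.m3u8?pkey=xyz"
);
console.log(segmentUrls);
```

Resolving with URL also takes care of the slash between the prefix and the relative link, and of playlists that list absolute URLs.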

Calling aria2c for parallel download

Install aria2c

sudo apt install aria2

Create runShell.ts:

import { spawn } from "child_process";

/**
 * Run a shell command
 * @param command The command to execute
 * @param args Command arguments
 * @param options Command options
 * @description Example: ```ts runShell("sar", ["-n", "DEV", "1"], {}) ```
 */
export const runShell = async (
  command: string,
  args: readonly string[],
  options: ShellOption
) =>
  new Promise<void>((resolve, reject) => {
    const runProcess = spawn(command, args, {
      stdio: "inherit",
      cwd: options.cwd ? options.cwd : process.cwd(),
      env: process.env,
      detached: true,
    });
    /** Reject if the command cannot be started, e.g. it is not installed */
    runProcess.on("error", (error) => {
      reject(error);
    });
    /** Resolve when the process ends */
    runProcess.on("close", () => {
      resolve();
    });
  });

type ShellOption = {
  cwd?: string;
};

Explanation:

A simple asynchronous wrapper that lets Node start a child process to execute a system command.

  • Import the spawn function
  • stdio: "inherit": use the parent process's stdin, stdout and stderr directly.
  • cwd: options.cwd: the directory in which the command runs; defaults to the current working directory when not set
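The wrapper resolves whenever the child process ends, whatever its exit code. If a failed download or merge should stop the program, a variation could reject on a non-zero exit code. The sketch below is an assumption about how that might look, not the project's code; runChecked is a hypothetical name, and the demo runs the current Node binary so that no extra tool is needed.

```typescript
import { spawn } from "child_process";

// Resolve when the command exits with code 0; reject on a non-zero exit code
// or when the command cannot be started at all (e.g. it is not installed).
function runChecked(
  command: string,
  args: string[],
  cwd?: string
): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    const child = spawn(command, args, { stdio: "inherit", cwd: cwd });
    child.on("error", reject);
    child.on("close", (code) => {
      if (code === 0) {
        resolve();
      } else {
        reject(new Error("[E] " + command + " exited with code " + code));
      }
    });
  });
}

// Demo: a child process that exits with code 3 causes a rejection.
runChecked(process.execPath, ["-e", "process.exit(3)"]).catch((e: Error) =>
  console.log(e.message)
);
```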

Create video.ts:

import * as fs from "fs";
import * as path from "path";
import { runShell } from "./runShell";

export async function downloadM3u8Videos(
  m3u8FullUrls: string[],
  outputFolderName: string
) {
  // Check the download folder name
  if (outputFolderName == "") {
    throw new Error("[E] Download folder name is empty.");
  }
  /** If a folder with the same name already exists, append _ to avoid a conflict */
  while (fs.existsSync(path.resolve(process.cwd(), outputFolderName))) {
    outputFolderName += "_";
    if (outputFolderName.length > 100) {
      throw new Error(
        "[E] Download folder exists and try to rename too many times."
      );
    }
  }

  /** Create the download folder in the current working directory */
  const outPath = path.resolve(process.cwd(), outputFolderName);
  fs.mkdirSync(outPath);

  /** Write the download link list file */
  fs.writeFileSync(path.resolve(outPath, "urls.txt"), m3u8FullUrls.join("\n"));

  /** Download in parallel with aria2c */
  await runShell("aria2c", ["-i", "./urls.txt"], {
    cwd: path.resolve(outPath),
  });
}

Explanation:

  • Check the download folder name and throw an error if it is empty
  • If a folder with the same name already exists, append the suffix _ to avoid a conflict
  • Create the download folder in the current working directory
  • Write the download link list file urls.txt
  • aria2c reads urls.txt and downloads the files in parallel into the download folder. The number of concurrent downloads can be set with the -j option (default 5)
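As a sketch of the -j option mentioned above, the argument list can be built by a small helper. The name buildAria2Args is hypothetical and the helper is not part of the original project; -i (input file) and -j (maximum concurrent downloads) are documented aria2c options.

```typescript
// Build the aria2c argument list for a URL list file.
// maxConcurrent maps to -j; when omitted, aria2c uses its own default.
function buildAria2Args(listFile: string, maxConcurrent?: number): string[] {
  const args = ["-i", listFile];
  if (maxConcurrent !== undefined) {
    if (maxConcurrent < 1 || Math.floor(maxConcurrent) !== maxConcurrent) {
      throw new Error("[E] maxConcurrent must be a positive integer.");
    }
    args.push("-j", String(maxConcurrent));
  }
  return args;
}

console.log(buildAria2Args("./urls.txt", 16).join(" "));
```

The resulting array could be passed as the second argument of runShell in place of the fixed list.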

Modify index.ts and add to main:

  console.log("\n------\nParse m3u8");
  const m3u8Url1080p = m3u8Urls[0];
  const info = await parseM3u8(m3u8Url1080p);
  console.log("ok");

  console.log("\n------\nDownload ts videos");
  const { m3u8FullUrls, tsNames, outputFolderName, outputFileName } = info;
  await downloadM3u8Videos(m3u8FullUrls, outputFolderName);
  console.log("ok");

Explanation:

  • Pick the 1080p link and parse the information needed for the download
  • Download all ts video files to the corresponding download folder

Calling FFmpeg to merge the videos

Install ffmpeg

sudo apt install ffmpeg

FFmpeg merges the videos with the following command. See the page Concatenate on the official FFmpeg wiki.

ffmpeg -f concat -safe 0 -i ./files.txt -c copy outputFileName

Explanation:

  • ffmpeg: use FFmpeg to process the video.
  • -f concat: force the input format to concat, a virtual demuxer that reads a list of files of the same format and concatenates them as one input.
  • -safe 0: disable the file path safety check.
  • -i ./files.txt: the input file, which lists the videos and their parameters in the format shown below.
  • -c copy: copy the video and audio streams without re-encoding.
  • outputFileName: the name of the output file.

The format of files.txt is as follows:

file /path/xxx1
file /path/xxx2
...

Explanation:

  • Each line specifies the path of one file to read; multiple files go on multiple lines.
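Paths in the list file are usually wrapped in single quotes so that spaces and other special characters survive. A sketch of generating the lines follows. The helper toConcatLine is hypothetical and the paths are made up; writing a single quote inside the path as '\'' follows FFmpeg's quoting rules.

```typescript
// Wrap a path in single quotes for the concat list file.
// A single quote inside the path is written as '\'' (close the quote,
// add an escaped quote, reopen the quote).
function toConcatLine(filePath: string): string {
  return "file '" + filePath.split("'").join("'\\''") + "'";
}

// Made-up paths for illustration.
const demoLines = ["/path/xxx1.ts", "/path/it's.ts"].map(toConcatLine);
console.log(demoLines.join("\n"));
```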

Modify video.ts and add:

export async function mergeVideo(
  tsNames: string[],
  outputFileName: string,
  outputFolderName: string
) {
  const outPath = path.resolve(process.cwd(), outputFolderName);

  /** Format the merge list entries as: file 'path' */
  const concatStrs = tsNames.map((d) => `file '${outPath}/${d}'`);
  /** Write the merge list file */
  fs.writeFileSync(path.resolve(outPath, "files.txt"), concatStrs.join("\n"));

  /** Merge with ffmpeg */
  await runShell(
    "ffmpeg",
    ["-f", "concat", "-safe", "0", "-i", "./files.txt", "-c", "copy", outputFileName],
    { cwd: path.resolve(outPath) }
  );
}

Explanation:

  • Parameters: the array of ts video file names, the name of the merged output file, and the name of the download folder
  • Get the path of the download folder
  • Format the merge list entries as file 'path'
  • Write the merge list file
  • Merge with ffmpeg

Edit index.ts and add to main:

  console.log("\n------\nMerge video");
  await mergeVideo(tsNames, outputFileName, outputFolderName);
  console.log("ok");

Run.

Reading the URL from user input

Modify index.ts:

  const url = process.argv[2];
  console.log("Your input: ", url);
  if (typeof url !== "string") {
    console.log("[E] Url input required.");
    return;
  }
  if (url.match(/^https:\/\/www\.acfun\.cn\/v\/ac\d+$/) === null) {
    console.log(
      "[E] Url input invalid. Valid input example: https://www.acfun.cn/v/ac4621380"
    );
    return;
  }

Explanation:

  • Read the user input parameter URL
  • If there is no input, an error is returned
  • If the input format does not match, an error is returned
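The same check can be packaged as a small function that also returns the video id, which makes it easy to try against sample inputs. This is only a sketch: parseAcId is a hypothetical helper, and trimming the input is an added assumption that the original check does not make.

```typescript
// Return the numeric video id from an AcFun video URL, or null when the
// input does not look like https://www.acfun.cn/v/ac<digits>.
function parseAcId(input: string | undefined): string | null {
  if (typeof input !== "string") {
    return null;
  }
  const match = input.trim().match(/^https:\/\/www\.acfun\.cn\/v\/ac(\d+)$/);
  return match === null ? null : match[1];
}

console.log(parseAcId("https://www.acfun.cn/v/ac4621380"));
console.log(parseAcId("https://example.com/v/ac1"));
```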

Final index.ts:

import { parseUrl, parseM3u8 } from "./parser";
import { downloadM3u8Videos, mergeVideo } from "./video";

async function main() {
  const url = process.argv[2];
  console.log("Your input: ", url);
  if (typeof url !== "string") {
    console.log("[E] Url input required.");
    return;
  }
  if (url.match(/^https:\/\/www\.acfun\.cn\/v\/ac\d+$/) === null) {
    console.log(
      "[E] Url input invalid. Valid input example: https://www.acfun.cn/v/ac4621380"
    );
    return;
  }

  console.log("\n------\nParse url");
  const m3u8Urls = await parseUrl(url);
  console.log("ok");

  console.log("\n------\nParse m3u8");
  const m3u8Url1080p = m3u8Urls[0];
  console.log("[1080p] ", m3u8Url1080p);
  const info = await parseM3u8(m3u8Url1080p);
  console.log("ok");

  console.log("\n------\nDownload ts videos");
  const { m3u8FullUrls, tsNames, outputFolderName, outputFileName } = info;
  await downloadM3u8Videos(m3u8FullUrls, outputFolderName);
  console.log("ok");

  console.log("\n------\nMerge video");
  await mergeVideo(tsNames, outputFileName, outputFolderName);
  console.log("ok");
}

main().then();

At this point, the process is complete.

Conclusion

This article walked step by step through parsing the AcFun video API, calling aria2c to download the m3u8 segments in parallel, and calling FFmpeg to merge the video, which together implement downloading AcFun videos with Node.

Alternatively, FFmpeg itself can download an m3u8 link directly and merge it, in a single thread, with the command:

ffmpeg -i 'https://xxx.m3u8' -c copy output.mp4

This is convenient for readers who want to keep things simple.

If you don't like this article, click the close button in the upper right corner. If you liked it and found it useful, feel free to like it and share your thoughts in the comments. The project source code is at czzonet/acfun-video-cli.

References

  1. ffmpeg Documentation
  2. FFmpeg Formats Documentation
  3. Concatenate – FFmpeg
  4. aria2c(1), aria2 1.35.0 documentation
  5. HTTP | Node.js v14.9.0 Documentation

Legal issues

The author assumes no responsibility if your use of the Software constitutes the basis for copyright infringement or if you use the Software for any other illegal purposes.

This software is distributed under the Apache-2.0 license.

In particular, please be aware that

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Translated to human words:

In case your use of the software forms the basis of copyright infringement, or you use the software for any other illegal purposes, the authors cannot take any responsibility for you.

We only ship the code here, and how you are going to use it is left to your own discretion.