Main reference

For the rules of video parsing, see the blog post below. Note that its example is written in Python, and the way the video id is obtained has since changed, so that example can no longer parse videos correctly.

Refer to the blog

Video parsing

1. Read the HTML content and obtain the videoId

First we read the HTML content of a page such as:

Toutiao.com/group/66310…

Looking at the page source, we can see that the videoId is embedded in the page's JavaScript.

How do we get the videoId value? We extract it from the page with a regular expression:

 // [^']+ stops at the first closing quote; a greedy .+ could run on to a later quote on the same line
 Pattern pattern = Pattern.compile("videoId: '([^']+)'");
 Matcher matcher = pattern.matcher(response);
 if (matcher.find()) {
     String videoId = matcher.group(1);
     // ...
 }

Note that there is a space after `videoId:` in the pattern. Without that space nothing matches. I was stuck on this for a long time before I finally noticed the missing space.
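The extraction step can be tried in isolation with a minimal sketch like the one below. The class name `VideoIdExtractor` and the sample page fragment are made up for illustration; only the `videoId: '...'` shape comes from the page source described above. The pattern uses `[^']+` so the capture stops at the first closing quote even when more quoted values follow on the same line.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class VideoIdExtractor {
    // The literal space after "videoId:" must be present, as noted above.
    private static final Pattern VIDEO_ID = Pattern.compile("videoId: '([^']+)'");

    // Returns the first videoId found in the page, or empty if there is none.
    static Optional<String> extract(String html) {
        Matcher matcher = VIDEO_ID.matcher(html);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical page fragment, for illustration only
        String html = "player: { videoId: 'v02004040000bg31ot72gddgigkg7kvg', title: 'demo' }";
        System.out.println(extract(html).orElse("not found"));

        // Without the space after the colon the pattern does not match
        System.out.println(extract("videoId:'no-space'").isPresent());
    }
}
```

Returning an `Optional` makes the "no match" case explicit, which helps when the page layout changes again.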

2. Construct the r and s parameters

The r parameter is a random number and can have any number of digits. Here we generate a 16-digit random number:

String r = getRandom(); // e.g. 7805700526977788

// Generate a 16-digit random number
private String getRandom() {
    Random random = new Random();
    StringBuilder result = new StringBuilder();
    for (int i = 0; i < 16; i++) {
        result.append(random.nextInt(10));
    }
    return result.toString();
}

The s parameter is the CRC32 checksum of the following string (CRC32 is a checksum, not encryption):

/video/urls/v/1/toutiao/mp4/{videoId}?r={random number}

In the example above, the videoId is v02004040000bg31ot72gddgigkg7kvg

and r is 7805700526977788

so the string to checksum is:

/video/urls/v/1/toutiao/mp4/v02004040000bg31ot72gddgigkg7kvg?r=7805700526977788

The code that generates the s parameter is as follows:

CRC32 crc32 = new CRC32();
String s = String.format(ApiConstant.URL_VIDEO, videoId, r);
// Compute the CRC32 checksum of the path
crc32.update(s.getBytes());
String crcString = crc32.getValue() + ""; // 38456043

// Defined in ApiConstant
public static final String URL_VIDEO = "/video/urls/v/1/toutiao/mp4/%s?r=%s";
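For experimenting outside the app, the path construction and checksum can be put together in a small self-contained sketch. The class name `SignatureSketch` and its helper methods are hypothetical. The format string is written without a space after `?`, on the assumption that any such space in the text above is a formatting artifact. The charset is fixed to UTF-8 so the result does not depend on the platform default.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

class SignatureSketch {
    // Assumed form of the path template (no space after '?')
    static final String URL_VIDEO = "/video/urls/v/1/toutiao/mp4/%s?r=%s";

    // Builds the string that is fed into CRC32.
    static String buildPath(String videoId, String r) {
        return String.format(URL_VIDEO, videoId, r);
    }

    // Returns the CRC32 checksum of the text as an unsigned decimal string.
    static String crc32Of(String text) {
        CRC32 crc32 = new CRC32();
        crc32.update(text.getBytes(StandardCharsets.UTF_8));
        return Long.toString(crc32.getValue());
    }

    public static void main(String[] args) {
        String path = buildPath("v02004040000bg31ot72gddgigkg7kvg", "7805700526977788");
        System.out.println(path);
        System.out.println(crc32Of(path)); // value of the s parameter for this path
    }
}
```

`CRC32.getValue()` returns a `long`, so the checksum is always printed as a non-negative number.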

3. Send a request to obtain the video address

With the videoId and the r and s parameters from above, we can send a request for the real address of the video. The request URL has the following form:

I.snssdk.com/video/urls/…

The above example builds the following link:

I.snssdk.com/video/urls/…

The returned JSON is shown below:

The response contains a video_list node, which can hold video_1, video_2, and video_3. In this case there is only video_1. The main_url field of a video node is the real address of the video, but it is Base64-encoded, so we need to decode it:

// Uses android.util.Base64
private String getRealPath(String base64) {
    return new String(Base64.decode(base64.getBytes(), Base64.DEFAULT));
}

After decoding, the real address of the video is obtained as follows:

V6-tt.ixigua.com/video/m/220…
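The decoding step can also be tried on a plain JVM, where `android.util.Base64` is not available. Below is a minimal sketch that uses `java.util.Base64` instead. The class name `MainUrlDecoder` and the sample URL are hypothetical. The MIME decoder is chosen here because it ignores line breaks in the input, which is an assumption about how the value might be delivered rather than something the response is known to require.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class MainUrlDecoder {
    // Decodes a Base64-encoded main_url value into the plain URL string.
    static String decodeMainUrl(String encoded) {
        // The MIME decoder skips characters outside the Base64 alphabet, such as newlines
        byte[] raw = Base64.getMimeDecoder().decode(encoded);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical URL, encoded here only to have something to decode
        String sample = Base64.getEncoder()
                .encodeToString("http://example.com/video.mp4".getBytes(StandardCharsets.UTF_8));
        System.out.println(decodeMainUrl(sample));
    }
}
```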

That completes the walkthrough of how Toutiao videos are parsed. If you want to see the source code, you can refer to my project (linked at the end of the article). If it helps you, please give it a star. Thank you.

The code that parses the video is in the decodePath method of the VideoPathDecoder class:

The source address

Toutiao clone project