This is day 1 of my participation in the November Writing Challenge. See the event details: the last Writing Challenge of 2021.

Lately, Juejin users have been posting photos of their cats like crazy, while plenty of others have been happily "cloud petting" those cats from afar!

Me too, but I don't think that's enough. It would be nice if I could have all your cats, all to myself!

But that isn't possible, which is very disappointing. What to do?

Got it!! Even if I can't have your cats, I can take a step back and get pictures of your cats, which, rounded up, is the same as having them. Clever!!!!

So how do you get pictures of all the cats?

Right-click and "Save as", one by one? Impossible! Let me try crawling them instead!

Surely Juejin won't ban my IP. After all, I'm just a guy who wants to pet some cats.

Let's get to work!!!!!

Analysis of the website

By capturing the network traffic, I found the request interface for pins (沸点, Juejin's short posts). My guess: "theme_id": "7007350783603638279" is probably the "Cats are the best" topic, limit is the number of pins per request, sort switches between hot and latest, and cursor is, well, a cursor that says which item to start from.

Analyzing the response body proved my guess correct, ha!

I also found that the response body contains your username, the pin content, and the URLs of the cat photos you posted! Excellent!!

So now, all we need is code. It’s all small stuff!

Write a program

At this point, I noticed something odd!

The cursor mentioned above is "eyJ2IjoiNzAyMzU5NjI1MjkzNTc2NjA1MyIsImkiOjEyMH0=". What is this stuff? It is apparently encrypted!

So I tried to decrypt it to see what it looked like beforehand. It turned out to be plain Base64 (strictly speaking an encoding, not encryption). Wow, how generous! I was expecting some weird and wonderful encryption scheme that would stop me right here!

Decoding that string gives {"v":"7023596252935766053","i":120}. In my experience, the i inside is the cursor position, meaning where to start from. Got it! Let's get to work!
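The decoding step can be reproduced with Python's standard library. This is a minimal sketch: `decode_cursor` is a hypothetical helper name of my own, and it assumes the cursor is Base64-encoded UTF-8 JSON, as the analysis above suggests.

```python
import base64
import json


def decode_cursor(cursor):
    """Decode a Base64 cursor string into a Python dict."""
    # Remove stray spaces that can sneak in when copying from a capture tool.
    cleaned = cursor.replace(" ", "")
    raw = base64.b64decode(cleaned).decode("utf-8")
    return json.loads(raw)


info = decode_cursor("eyJ2IjoiNzAyMzU5NjI1MjkzNTc2NjA1MyIsImkiOjEyMH0=")
print("v =", info["v"])
print("i =", info["i"])
```

The `i` field is the one that changes from page to page, so it is the only part of the cursor that needs to be rewritten before re-encoding.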

Code implementation

Encoding the request data

import base64


def base64_encrypt(text):
    """Base64-encode the given text and return the result as a str."""
    encoded = base64.b64encode(str(text).encode('utf-8'))
    return str(encoded, "utf-8")

Single page crawl test

Set i to 0 in the cursor data:

import base64
import requests


def jujin_cat_spider():
    # Send browser-like headers (User-Agent and Referer) with the request
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) '
                      'Chrome/81.0.4044.43 Safari/537.36',
        'Referer': 'https://juejin.cn/'
    }
    url = "https://api.juejin.cn/content_api/v1/short_msg/list_by_theme?aid=2608&uuid=6983930498641806848"

    # Start from the first item: i = 0
    cursor = '{"v":"7023596252935766053","i":0}'

    # base64_encrypt is the helper defined earlier
    cursor = base64_encrypt(cursor)
    print("After encryption:", cursor)

    # Request body: topic id, sort order, page size, and the encoded cursor
    data = {
        "theme_id": "7007350783603638279",
        "sort": 1,
        "limit": 20,
        "cursor": cursor
    }
    res = requests.post(url, json=data, headers=headers)
    print(res.json())


jujin_cat_spider()

I tested it and it ran perfectly, returning the first page of data. As mentioned above, the pic_list inside msg_Info is the list of cat photos. So: loop over every pin (沸点) in the response, then loop over each pin's photo list and download the photos.
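The two nested loops described above might look like the sketch below. Only `msg_Info` and `pic_list` come from the response I inspected; the top-level `"data"` key, the helper names (`collect_pic_urls`, `filename_for`, `download_pics`) and the `cats` folder are assumptions for illustration. To keep the sketch dependency-free, the download uses `urllib` from the standard library rather than `requests`.

```python
import os
import urllib.request
from urllib.parse import urlparse


def collect_pic_urls(body):
    """Return every photo URL found in one parsed response body."""
    urls = []
    # Assumption: the list of pins sits under a top-level "data" key.
    for item in body.get("data") or []:
        info = item.get("msg_Info") or {}
        urls.extend(info.get("pic_list") or [])
    return urls


def filename_for(url, index):
    """Build a local filename from the URL path, prefixed with an index."""
    name = os.path.basename(urlparse(url).path) or "cat"
    return f"{index:04d}_{name}"


def download_pics(urls, folder="cats"):
    """Download each URL into the folder (hypothetical default: ./cats)."""
    os.makedirs(folder, exist_ok=True)
    for index, url in enumerate(urls):
        path = os.path.join(folder, filename_for(url, index))
        with urllib.request.urlopen(url) as resp, open(path, "wb") as f:
            f.write(resp.read())
```

Usage would be something like `download_pics(collect_pic_urls(res.json()))`, where `res` is the response from the single-page test.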

I feel victory is just around the corner, so here’s an early celebration

Crawling page by page

If there is a next page, add 20 to i in the cursor data and request the next page. If has_more is False, stop.
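A rough sketch of that paging loop follows. It is not the complete crawler: `fetch_page` is a hypothetical callable that takes an encoded cursor, sends the POST request, and returns the parsed JSON body. The `"data"` key, the `max_pages` safety cap, and the pause between requests are my own assumptions; `has_more` and the step of 20 come from the description above.

```python
import base64
import json
import time


def make_cursor(v, i):
    """Build the Base64-encoded cursor for offset i."""
    raw = json.dumps({"v": v, "i": i}, separators=(",", ":"))
    return base64.b64encode(raw.encode("utf-8")).decode("utf-8")


def crawl_all_pages(fetch_page, v, limit=20, max_pages=1000, pause=1.0):
    """Collect items page by page until has_more is false."""
    items = []
    i = 0
    for _ in range(max_pages):  # safety cap so the loop always ends
        body = fetch_page(make_cursor(v, i))
        # Assumption: each page's items sit under a top-level "data" key.
        items.extend(body.get("data") or [])
        if not body.get("has_more"):
            break
        i += limit         # move the cursor forward by one page
        time.sleep(pause)  # be gentle with the server between requests
    return items
```

Passing the request function in as a parameter keeps the loop easy to try out with canned responses before pointing it at the real interface.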

Running process screenshot:

Done!! That's hundreds of cat photos. Here are a few screenshots; see if one of yours is among them.

I won't post the complete code. After all, I still have to hang around on Juejin, and if this has a bad effect and, bang, my account gets banned, how would I keep slacking off on Juejin? Hahaha!

Original writing isn't easy. If you found this helpful, please leave a like before you go ~

Finally, thanks to my girlfriend for her tolerance, understanding and support in work and life!