Preface

Hello fellow Juejin users, and happy New Year! Today is the first day of 2022. Voting for Juejin's annual popular author list has just ended. It had nothing to do with me, of course. My New Year's flag is to reach level V4 on Juejin. For the vast majority of readers, I suspect "learning" is also on the New Year's flag list, and it is on mine too. So I had an idea: tally up the year's articles on Juejin, with two goals:

  1. Bookmark the best articles and study them slowly.
  2. Learn from these articles: which kinds of articles suit readers, what their strengths are, and how we should write our own.

Annual active author statistics

We can count this year's active authors through the year-end voting page. The page loads more results as you scroll, and the has_more field in the response tells us whether there is a next page, so we can collect all the author IDs with Node.js.

const axios = require("axios");
const _ = require("lodash");
const fs = require("fs");

const url = "https://api.juejin.cn/list_api/v1/annual/list";

const headers = {
  "content-type": "application/json; charset=utf-8",
  "user-agent":
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36",
  // Paste the cookie copied from your browser session here
  cookie: "xxx",
};

// Author IDs collected so far
let userId = [];

const fetchUserId = (cursor = 0) => {
  console.log("Requesting from cursor " + cursor);
  axios
    .post(
      url,
      { annual_id: "2021", list_type: 0, cursor: cursor + "", keyword: "" },
      {
        headers,
      }
    )
    .then((res) => {
      const data = res.data;
      userId = userId.concat(_.map(data.data, "user_id"));
      // Stop at 1000 IDs per run to stay under the backend's rate limit
      if (data.has_more && userId.length < 1000) {
        fetchUserId(cursor + 10);
      } else {
        fs.writeFileSync("./0-1000.json", JSON.stringify(userId));
      }
    });
};

fetchUserId();

The cookie can be copied from the browser. To avoid hitting the Juejin backend's rate limits, each run collects only 1,000 authors. After three runs, the results show that 2,035 authors signed up this time. Of course, this figure may not be exact. Next, we can fetch each author's articles using the collected user IDs.
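If the three runs overlap at their boundaries, the ID lists need to be merged and deduplicated before counting. The original does not show how this was done, so the following is only a minimal sketch: the function name mergeUserIds and the sample IDs are hypothetical, and the arrays stand in for the contents of the JSON files written by each run.

```typescript
// Merge several batches of author IDs and drop duplicates, keeping first-seen order.
function mergeUserIds(batches: string[][]): string[] {
  const seen = new Set<string>();
  const merged: string[] = [];
  for (const batch of batches) {
    for (const id of batch) {
      if (!seen.has(id)) {
        seen.add(id);
        merged.push(id);
      }
    }
  }
  return merged;
}

// Hypothetical sample data standing in for the three JSON files.
const run1 = ["1001", "1002", "1003"];
const run2 = ["1003", "1004"];
const run3 = ["1005", "1001"];

const allIds = mergeUserIds([run1, run2, run3]);
console.log(allIds.length); // 5
```

In a real run, each batch would be loaded with JSON.parse from the file that run produced.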

Getting each author's article list

We can get each author's article list from the voting details page. I have to poke fun at the Juejin API here: the front end shows only 3 articles, but the back end returns all the data… 😅

Let's take a look:

The posts here are sorted by popularity by default, but we don't know whether "popularity" means likes or favorites.

Fortunately, we can also get each Juejin author's articles from the author's profile page, as shown below:

Again, the user_info data is repeated N times. This interface returns the counts of likes, comments, and favorites. But what does digg_count mean? What is "digg" short for?

Building tables for the statistics

JSON files are not practical for storing this much data. I use PostgreSQL with Prisma as the ORM. If you are not familiar with Prisma, please refer to my earlier translated article “Complete ORM prisma for Node.js and TypeScript”.

Defining the schema

datasource db {
  provider = "postgresql"
  url      = env("DATABASE_URL")
}

generator client {
  provider = "prisma-client-js"
}

model Article {
  article_id    String           @id
  title         String
  brief_content String
  content       String?
  cover_image   String
  user_id       String
  ctime         String
  digg_count    Int
  view_count    Int
  comment_count Int
  collect_count Int
  author_id     String
  author        Author           @relation(fields: [author_id], references: [id])
  category_id   String
  category      Category         @relation(fields: [category_id], references: [id])
  tags          TagsOnArticles[]
}

model Author {
  id           String    @id
  name         String
  avatar_large String
  articles     Article[]
}

model Category {
  id       String    @id
  name     String
  articles Article[]
}

model Tag {
  id       String           @id
  name     String
  articles TagsOnArticles[]
}

model TagsOnArticles {
  article    Article @relation(fields: [article_id], references: [article_id])
  article_id String
  tag        Tag     @relation(fields: [tag_id], references: [id])
  tag_id     String

  @@id([article_id, tag_id])
}

Table relationships

  • Articles and authors: many to one
  • Articles and categories: many to one
  • Articles and tags: many to many

Code for fetching a user's article list

/**
 * Get the user's article list
 * @param userId
 * @returns
 */
const fetchList = async (userId: string) => {
  console.log("Start collecting " + userId);
  return new Promise((resolve) => {
    // Wait 2 seconds before sending the request to avoid hammering the API
    setTimeout(async () => {
      await axios
        .post(
          "https://api.juejin.cn/content_api/v1/article/query_list?aid=2608&uuid=6899676175061648910",
          {
            user_id: userId,
            sort_type: 1,
            cursor: "0",
          },
          { headers }
        )
        .then((res: any) => {
          const data = res.data.data;
          if (data && data.length) {
            // Insert into the database
            insert(data)
              .catch((e) => {
                console.error(e);
                process.exit(1);
              })
              .finally(() => {
                resolve("");
              });
          } else {
            resolve("");
          }
        });
    }, 2000);
  });
};

To avoid sending requests too frequently, I added a 2-second delay before each one.

Code for inserting into the database

// Prisma client setup (assumed; not shown in the original)
import { PrismaClient } from "@prisma/client";
const prisma = new PrismaClient();

/**
 * Insert into the database
 * @param data
 */
async function insert(data: any) {
  for (const item of data) {
    // Keep only the article fields that exist in the Article model
    const article_info = _.pick(item.article_info, [
      "article_id",
      "title",
      "brief_content",
      "cover_image",
      "user_id",
      "ctime",
      "digg_count",
      "view_count",
      "comment_count",
      "collect_count",
    ]);

    // Create the author if it does not exist yet
    const author_user_info = await prisma.author.findUnique({
      where: {
        id: item.author_user_info.user_id,
      },
    });
    if (!author_user_info) {
      await prisma.author.create({
        data: {
          id: item.author_user_info.user_id,
          name: item.author_user_info.user_name,
          avatar_large: item.author_user_info.avatar_large,
        },
      });
    }

    // Create the category if it does not exist yet
    const category = await prisma.category.findUnique({
      where: {
        id: item.category.category_id,
      },
    });

    if (!category) {
      await prisma.category.create({
        data: {
          id: item.category.category_id,
          name: item.category.category_name,
        },
      });
    }

    const article = await prisma.article.findUnique({
      where: {
        article_id: article_info.article_id,
      },
    });

    // Link each tag, creating it first if it is missing
    const creates_tags = _.map(item.tags, (tag: any) => {
      return {
        tag: {
          connectOrCreate: {
            create: {
              id: tag.tag_id,
              name: tag.tag_name,
            },
            where: {
              id: tag.tag_id,
            },
          },
        },
      };
    });

    // Skip articles that are already stored
    if (!article) {
      console.log("create---" + article_info.title);

      await prisma.article.create({
        data: {
          ...article_info,
          author_id: item.article_info.user_id,
          category_id: item.category.category_id,
          tags: {
            create: creates_tags,
          },
        },
      });
    }
  }
}

Next, we need to run fetchList for every user ID.

We can't do this with Promise.all, because all the requests would be fired at once, and the back end will reject them to protect itself from overload. We need to send the requests one at a time, one every 2 seconds, and save each result to the database. How do we do that? (This is a standard interview question: how do you run multiple promises one after another?) If you've read this far, leave your answer in the comments section.
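One common answer is to await each call inside a for...of loop. The sketch below is not necessarily how the author did it: runSequentially, sleep, and fakeFetchList are hypothetical names, and fakeFetchList only records the call order instead of calling the real API.

```typescript
// Resolve after `ms` milliseconds.
const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Run `task` for each item strictly one after another, pausing between calls.
async function runSequentially<T, R>(
  items: T[],
  task: (item: T) => Promise<R>,
  delayMs: number
): Promise<R[]> {
  const results: R[] = [];
  for (const item of items) {
    // `await` inside for...of holds the loop until the current task settles
    results.push(await task(item));
    await sleep(delayMs);
  }
  return results;
}

// Hypothetical stand-in for fetchList: records call order instead of hitting the API.
const callOrder: string[] = [];
const fakeFetchList = async (userId: string): Promise<string> => {
  callOrder.push("start:" + userId);
  await sleep(5);
  callOrder.push("end:" + userId);
  return userId;
};

const done = runSequentially(["a", "b", "c"], fakeFetchList, 10);
done.then(() => console.log(callOrder.join(" ")));
```

Since fetchList above already waits 2 seconds internally, passing 0 as delayMs would be enough when using it as the task. A reduce-based promise chain would give the same ordering; the loop is simply easier to read.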

The result

Once everything has finished running, all the articles by this year's authors are stored in the database. Run the following command to browse the data in Prisma Studio:

npx prisma studio 

Filter for articles whose creation time is later than 2021-01-01:

new Date("2021/01/01").getTime(); // 1609430400000 (milliseconds, when run in the UTC+8 time zone)

Sorting the results in descending order of likes gives us the list of the most-liked articles.
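The filter-and-sort step can also be sketched in plain TypeScript, without the database. This is an illustration only: topLiked and the sample rows are hypothetical, and it assumes ctime holds a Unix timestamp in seconds stored as a string (the schema stores it as String; the unit is an assumption).

```typescript
interface ArticleRow {
  title: string;
  ctime: string; // assumed: Unix timestamp in seconds, stored as a string
  digg_count: number;
}

// Keep articles created at or after `sinceMs`, ordered by likes, highest first.
function topLiked(rows: ArticleRow[], sinceMs: number): ArticleRow[] {
  return rows
    .filter((row) => Number(row.ctime) * 1000 >= sinceMs) // filter returns a new array
    .sort((a, b) => b.digg_count - a.digg_count);
}

// Hypothetical sample rows.
const sample: ArticleRow[] = [
  { title: "old", ctime: "1600000000", digg_count: 900 },
  { title: "new-a", ctime: "1620000000", digg_count: 120 },
  { title: "new-b", ctime: "1630000000", digg_count: 450 },
];

const ranked = topLiked(sample, 1609430400000);
console.log(ranked.map((r) => r.title)); // ["new-b", "new-a"]
```

The "old" row has the most likes but is dropped because it predates the cutoff.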

Summary

From these results, I also drew a few conclusions about how to write a great article:

  1. Have a broad readership

    In terms of audience size, ES6 > Vue > React. Take my previous article, "How to Test React Asynchronous Components": you can imagine how few people read it. Those who already know how to do it don't need your article, and those who don't know have no such need either.

  2. The article must be easy to understand, so that readers actually grasp the key points.

    As the author Lin Sanxin said:

    My motto is to explain the hardest knowledge in the plainest words.

Finally

Dear friends, was this article clear to you? If so, please give it a like. Your likes are the biggest support for me.

I hope this article was helpful to you, and you can also refer to my previous articles or share your thoughts and insights in the comments section.