Planning
Recently I was chatting with several colleagues in our group about the ranking of GitHub users in China by follower count. You Yuxi ranks first with 74,000 followers, Ruan Yifeng second with 67,000, and Liao Xuefeng third with 34,000.
You Yuxi is not only first in China but also second in the world. First in the world is Linus Torvalds, the creator of Linux.
Plenty of websites already compile this kind of ranking, but I always feel they fall short in some ways, such as:
- Historical statistics
If the data is collected for long enough, we could even reconstruct a timeline of the open source community in China. I have to crawl the data myself; otherwise I would have to wait for someone else to do it.
- Viewing your own ranking and generating a poster
If I want to share my ranking, all I can do is take a screenshot; there is no way to generate a nice poster.
Design
Search for leaderboard designs on Huaban or Dribbble.
Find a style that works and use its layout as a reference. As an experienced front-end developer, you already have the CSS in mind when you look at a design, right?
Choosing a crawler approach
As always, I don't want to run this crawler on my own server. I want it to run on a schedule in a free and stable cloud service.
- Option 1: uniCloud cloud service
Advantages: supports scheduled tasks and connects directly to the cloud database and cloud storage.
Disadvantages: the timeout is only 60 seconds, while crawling a few thousand GitHub records takes about 2 minutes. Accessing the GitHub API from a server in mainland China is also unstable.
- Option 2: GitHub Actions
Advantages: supports scheduled tasks, and crawling GitHub data from GitHub's own infrastructure is very fast.
Disadvantages: cannot connect directly to my cloud database in uniCloud.
Because the timeout in option 1 is fixed and fairly short, I could still design a crawl around it, but that would be too cumbersome. So I took this opportunity to learn GitHub Actions: once the crawl finishes, the data is sent to a uniCloud cloud function, which writes it into the database.
Fetching the data
Calling the GitHub API from Node.js is fairly simple; there are only two points to note.
- The user search returns at most 1,000 results
- Rate limits
Unauthenticated access allows only a handful of requests per minute, which is not enough to crawl the top 1,000. Generate a personal access token on GitHub: with a token, the REST API limit rises to 5,000 requests per hour, which is plenty.
Pass your token in the request header:
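const axios = require('axios');
// Personal access token, passed in through the MYTOKEN environment variable
const token = process.env.MYTOKEN;

// GET a GitHub API URL with the token attached; returns undefined if the request fails
async function githubApiGet(url, data = {}) {
  let res;
  try {
    res = await axios.get(url, {
      headers: {
        Authorization: 'token ' + token,
        Accept: 'application/vnd.github.v3+json'
      },
      params: data
    });
  } catch (err) {
    console.log(err);
  }
  return res;
}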
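The two limits above suggest paging through the search endpoint: at 100 results per page, ten requests cover the 1,000-result cap. Below is a minimal sketch rather than the author's actual crawler. The query `location:china`, the helper names `buildPageParams` and `crawlTopUsers`, and the injected `fetchPage` function (for example `githubApiGet`) are all assumptions made for illustration.

```javascript
// Hypothetical helpers; the search query used by the real crawler is not given in the article.
const SEARCH_URL = 'https://api.github.com/search/users';
const PER_PAGE = 100;     // largest page size the search API accepts
const MAX_RESULTS = 1000; // the search API returns at most this many results

// Build the query parameters for every page needed to reach MAX_RESULTS.
function buildPageParams(query) {
  const pages = Math.ceil(MAX_RESULTS / PER_PAGE);
  return Array.from({ length: pages }, (_, i) => ({
    q: query,
    sort: 'followers',
    order: 'desc',
    per_page: PER_PAGE,
    page: i + 1
  }));
}

// Fetch the pages one after another. fetchPage(url, params) is expected to
// resolve to an axios-style response ({ data: { items: [...] } }) or undefined.
async function crawlTopUsers(fetchPage, query = 'location:china') {
  const users = [];
  for (const params of buildPageParams(query)) {
    const res = await fetchPage(SEARCH_URL, params);
    if (!res || !res.data || !Array.isArray(res.data.items)) break; // stop on failure
    users.push(...res.data.items);
    if (res.data.items.length < PER_PAGE) break; // short page: nothing more to fetch
  }
  return users;
}
```

Passing the fetch function in as a parameter keeps the paging logic independent of axios, so it can be exercised with a stub before spending real API quota.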
Since I want to record the history of the ranking in China, the number of files in this repository could become very large, because one file is generated every day. So the directory structure needs some planning.
Put each year and each month in its own directory, so that a single directory never holds more than 31 files.
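The year/month layout can be sketched as a small path helper. This is only an illustration: the function name `rankFilePath`, the `YYYY-MM-DD.json` file name, and the use of UTC dates are assumptions, not details taken from the repository.

```javascript
const path = require('path');

// Hypothetical layout: <root>/<year>/<month>/<YYYY-MM-DD>.json
// The date parts are read in UTC; shift the Date first if local (China) time is wanted.
function rankFilePath(rootDir, date) {
  const year = String(date.getUTCFullYear());
  const month = String(date.getUTCMonth() + 1).padStart(2, '0');
  const day = String(date.getUTCDate()).padStart(2, '0');
  // path.posix keeps forward slashes regardless of the operating system
  return path.posix.join(rootDir, year, month, `${year}-${month}-${day}.json`);
}
```

Before writing the file, the directories can be created with `fs.mkdirSync(path.dirname(file), { recursive: true })`.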
GitHub Actions
The comments in the code block explain what each configuration item does. Only the parameters I actually use are shown here.
# The name of this workflow
name: ROCSchedule
on:
  # Scheduled task; cron uses UTC, so 0 18 means 2 a.m. in China
  schedule:
    - cron: '0 18 * * *'
  # Also allow the workflow to be triggered manually from the repository panel
  workflow_dispatch:
jobs:
  build:
    # Run on Linux
    runs-on: ubuntu-latest
    steps:
      # Fetch the code from the repository
      - uses: actions/checkout@v2
      # Set up the Node environment
      - name: Setup Node.js environment
        uses: actions/[email protected]
      # Install dependencies with npm
      - name: Install NPM dependencies
        run: npm install axios
      # Put our code to work
      # MYTOKEN lifts the GitHub API rate limit; POSTURL is the endpoint the data is sent to once the crawl finishes
      - name: Run
        run: MYTOKEN=${{secrets.MYTOKEN}} POSTURL=${{secrets.POSTURL}} node index.js
      # The crawl generates a JSON file in the repository, so it has to be committed and pushed
      - name: Add & Commit
        uses: EndBug/[email protected]
        with:
          github_token: ${{secrets.MYTOKEN}}
Secrets
My MYTOKEN and POSTURL must not be disclosed, yet the project uses them, so how can I open source it? GitHub Actions Secrets solve this nicely: I can open source the entire project without exposing my private data, and anyone who forks it can run the service with their own token.
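On the Node.js side, the secrets arrive as ordinary environment variables. A minimal sketch of reading them and failing early when one is missing; the helper name `loadConfig` and the shape of the returned object are hypothetical.

```javascript
// Read the two secrets that the workflow passes in as environment variables.
// Throws early so that a misconfigured fork fails with a clear message.
function loadConfig(env) {
  const missing = ['MYTOKEN', 'POSTURL'].filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error('Missing environment variables: ' + missing.join(', '));
  }
  return { token: env.MYTOKEN, postUrl: env.POSTURL };
}
```

In `index.js` this would be called as `const { token, postUrl } = loadConfig(process.env);`.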
Receiving the POST in a uniCloud cloud function
Create a uniCloud cloud function and enable URLization for it.
Once URLized, the cloud function receives the posted parameters in event.body, which can be used after parsing it as JSON:
if (event.body) {
  // POST from the GitHub Action: store the crawl result
  const data = JSON.parse(event.body);
  await db.collection('githubroc').add({
    record_date: data.record_date,
    total_users: data.total_users,
    rank_list: data.rank_list
  });
  return;
}
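Because a URLized cloud function can be reached by anyone who knows its address, it may be worth checking the body before writing it to the database. The following is a sketch under assumptions: the helper name `parseRankPayload` is hypothetical, and the field types (a string date, an integer count, an array of entries) are guesses based on the field names above.

```javascript
// Parse and sanity-check the JSON body posted by the crawler.
// Returns the three stored fields, or null when the body is not usable.
function parseRankPayload(body) {
  let data;
  try {
    data = JSON.parse(body);
  } catch (err) {
    return null; // not valid JSON
  }
  if (!data || typeof data !== 'object') return null;
  const valid =
    typeof data.record_date === 'string' &&
    Number.isInteger(data.total_users) &&
    Array.isArray(data.rank_list);
  if (!valid) return null;
  // Keep only the fields the collection stores; anything else is dropped.
  return {
    record_date: data.record_date,
    total_users: data.total_users,
    rank_list: data.rank_list
  };
}
```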
Reading ranking data from the cloud database
We look up the record whose date matches the requested one, in the format 2021-09-11. If no record matches that date, the latest one is used instead. There will always be at least one record, so I do not handle the empty case.
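const dbCmd = db.command;
const date = event.date;
// Look for the record of the requested date first
let dbRes = await db.collection('githubroc').where({
  record_date: dbCmd.eq(date)
}).get();
if (dbRes.affectedDocs <= 0) {
  // No match: fall back to the most recent record (sorted by date, newest first)
  dbRes = await db.collection('githubroc').orderBy('record_date', 'desc').limit(1).get();
}
return utils.responseData(0, "", dbRes.data[0]);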
Rendering a long list in the mini program
The way the data is stored in the database does not suit paging, and there are only 1,000 entries in total, so paging is left to the front end. Even so, calling setData with 1,000 entries at a time is still slow. So only 100 entries are displayed at first; when the list is scrolled to the bottom, the next 100 entries are appended with concat.
this.ranklist = this.ranklist.concat(this.orginlist.slice(this.rankpage*100, (this.rankpage+1) *100));
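The append-on-scroll idea can be sketched outside the mini program as a small pager. The names `pageSlice` and `createRankPager` are hypothetical, and the plain `state` object stands in for the page's reactive data.

```javascript
const PAGE_SIZE = 100;

// Return the slice of `all` that belongs to page `page` (0-based).
function pageSlice(all, page, pageSize = PAGE_SIZE) {
  return all.slice(page * pageSize, (page + 1) * pageSize);
}

// Minimal stand-in for the page state: each call to loadNext appends one more page.
function createRankPager(all, pageSize = PAGE_SIZE) {
  const state = { ranklist: [], rankpage: 0 };
  function loadNext() {
    const next = pageSlice(all, state.rankpage, pageSize);
    if (next.length === 0) return false; // nothing left to show
    state.ranklist = state.ranklist.concat(next);
    state.rankpage += 1;
    return true;
  }
  return { state, loadNext };
}
```

In the mini program, `loadNext` would be called once when the data arrives and again from the scroll-to-bottom handler.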
Generating the poster in the mini program
The most troublesome part of the whole project is the poster. Drawing it entirely on a front-end canvas is not difficult, just tedious: the layout has to be drawn piece by piece with the Canvas API.
wx.downloadFile also requires the domain to be on the request whitelist configured in the mini program admin console, and a domain can only be whitelisted if it has an ICP filing in mainland China…
A GitHub avatar URL looks like this:
https://avatars2.githubusercontent.com/u/499550?s=140
When I tried to add that domain to the whitelist, I was stopped by a prompt.
Are you kidding me?!
How am I supposed to get a filing for githubusercontent.com?
I had been working alone for a day and a night, from planning through design to development. The URL whitelist is not enforced during development, so the problem did not surface early. Was the poster going to fail at the very end? Would I be unable to publish? I was screaming inside!
It has to be done!
Relaying avatars through a URLized cloud function
I considered two approaches here:
- When fetching the data from GitHub, copy all the avatars to cloud storage and replace the original URLs.
- Have my cloud function read the avatar and return the image directly, without storing it, so that the cloud function acts as a relay.
I chose option 2. Let's do it!
Option 2 uses a GET request to the cloud function, with the relay URL in this form:
<cloud function URL>?avatar_url=<GitHub avatar URL>
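Since the avatar address itself contains `?` and `=`, it should be percent-encoded before being appended as a query parameter. A minimal sketch, assuming the parameter is named `avatar_url` (the name the cloud function reads) and using a hypothetical helper name and a placeholder cloud function address:

```javascript
// cloudFnUrl is a placeholder for the address of the URLized cloud function.
function buildAvatarProxyUrl(cloudFnUrl, avatarUrl) {
  // encodeURIComponent escapes ":", "/", "?" and "=" so the avatar URL survives as one value
  return cloudFnUrl + '?avatar_url=' + encodeURIComponent(avatarUrl);
}
```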
The cloud function handles the parameters of the HTTP GET request like this:
if (event.queryStringParameters) {
  // GitHub avatar_url fix: fetch the avatar and relay it to the client
  const qs = event.queryStringParameters;
  const imageRes = await uniCloud.httpclient.request(qs.avatar_url);
  const base64data = Buffer.from(imageRes.data).toString('base64');
  return {
    // Alibaba Cloud requires this field to be true to return an integrated response
    mpserverlessComposedResponse: true,
    isBase64Encoded: true,
    statusCode: 200,
    headers: {
      'content-type': 'image/jpeg'
    },
    body: base64data
  };
}
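A relay like this fetches whatever URL it is given, so one possible hardening step is to accept only GitHub avatar hosts. This check is not part of the original project; the helper name `isAllowedAvatarUrl` and the choice of allowed domain are assumptions made for illustration.

```javascript
const { URL } = require('url');

// Accept only HTTPS URLs whose host is githubusercontent.com or one of its subdomains.
function isAllowedAvatarUrl(rawUrl) {
  let parsed;
  try {
    parsed = new URL(rawUrl);
  } catch (err) {
    return false; // not a parseable absolute URL
  }
  const host = parsed.hostname.toLowerCase();
  return (
    parsed.protocol === 'https:' &&
    (host === 'githubusercontent.com' || host.endsWith('.githubusercontent.com'))
  );
}
```

The cloud function could call this on `qs.avatar_url` and return an error status instead of fetching when the check fails.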
Afterword
Completing even a small feature like this single-handedly, from planning through design to development, meant falling into plenty of pitfalls and thinking through and solving a few hard problems along the way. It gives a real sense of achievement.
Open source
Github.com/ezshine/git…
Repository walkthrough video tutorial
The mini program part is not technically difficult, and all the tricky points are explained in this article, so you can implement it yourself. What I am open sourcing here is mainly the GitHub Action and the ranking JSON files. Friends, please give me a follow~
I want to be ranked, too! I want to be ranked, too! I want to be ranked, too!
Designing and developing this took less time than writing up this article. Writing and sharing it was not easy, so please show some encouragement!