Background
In February I got hooked on Genshin Impact. Its wish history page has no summary or pie chart, and its table is fixed at 6 rows per page, so it is very troublesome to work out the total number of pulls, how many pulls each 5-star took, and how far away the next guarantee is. So I decided to build a web page that fetches the Genshin Impact wish data and presents it as clearer statistics plus a pie chart. The result screenshot comes first ~
Preparatory work
To obtain the URL of the Genshin Impact wish history page, open the page on a mobile phone, turn on airplane mode, refresh the page so that it shows an error page, and copy the URL of that error page.
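The copied URL matters because the later code reuses its query parameters when it calls the data interface. As a rough sketch of how those parameters can be read with the built-in URL class (extractQuery, the example URL, and its parameter names are made up for illustration and are not the real ones):

```javascript
// Sketch: read the query parameters out of a copied page URL.
// The URL and parameter names below are placeholders, not real ones.
function extractQuery(copiedUrl) {
  const parsed = new URL(copiedUrl);
  // searchParams only covers the "?..." part, so a trailing "#..." route is ignored
  return Object.fromEntries(parsed.searchParams.entries());
}

const example = 'https://example.com/history/index.html?token=abc123&lang=en#/log';
console.log(extractQuery(example));
```

The resulting object can then be merged with extra parameters and serialized again when building a request URL.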
Several attempts to obtain data
Static crawler
The first thing I did was search Baidu. The articles I found said that crawling a web page only requires using HTTP or a third-party library to request the page and then reading the elements from its DOM nodes. So I began my first attempt, using the superagent library.
// Import the required third-party packages
const superagent = require('superagent');
const cheerio = require('cheerio');

// Extract the wish records from the fetched HTML
const getRecord = (res) => {
  const recordData = [];
  /* cheerio.load() takes the HTML document as its argument and returns a function
     that can be used like jQuery's $(selector) to find page elements */
  const $ = cheerio.load(res.text);
  // Find the page elements where the target data resides
  $('.table-content>div>.type').each((idx, ele) => {
    // cheerio's $('selector').each() iterates over all matched DOM elements;
    // idx is the index of the current element and ele is the element itself
    recordData.push({
      type: $(ele).text(),
    }); // Store the result in the array
  });
  // ... fill the array with additional data
  return recordData;
};

let recordData = [];
superagent.get('target URL').end((err, res) => {
  if (err) {
    console.log(`Failed to fetch wish records - ${err}`);
  } else {
    // Extract the wish records from the response
    recordData = getRecord(res);
  }
});
Then something embarrassing happened: when I logged the returned res, the table's rows simply were not in it. I later learned that this is because the table data is fetched by a request to a separate interface. There are two ways to handle this kind of dynamic page: simulate a browser, or capture the packets, analyze the request, and call the interface directly.
Simulating a browser
I used the Puppeteer library to simulate clicks on the buttons and the selector, which triggers the different requests, and then read the final data from the page.
const puppeteer = require("puppeteer");
const TEST_URL = "";

const getTargetData = async () => {
  // Start the browser
  const browser = await puppeteer.launch({
    headless: false, // The default is headless mode; normal mode is used here for demonstration
  });
  // Open a new tab in the browser
  const page = await browser.newPage();
  await page.setViewport({
    width: 0,
    height: 1500,
  });
  // Open the page to be crawled in the new tab
  await page.goto(TEST_URL);
  // Store data
  const types = [];
  const names = [];
  const dates = [];
  const poolSelect = [
    { poolName: "Permanent", records: [] },
    { poolName: "New", records: [] },
    { poolName: "Role", records: [] },
    { poolName: "Arms", records: [] },
  ];
  // Read every page of the pool at the given index, then move on to the next pool
  const loadData = async (index) => {
    let end = false;
    await page.waitForTimeout(600 * index);
    // Open the pool selector and pick the pool
    await page.click(".select-container");
    await page.waitForSelector(".ul-list");
    await page.click(`.item:nth-child(${index + 1})`);
    // A delay is required because the request takes time
    await page.waitForTimeout(300);
    while (!end) {
      const { currentData, length } = await page.evaluate(() => {
        // This callback runs inside the page, so it reads the DOM directly
        const table = document.querySelectorAll(".table-content>div")[1];
        const currentTypes = table.querySelectorAll(".type");
        const currentNames = table.querySelectorAll(".name");
        const currentTimes = table.querySelectorAll(".time");
        const currentData = [...currentTypes].map((item, index) => {
          const [name, level] = currentNames[index].innerHTML
            .split("\n")
            .filter((item) => item != null && item.trim().length !== 0)
            .map((item) => item.trim());
          return {
            type: currentTypes[index].innerHTML,
            name,
            level: level ? level : "(3-star)",
            time: currentTimes[index].innerHTML,
          };
        });
        console.log(currentData);
        return {
          length: currentTypes.length,
          currentData,
        };
      });
      poolSelect[index] = {
        ...poolSelect[index],
        records: [...poolSelect[index].records, ...currentData],
      };
      if (length < 6) {
        // Fewer than 6 rows means this was the last page of the pool
        end = true;
        if (index + 1 < poolSelect.length) {
          await loadData(index + 1);
        }
      } else {
        await page.click(".page-item.to-next.selected");
        await page.waitForTimeout(300);
      }
    }
  };
  try {
    await loadData(0);
    console.log(poolSelect);
  } catch (err) {
    console.log(err);
  }
};
This time I could get some of the data, but, perhaps because of my network, records from slower requests were lost. Getting the data correctly required longer waits after each click, which made the crawl take a long time.
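One way to avoid guessing a fixed delay is to poll for a condition and continue as soon as it holds. The helper below is only a sketch of that idea in plain JavaScript: waitUntil, its options, and the simulated request in the usage lines are hypothetical and are not part of Puppeteer or of the code above.

```javascript
// Sketch: poll a condition instead of sleeping for a fixed time.
// waitUntil and its options are hypothetical helpers, not a Puppeteer API.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function waitUntil(check, { interval = 50, timeout = 5000 } = {}) {
  const deadline = Date.now() + timeout;
  while (Date.now() < deadline) {
    // check may be sync or async; any truthy value ends the wait
    const value = await check();
    if (value) return value;
    await sleep(interval);
  }
  throw new Error(`condition not met within ${timeout} ms`);
}

// Usage: simulate a request that finishes after about 120 ms
let rows = null;
setTimeout(() => {
  rows = ['row 1', 'row 2'];
}, 120);
waitUntil(() => rows).then((result) => console.log(result));
```

Within Puppeteer, page.waitForSelector (already used above) and page.waitForResponse serve a similar purpose, so a fixed delay is often only needed as a fallback.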
Packet capture and interface analysis
The packet capture tool I used is Charles on Mac (nicknamed "blue-and-white porcelain" after its icon). Visit the target website with it running to capture the packets.
I ran into a few pitfalls:
- The captured requests are displayed as <unknown>. This is because the URL of the Genshin Impact wish history page is HTTPS, so SSL Proxying needs to be configured: choose Help > SSL Proxying > Install Charles Root Certificate, then set the certificate to Always Trust in Keychain Access.
- That is what I found online, but last night I still saw unknown. I later found that it also has to be set under Proxy > SSL Proxying Settings.
- With that done, you can start analyzing the packets.
The final code
const router = require("koa-router")();
const https = require("https");
const queryString = require("query-string");

const targetUrl =
  "https://hk4e-api.mihoyo.com/event/gacha_info/api/getGachaLog";

router.get("/test", async (ctx, next) => {
  // url is the wish history page URL copied in the preparatory step
  const { url } = ctx.request.query;
  const { query } = queryString.parseUrl(url);
  // Novice 100, Resident 200, Character 301, Weapon 302
  const gacha_types = [100, 200, 301, 302];
  const page = 1;
  const size = 6;
  const end_id = 0;
  const createReq = (page, size, gacha_type, end_id) => {
    return new Promise((resolve, reject) => {
      // The end_id of the first request is 0; the end_id of each following page is
      // the id of the last record on the previous page. When the returned list is
      // shorter than size, the recursion ends.
      const targetQuery = { ...query, gacha_type, page, size, end_id };
      const finalUrl = `${targetUrl}?${queryString.stringify(targetQuery)}`;
      https
        .get(finalUrl, (res) => {
          let info = "";
          res.on("data", function (chunk) {
            info += chunk;
          });
          res.on("end", async function () {
            try {
              const resultList = JSON.parse(info).data.list;
              if (resultList.length < size) {
                console.log(`==== requesting gacha_type ${gacha_type}, page ${page} ====`);
                return resolve(resultList);
              }
              // end_id is the id of the last record on the previous page
              const afterResultList = await createReq(
                page + 1,
                size,
                gacha_type,
                resultList[resultList.length - 1].id
              );
              resolve([...resultList, ...afterResultList]);
            } catch (err) {
              reject(err);
            }
          });
        })
        .on("error", reject);
    });
  };
  // Request all pools in parallel; each pool is fetched only once
  const promiseList = gacha_types.map(async (gacha_type) => {
    const data = await createReq(page, size, gacha_type, end_id);
    return {
      gacha_type,
      // Reverse so that the oldest record comes first
      data: data.reverse(),
    };
  });
  const values = await Promise.all(promiseList);
  console.log(values);
  ctx.body = values;
});
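The end_id paging used by createReq can also be written as a loop. The sketch below replaces the real HTTPS request with an in-memory stand-in so that it runs on its own; fetchAllPages, fakeFetchPage, and the sample records are hypothetical, and the stand-in only imitates the paging behavior described in the comments above.

```javascript
// Sketch: cursor-style paging where each request passes the id of the
// last record from the previous page (end_id), starting from 0.
// fetchAllPages and fakeFetchPage are illustrative names, not part of any API.
async function fetchAllPages(fetchPage, size) {
  const all = [];
  let endId = 0;
  let page = 1;
  while (true) {
    const list = await fetchPage({ page, size, end_id: endId });
    all.push(...list);
    // A short page means there is nothing left to fetch
    if (list.length < size) break;
    endId = list[list.length - 1].id;
    page += 1;
  }
  return all;
}

// In-memory stand-in for the remote interface: 14 records, newest first
const records = Array.from({ length: 14 }, (_, i) => ({ id: 14 - i }));
async function fakeFetchPage({ size, end_id }) {
  // end_id 0 means "start from the newest record"
  const start = end_id === 0 ? 0 : records.findIndex((r) => r.id === end_id) + 1;
  return records.slice(start, start + size);
}

fetchAllPages(fakeFetchPage, 6).then((all) => console.log(all.length));
```

A loop keeps the call stack flat and makes it easy to add a pause between requests if the remote interface turns out to be rate limited.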
Front-end code
The front-end page is built with antd + umi + axios + echarts.
// App.js
import React, { useState, useEffect } from 'react';
import { Spin, Input, Button, Row, Col } from 'antd';
import Axios from 'axios';
import { groupBy, get } from 'lodash';
import PoolCard from './PoolCard'; // assumed path; adjust to the project layout

export default function () {
  const [url, setUrl] = useState('');
  const [loading, setLoading] = useState(false);
  const [result, setResult] = useState({});
  const poolNameMap = {
    newHandPool: {
      gacha_type: 100,
      name: 'Novice pool',
    },
    alwaysPool: {
      gacha_type: 200,
      name: 'Resident pool',
    },
    rolePool: {
      gacha_type: 301,
      name: 'Role pool',
    },
    armsPool: {
      gacha_type: 302,
      name: 'Weapons pool',
    },
  };
  const handleSearch = async () => {
    setLoading(true);
    const { data } = await Axios.get('/test', { params: { url } });
    // Group the response by pool so each pool can be looked up by its gacha_type
    setResult(groupBy(data, 'gacha_type'));
    setLoading(false);
  };
  useEffect(() => {
    console.log(result);
  }, [result]);
  return (
    <Spin tip="Loading... It can take tens of seconds to read the data." delay={200} spinning={loading}>
      <div
        style={{
          width: '100vw',
          display: 'flex',
          flexDirection: 'column',
          alignItems: 'center',
          justifyContent: 'center',
        }}
      >
        <div style={{ display: 'flex', width: '30%', marginBottom: 16 }}>
          {/* antd's Input passes a change event, so read the text from e.target.value */}
          <Input value={url} onChange={(e) => setUrl(e.target.value)} />
          <Button type="primary" onClick={() => { handleSearch(); }} style={{ marginLeft: 8 }}>
            Query
          </Button>
        </div>
        <Row gutter={[16, 16]} style={{ width: '40%' }}>
          <Col span={24}>
            <PoolCard
              name={poolNameMap.alwaysPool.name}
              data={get(result, `${poolNameMap.alwaysPool.gacha_type}.0.data`)}
            />
          </Col>
          <Col span={24}>
            <PoolCard
              name={poolNameMap.rolePool.name}
              data={get(result, `${poolNameMap.rolePool.gacha_type}.0.data`)}
            />
          </Col>
          <Col span={24}>
            <PoolCard
              name={poolNameMap.armsPool.name}
              data={get(result, `${poolNameMap.armsPool.gacha_type}.0.data`)}
            />
          </Col>
          <Col span={24}>
            <PoolCard
              name={poolNameMap.newHandPool.name}
              data={get(result, `${poolNameMap.newHandPool.gacha_type}.0.data`)}
            />
          </Col>
        </Row>
      </div>
    </Spin>
  );
}
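To make the lookup path `${gacha_type}.0.data` in App.js easier to follow, here is a rough plain-JavaScript sketch of what grouping the response by gacha_type produces. The sample response is made up, and groupByKey is a hypothetical, simplified stand-in that only mimics the grouping behavior relied on here.

```javascript
// Sketch: group the /test response by gacha_type, then read one pool's records.
// groupByKey is a simplified stand-in for lodash's groupBy; the data is made up.
function groupByKey(list, key) {
  return list.reduce((groups, item) => {
    const k = String(item[key]);
    (groups[k] = groups[k] || []).push(item);
    return groups;
  }, {});
}

const response = [
  { gacha_type: 200, data: [{ name: 'A', rank_type: '3' }] },
  { gacha_type: 301, data: [{ name: 'B', rank_type: '5' }] },
];

const result = groupByKey(response, 'gacha_type');
// Each group is an array with a single entry, hence the ".0.data" path
console.log(result['301'][0].data);
```

Because the back end returns exactly one entry per pool, index 0 of each group always holds that pool's record list.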
// PoolCard.jsx
import React, { useState, useEffect } from 'react';
import { Card, Empty, Tag } from 'antd';
import { isNil, filter, get, map } from 'lodash';
import PieChart from '../PieChart';

const PoolCard = ({ name, data }) => {
  const [goldCards, setGoldCards] = useState([]);
  const [noGoldTimes, setNoGoldTimes] = useState();

  // For each 5-star, record how many pulls it took since the previous 5-star
  const get5LevelDetail = (data) => {
    if (isNil(data)) {
      return [];
    }
    let lastLocation = 0;
    const goldThings = [];
    data.forEach((item, index) => {
      if (get(item, 'rank_type') === '5') {
        goldThings.push({
          nums: index - lastLocation + 1,
          detail: item,
        });
        lastLocation = index + 1;
      }
    });
    // Pulls made since the most recent 5-star
    setNoGoldTimes(data.length - lastLocation);
    return goldThings;
  };

  // Count the records of each rarity for the pie chart
  const getChartData = (data) => {
    if (isNil(data)) {
      return;
    }
    let level_5 = 0;
    let level_4 = 0;
    let level_3 = 0;
    data.forEach((item) => {
      if (get(item, 'rank_type') === '5') {
        level_5++;
      }
      if (get(item, 'rank_type') === '4') {
        level_4++;
      }
      if (get(item, 'rank_type') === '3') {
        level_3++;
      }
    });
    return [
      { name: '5-star', value: level_5 },
      { name: '4-star', value: level_4 },
      { name: '3-star', value: level_3 },
    ];
  };

  useEffect(() => {
    setGoldCards(get5LevelDetail(data));
  }, [data]);

  return (
    <Card style={{ position: 'relative' }}>
      <div>
        <h3>{name}</h3>
        <div style={{ display: 'flex', alignItems: 'center' }}>
          {!isNil(noGoldTimes) ? (
            <div>
              No 5-star in the last
              <Tag color="processing" style={{ marginLeft: 8 }}>{noGoldTimes}</Tag>
              pulls,
              <Tag color="warning" style={{ marginLeft: 8 }}>{90 - noGoldTimes}</Tag>
              pulls left until the guarantee
            </div>
          ) : null}
        </div>
      </div>
      {!isNil(data) ? (
        <div style={{ position: 'absolute', right: 0, top: 0, bottom: 0, margin: 'auto 0' }}>
          <PieChart name={name} data={getChartData(data)} />
        </div>
      ) : null}
      {!isNil(data) ? (
        <>
          <div style={{ display: 'flex', justifyContent: 'space-between', maxWidth: 300 }}>
            <div>
              <p>Total pulls: {data.length}</p>
              <p>
                5-star count:
                <span style={{ marginRight: 8 }}>{get(goldCards, 'length')}</span>
                {map(goldCards, (goldCard, index) => (
                  <Tag color="success" key={index}>
                    {goldCard.detail.name}({goldCard.nums})
                  </Tag>
                ))}
              </p>
              <p>4-star count: {filter(data, (item) => get(item, 'rank_type') === '4').length}</p>
              <p>3-star count: {filter(data, (item) => get(item, 'rank_type') === '3').length}</p>
            </div>
          </div>
        </>
      ) : (
        <Empty />
      )}
    </Card>
  );
};

export default PoolCard;

// PieChart.jsx
import React, { useEffect, useState } from 'react';
import ReactEcharts from 'echarts-for-react';
import config from './config';

// Legend text color for each rarity; the keys match the names produced by getChartData
const colorMap = {
  '5-star': 'yellow',
  '4-star': 'purple',
  '3-star': 'blue',
};

const RecordPieChart = ({ name, data }) => {
  const [options, setOptions] = useState({});
  useEffect(() => {
    config.title.text = '';
    config.legend.data = data.map((item) => {
      return {
        name: item.name,
        textStyle: {
          color: colorMap[item.name],
        },
      };
    });
    config.series[0].data = data;
    // Pass a new object so that React registers the change
    setOptions({ ...config });
  }, [name, data]);
  return (
    <div>
      <ReactEcharts style={{ width: 400, height: 250 }} option={options} />
    </div>
  );
};

export default RecordPieChart;
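The counting done in PoolCard's get5LevelDetail and getChartData can also be tried outside React as a plain function. This is a minimal sketch under the same assumptions as the component: the records are ordered oldest first, rank_type is a string, and the guarantee is reached at 90 pulls. summarizePool and the sample records are hypothetical.

```javascript
// Sketch: compute the statistics shown on a pool card from a list of records.
// summarizePool and the sample data are illustrative, not part of the project.
function summarizePool(records, pity = 90) {
  const counts = { '5': 0, '4': 0, '3': 0 };
  const fiveStars = [];
  let lastLocation = 0; // index just after the most recent 5-star
  records.forEach((item, index) => {
    if (Object.prototype.hasOwnProperty.call(counts, item.rank_type)) {
      counts[item.rank_type] += 1;
    }
    if (item.rank_type === '5') {
      // Pulls spent on this 5-star, counted from the previous one
      fiveStars.push({ name: item.name, pulls: index - lastLocation + 1 });
      lastLocation = index + 1;
    }
  });
  const sinceLast = records.length - lastLocation;
  return {
    total: records.length,
    counts,
    fiveStars,
    sinceLast,
    untilPity: pity - sinceLast,
  };
}

const sample = [
  { name: 'a', rank_type: '3' },
  { name: 'b', rank_type: '4' },
  { name: 'c', rank_type: '5' },
  { name: 'd', rank_type: '3' },
];
console.log(summarizePool(sample));
```

Keeping the calculation in a pure function like this makes it possible to check the numbers against a few hand-counted records before wiring them into the page.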
Summary
After a day of effort, I can finally check my own wish records with ease, which takes some of the stress out of pulling. So:
Even if no one applauds you at the end, take your curtain call gracefully and thank yourself for your hard work.