First, shengun Town building
Background
Recently our boss got hooked on "eating chicken" (the PUBG mobile game "Army Attack") and often drags us into premade squads, so we give up our lunch breaks to keep the boss company in the desert. Last week, while looking at match records in the WeChat game channel, it occurred to me that I could capture a lot of battle data this way and then analyze it for patterns.
To show off a little: when playing as a premade squad our team wins very often, with 51 wins in nearly 100 games.
After a quick evaluation we decided the idea was feasible, so we got started.
Step 1 Analyze the data interface
The first step, of course, is to collect the data. To do that we need to understand what happens behind the page, that is, how the page fetches the battle data.
Use Charles to capture packets
How packet capture works
On a Mac, Charles is the recommended tool for capturing a phone's traffic at the protocol level. The principle is to run a proxy server on the Mac and point the phone's network proxy at the Mac, so that all of the phone's traffic passes through our proxy server. The general process is as follows:
Processing HTTPS encrypted traffic
In practice we found that all WeChat traffic goes over HTTPS, so the data we captured was encrypted and of no use to us. After some research, we found that installing the Charles root certificate on both the phone and the computer makes it possible to inspect HTTPS traffic. For details, see:
- Charles HTTPS packet capture for MAC and iPhone
- Charles failed to capture Https requests in iOS 11
After installing the certificate and completing the configuration above, we can read the HTTPS request and response data, as shown in the figure below.
- You can do the same thing with Fiddler on Windows
- In fact, this is a typical man-in-the-middle scenario
Data interface
Next, we pick out the interfaces we need from the captured traffic. After analysis, three interfaces are involved:
- Interface for obtaining user information
- Interface for obtaining the user's battle record list
- Interface for obtaining the details of a specified battle
Let’s look at them one by one.
1. Interface for obtaining user information
- request
API | /cgi-bin/gamewap/getpubgmdatacenterindex |
---|---|
Method | GET |
Parameters | openid, plat_id, uin, key, pass_ticket |
Cookies | key, pass_ticket, uin, pgv_pvid, sd_cookie_crttime, sd_userid |
- response
{
  "user_info": {
    "openid": "oODfo0pjBQkcNuR4XLTQ321xFVws",
    "head_img_url": "http://wx.qlogo.cn/mmhead/Q3auHgzwzM5hSWxxxxxUQPwW9ibxxxx9DlxLTsKWk97oWpDI0rg/96",
    "nick_name": "hope",
    "role_name": "xxxx",
    "zone_area_id": 0,
    "plat_id": 1
  },
  "battle_info": {
    "total_1": 75,
    "total_10": 336,
    "total_game": 745,
    "total_kill": 1669
  },
  "battle_list": [{
    "map_id": 1,
    "room_id": "6575389198189071197",
    "team_id": 57,
    "dt_event_time": 1530953799,
    "rank_in_ds": 3,
    "times_kill": 1,
    "label": "top five",
    "team_type": 1,
    "award_gold": 677,
    "mode": 0
  }],
  "appitem": {
    "AppID": "wx13051697527efc45",
    "IconURL": "https://mmocgame.qpic.cn/wechatgame/mEMdfrX5RU0dZFfNEdCsMJpfsof1HE0TP3cfZiboX0ZPxqh5aZnHjxPFXUGgsXmibe/0",
    "Name": "PUBG: Army Attack",
    "BriefName": "PUBG: Army Attack",
    "Desc": "Official licensed PUBG mobile game",
    "Brief": "Shooter | 808.2M",
    "WebURL": "https://game.weixin.qq.com/cgi-bin/h5/static/detail_v2/index.html?wechat_pkgid=detail_v2&appid=wx13051697527efc45&show_bubble=0",
    "DownloadInfo": {
      "DownloadURL": "https://itunes.apple.com/cn/app/id1304987143",
      "DownloadFlag": 5
    },
    "Status": 0,
    "AppInfoFlag": 45,
    "Label": [],
    "AppStorePopUpDialogConfig": {
      "Duration": 1500,
      "Interval": 172800,
      "ServerTimestamp": 1531066098
    },
    "HasEnabledChatGroup": false,
    "AppType": 0,
    "game_tag_list": ["PUBG", "faithful to the original", "team up with friends", "100-player battles", "large map"],
    "recommend_reason": "Official PUBG, wilderness shooting",
    "size_desc": "808.2M"
  },
  "is_guest": true,
  "is_blocked": false,
  "errcode": 0,
  "errmsg": "ok"
}
- Analysis
- The openid is the user's unique identifier.
2. Interface for obtaining the user's battle record list
- request
API | /cgi-bin/gamewap/getpubgmbattlelist |
---|---|
Method | GET |
Parameters | openid, pass_ticket, plat_id, after_time, limit |
Cookies | key, pass_ticket, uin, pgv_pvid, sd_cookie_crttime, sd_userid |
- response
{
  "errcode": 0,
  "errmsg": "ok",
  "next_after_time": 1528120556,
  "battle_list": [{
    "map_id": 1,
    "room_id": "6575389198111172597",
    "team_id": 57,
    "dt_event_time": 1530953799,
    "rank_in_ds": 3,
    "times_kill": 1,
    "label": "top five",
    "team_type": 1,
    "award_gold": 677,
    "mode": 0
  }, {
    "map_id": 1,
    "room_id": "6575336498940384115",
    "team_id": 11,
    "dt_event_time": 1530941404,
    "rank_in_ds": 5,
    "times_kill": 2,
    "label": "top five",
    "team_type": 1,
    "award_gold": 632,
    "mode": 0
  }],
  "has_next": true
}
- Analysis
- This interface paginates with after_time. When traversing, use has_next and next_after_time from the response to decide whether there is a next page and where it starts.
- The room_id in the list is the unique identifier of each battle.
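The pagination described above can be sketched as a small generator. This is only an illustration: `fetch_page` is a hypothetical callable standing in for the real HTTP request, and the fake pages below exist only so that the sketch runs offline.

```python
def iter_battles(fetch_page, after_time=0):
    """Yield every battle by following has_next / next_after_time."""
    while True:
        page = fetch_page(after_time)
        for battle in page.get("battle_list", []):
            yield battle
        if not page.get("has_next"):
            break
        # The next request starts where this page ended.
        after_time = page["next_after_time"]


# Hypothetical stand-in for the battle list interface (no network access).
FAKE_PAGES = {
    0: {"battle_list": [{"room_id": "a"}, {"room_id": "b"}],
        "has_next": True, "next_after_time": 1528120556},
    1528120556: {"battle_list": [{"room_id": "c"}], "has_next": False},
}


def fake_fetch(after_time):
    return FAKE_PAGES[after_time]


room_ids = [battle["room_id"] for battle in iter_battles(fake_fetch)]
print(room_ids)
```

In a real crawler, `fetch_page` would wrap the GET request to the battle list interface and return the parsed JSON.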
3. Interface for obtaining user record details
- request
API | /cgi-bin/gamewap/getpubgmbattledetail |
---|---|
Method | GET |
Parameters | openid, pass_ticket, room_id |
Cookies | key, pass_ticket, uin, pgv_pvid, sd_cookie_crttime, sd_userid |
- response
{
  "errcode": 0,
  "errmsg": "ok",
  "base_info": {
    "nick_name": "pomelo tea",
    "head_img_url": "http://wx.qlogo.cn/mmhead/xxxx/96",
    "dt_event_time": 1528648165,
    "team_type": 4,
    "rank": 1,
    "player_count": 100,
    "role_sex": 1,
    "label": "good luck",
    "openid": "oODfo0s1w5lWjmxxxxxgQkcCljXQ"
  },
  "battle_info": {
    "award_gold": 622,
    "times_kill": 6,
    "times_head_shot": 0,
    "damage": 537,
    "times_assist": 3,
    "survival_duration": 1629,
    "times_save": 0,
    "times_reborn": 0,
    "vehicle_kill": 1,
    "forward_distance": 10140,
    "driving_distance": 5934,
    "dead_poison_circle_no": 6,
    "top_kill_distance": 223,
    "top_kill_distance_weapon_use": 2924130819,
    "be_kill_user": {
      "nick_name": "Asahi",
      "head_img_url": "http://wx.qlogo.cn/mmhead/ibLButGMnqJNFsUtStNEV8tzlH1QpwPiaF9kxxxxx66G3ibjic6Ng2Rcg/96",
      "weapon_use": 20101000001,
      "openid": "oODfo0qrPLExxxxc0QKjFPnPxyI"
    },
    "label": "prosperous"
  },
  "team_info": {
    "user_list": [{
      "nick_name": "ooo",
      "times_kill": 6,
      "assist_count": 3,
      "survival_duration": 1638,
      "award_gold": 632,
      "head_img_url": "http://wx.qlogo.cn/mmhead/Q3auHgzwzM4k4RXdyxavNxxxxUjcX6Tl47MNNV1dZDliazRKRg",
      "openid": "..."
    }, {
      "nick_name": "I eat fried meat",
      "times_kill": 2,
      "assist_count": 2,
      "survival_duration": 1502,
      "award_gold": 583,
      "head_img_url": "http://wx.qlogo.cn/mmhead/sTJptKvBQLKd5SAAjOF0VrwiapUxxxxFffxoDUcrVjYbDf9pNENQ",
      "openid": "oODfo0gIyDxxxxZpUrSrpapZSDT0"
    }]
  },
  "is_guest": true,
  "is_blocked": false
}
- Analysis
- This interface returns detailed battle information, including kills, headshots, revives, distance travelled and so on, which is enough for our analysis.
- The response also contains the openid of each teammate and of the player who killed the user, so we can spread outward from one user to more and more users without limit.
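Spreading to new users comes down to pulling every openid out of a battle detail response. A minimal sketch, assuming the field layout shown in the sample response (the helper name `collect_openids` and the sample ids are made up for illustration):

```python
def collect_openids(detail):
    """Return the set of openids found in one battle-detail response."""
    found = set()

    # The player the detail belongs to.
    base = detail.get("base_info", {})
    if base.get("openid"):
        found.add(base["openid"])

    # The player who killed this user, if any.
    killer = detail.get("battle_info", {}).get("be_kill_user") or {}
    if killer.get("openid"):
        found.add(killer["openid"])

    # Every teammate in the squad.
    for member in detail.get("team_info", {}).get("user_list", []):
        if member.get("openid"):
            found.add(member["openid"])
    return found


# Made-up sample with the same shape as the response above.
sample = {
    "base_info": {"openid": "self-id"},
    "battle_info": {"be_kill_user": {"openid": "killer-id"}},
    "team_info": {"user_list": [{"openid": "self-id"}, {"openid": "mate-id"}]},
}
print(sorted(collect_openids(sample)))
```

Each openid returned here can become the entry point of another battle list crawl.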
Values such as pass_ticket in the cookies are presumably used for permission checks. They did not change across the requests above, so we do not need to work out how they are computed. We only need to capture the packets, extract the values, and fill them into the code.
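Filling the captured values into the request URLs might look like the sketch below. The host is the one seen in the captured traffic and the parameter names come from the tables above. The function names, the `limit` default, and the sample values are illustrative assumptions, not part of the original code.

```python
from urllib.parse import urlencode

# Host taken from the captured traffic.
BASE = "https://game.weixin.qq.com/cgi-bin/gamewap/"


def battle_list_url(openid, pass_ticket, plat_id, after_time=0, limit=20):
    """Build the battle list URL (limit=20 is an assumed default)."""
    query = urlencode({
        "openid": openid,
        "pass_ticket": pass_ticket,
        "plat_id": plat_id,
        "after_time": after_time,
        "limit": limit,
    })
    return BASE + "getpubgmbattlelist?" + query


def battle_detail_url(openid, pass_ticket, room_id):
    """Build the battle detail URL for one room_id."""
    query = urlencode({
        "openid": openid,
        "pass_ticket": pass_ticket,
        "room_id": room_id,
    })
    return BASE + "getpubgmbattledetail?" + query


# Placeholder values; real ones come from the captured packets.
list_url = battle_list_url("some-openid", "some-ticket", 0)
detail_url = battle_detail_url("some-openid", "some-ticket", "6575389198111172597")
print(list_url)
print(detail_url)
```

Using `urlencode` rather than string formatting keeps special characters in the captured ticket correctly escaped.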
Step 2 Crawl data
Now that the interface has been identified, it’s time to grab enough data.
Use Requests to call the interfaces and get the data
import codecs
import os

import requests
import simplejson

# settings and rediskeys are the project's own modules
url = 'https://game.weixin.qq.com/cgi-bin/gamewap/getpubgmdatacenterindex?openid=%s&plat_id=0&uin=&key=&pass_ticket=%s' % (openid, settings.pass_ticket)
# reuse the cookies and headers extracted from the captured packets
r = requests.get(url=url, cookies=settings.def_cookies, headers=settings.def_headers, timeout=(5.0, 5.0))
tmp = r.json()
# save the response as pretty-printed JSON, one file per user
wfile = os.path.join(settings.res_userInfo_dir, '%s.txt' % (rediskeys.user(openid)))
with codecs.open(wfile, 'w', 'utf-8') as wf:
    wf.write(simplejson.dumps(tmp, indent=2, sort_keys=True, ensure_ascii=False))
The calls to the other two interfaces can be written quickly in the same way.
Use Redis to mark information that has been crawled
In the interfaces above, we may find user B's openid starting from user A, and then find user A's openid again starting from user B. To avoid collecting the same data repeatedly, we need to record what has already been collected. Core snippet:
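# rediskeys.user_battle_list: Redis key that marks a user's battle list as crawled
def user_battle_list(openid):
    return 'ubl_%s' % (openid)

# inside the task: skip the user if the battle list has already been crawled
if settings.DataRedis.get(rediskeys.user_battle_list(openid)):
    return True
# after collecting: mark the user as crawled
settings.DataRedis.set(rediskeys.user_battle_list(openid), 1)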
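The check-then-mark pattern can be tried without a Redis server. In the sketch below, `FakeRedis` is a made-up in-memory stand-in that only mimics `get` and `set`, and `crawl_once` is a hypothetical helper name.

```python
class FakeRedis:
    """In-memory stand-in offering get/set, just for this sketch."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


def user_battle_list_key(openid):
    return 'ubl_%s' % openid


def crawl_once(store, openid, crawl):
    """Run crawl(openid) only if this openid has not been marked yet."""
    key = user_battle_list_key(openid)
    if store.get(key):
        return False  # already collected, skip
    crawl(openid)
    store.set(key, 1)  # mark as collected
    return True


store = FakeRedis()
crawled = []
first = crawl_once(store, 'user_a', crawled.append)
second = crawl_once(store, 'user_a', crawled.append)
print(first, second, crawled)
```

Note that a separate get followed by set is not atomic, so two workers could still pick up the same user at the same moment. If that matters, one option is Redis's `SET` with the `NX` flag, which sets the key only when it does not exist yet.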
Use celery to manage queues
Celery is a very handy distributed task queue. This time I only run celery on my own computer, so I am not using its distributed features. We create three tasks and three queues:
task_queues = (
Queue('queue_get_battle_info', exchange=Exchange('priority', type='direct'), routing_key='gbi'),
Queue('queue_get_battle_list', exchange=Exchange('priority', type='direct'), routing_key='gbl'),
Queue('queue_get_user_info', exchange=Exchange('priority', type='direct'), routing_key='gui'),
)
task_routes = ([
('get_battle_info', {'queue': 'queue_get_battle_info'}),
('get_battle_list', {'queue': 'queue_get_battle_list'}),
('get_user_info', {'queue': 'queue_get_user_info'}),
],)
Then, inside each task, we combine the API requests with the Redis markers to implement the complete task logic, for example:
@app.task(name='get_battle_list')
def get_battle_list(openid, plat_id=None, after_time=0, update_time=None):
    # skip users whose battle list has already been collected
    if settings.DataRedis.get(rediskeys.user_battle_list(openid)):
        return True
    if not plat_id:
        try:
            us = handles.get_user_info_handles(openid)
            plat_id = us['plat_id']
        except Exception as e:
            print 'can not get user plat_id', openid, traceback.format_exc()
            return False
    # fetch the battle list
    battle_list = handles.get_battle_list_handle(openid, plat_id, after_time=0, update_time=None)
    # queue a detail task for every battle that has not been crawled yet
    # (assumption: the handle returns an iterable of room ids)
    for room_id in battle_list:
        if not settings.DataRedis.get(rediskeys.user_battle(openid, room_id)):
            get_battle_info.delay(openid, plat_id, room_id)
    return True
Start crawling
Since our crawler works by spreading outward from known users, we need to give it an entry user, so we manually create a collection task for one user:
from tasks.all import get_battle_list
my_openid = 'oODfo0oIErZI2xxx9xPlVyQbRPgY'
my_platid = '0'
get_battle_list.delay(my_openid, my_platid, after_time=0, update_time=None)
Once there is an entry point, we start the workers with celery to begin crawling:
celery -A tasks.all worker -c 5 --queue=queue_get_user_info --loglevel=info -n get_user_info@%h
celery -A tasks.all worker -c 5 --queue=queue_get_battle_list --loglevel=info -n get_battle_list@%h
celery -A tasks.all worker -c 30 --queue=queue_get_battle_info --loglevel=info -n get_battle_info@%h
Now our crawler can run happily. We can also check the execution status with celery's Flower:
celery flower -A tasks.all --broker=redis://:$REDIS_PASS@$REDIS_HOST:$REDIS_PORT/10
With Flower, we can see that the crawl runs quite efficiently. During execution, get_battle_list runs so fast that get_battle_info builds up a backlog even with 30 concurrent workers, so the workers need to be stopped in time. We stopped after capturing 200,000 battle records.
Step 3 Data analysis
Analysis plan
The data for 200,000 battles has been captured, split into JSON files and stored on my local disk. Next comes some simple analysis. Python is well suited to data analysis and has many libraries for it, such as Pandas and NumPy, but I have not studied them, so I wrote a very simple program to do some shallow analysis. If you want to do in-depth analysis and do not want to crawl the data yourself, you can contact me and I will package the data for you.
# coding=utf-8
import os
import json
import datetime
import math
from conf import settings
class UserTeamTypeData:
    """Accumulated stats for one team type / player count bucket."""

    def __init__(self, team_type, player_count):
        self.team_type = team_type
        self.player_count = player_count
        self.label = {}
        self.dead_poison_circle_no = {}
        self.count = 0
        self.damage = 0
        self.survival_duration = 0  # survival time
        self.driving_distance = 0
        self.forward_distance = 0
        self.times_assist = 0  # assists
        self.times_head_shot = 0
        self.times_kill = 0
        self.times_reborn = 0  # times revived by others
        self.times_save = 0  # times reviving others
        self.top_kill_distance = []
        self.top_kill_distance_weapon_use = {}
        self.vehicle_kill = 0  # kills by vehicle
        self.award_gold = 0
        self.times_reborn_by_role_sex = {0: 0, 1: 0}  # 0 = male, 1 = female
        self.times_save_by_role_sex = {0: 0, 1: 0}  # 0 = male, 1 = female

    def update_dead_poison_circle_no(self, dead_poison_circle_no):
        if dead_poison_circle_no in self.dead_poison_circle_no:
            self.dead_poison_circle_no[dead_poison_circle_no] += 1
        else:
            self.dead_poison_circle_no[dead_poison_circle_no] = 1

    def update_times_reborn_and_save_by_role_sex(self, role, times_reborn, times_save):
        if role not in self.times_reborn_by_role_sex:
            return
        self.times_reborn_by_role_sex[role] += times_reborn
        self.times_save_by_role_sex[role] += times_save

    def update_top_kill_distance_weapon_use(self, weaponid):
        if weaponid not in self.top_kill_distance_weapon_use:
            self.top_kill_distance_weapon_use[weaponid] = 1
        else:
            self.top_kill_distance_weapon_use[weaponid] += 1

class UserBattleData:
    """All battles of one user, loaded from that user's JSON files."""

    def __init__(self, openid):
        self.openid = openid
        self.team_type_res = {}
        self.label = {}
        self.hour_counter = {}
        self.weekday_counter = {}
        self.usetime = 0  # total play time in minutes
        self.day_record = set()
        self.battle_counter = 0

    def get_avg_use_time_per_day(self):
        # average minutes per day, over the days on which the user played
        return self.usetime / len(self.day_record)

    def update_label(self, lable):
        if lable in self.label:
            self.label[lable] += 1
        else:
            self.label[lable] = 1

    def get_team_type_data(self, team_type, player_count):
        # bucket the player count in steps of 10
        player_count = int(math.ceil(float(player_count) / 10))
        team_type_key = '%d_%d' % (team_type, player_count)
        if team_type_key not in self.team_type_res:
            userteamtypedata = UserTeamTypeData(team_type, player_count)
            self.team_type_res[team_type_key] = userteamtypedata
        else:
            userteamtypedata = self.team_type_res[team_type_key]
        return userteamtypedata

    def update_user_time_property(self, dt_event_time):
        dt_event_time = datetime.datetime.fromtimestamp(dt_event_time)
        hour = dt_event_time.hour
        if hour in self.hour_counter:
            self.hour_counter[hour] += 1
        else:
            self.hour_counter[hour] = 1
        weekday = dt_event_time.weekday()
        if weekday in self.weekday_counter:
            self.weekday_counter[weekday] += 1
        else:
            self.weekday_counter[weekday] = 1
        self.day_record.add(dt_event_time.date())

    def update_battle_info_by_room(self, roomid):
        file = os.path.join(settings.Res_UserBattleInfo_Dir, self.openid, '%s.txt' % roomid)
        with open(file, 'r') as rf:
            battledata = json.load(rf)
        self.battle_counter += 1
        base_info = battledata['base_info']
        self.update_user_time_property(base_info['dt_event_time'])
        battle_info = battledata['battle_info']
        userteamtypedata = self.get_team_type_data(base_info['team_type'], base_info['player_count'])
        userteamtypedata.count += 1
        userteamtypedata.award_gold += battle_info['award_gold']
        userteamtypedata.damage += battle_info['damage']  # counted once per battle
        userteamtypedata.update_dead_poison_circle_no(battle_info['dead_poison_circle_no'])
        userteamtypedata.driving_distance += battle_info['driving_distance']
        userteamtypedata.forward_distance += battle_info['forward_distance']
        self.update_label(battle_info['label'])
        userteamtypedata.survival_duration += battle_info['survival_duration']
        # survival_duration is in seconds; keep fractional minutes
        self.usetime += battle_info['survival_duration'] / 60.0
        userteamtypedata.times_assist += battle_info['times_assist']
        userteamtypedata.times_head_shot += battle_info['times_head_shot']
        userteamtypedata.times_kill += battle_info['times_kill']
        userteamtypedata.times_reborn += battle_info['times_reborn']
        userteamtypedata.times_save += battle_info['times_save']
        userteamtypedata.top_kill_distance.append(battle_info['top_kill_distance'])
        userteamtypedata.update_times_reborn_and_save_by_role_sex(
            base_info['role_sex'], battle_info['times_reborn'], battle_info['times_save'])

    def get_user_battleinfo_rooms(self):
        user_dir = os.path.join(settings.Res_UserBattleInfo_Dir, self.openid)
        r = [room for room in os.listdir(user_dir)]
        r = [rr.replace('.txt', '') for rr in r]
        return r

class AllUserCounter:
    """Totals across all users."""

    def __init__(self):
        self.hour_counter = {hour: 0 for hour in range(24)}
        self.weekday_counter = {weekday: 0 for weekday in range(7)}
        self.times_reborn_by_role_sex = {0: 0, 1: 0}  # 0 = male, 1 = female
        self.times_save_by_role_sex = {0: 0, 1: 0}  # 0 = male, 1 = female
        self.user_count = 0
        self.battle_count = 0
        self.every_user_use_time_per_day = []
        self.top_kill_distance = 0

    def avg_use_time(self):
        return sum(self.every_user_use_time_per_day) / len(self.every_user_use_time_per_day)

    def add_user_data(self, userbattledata):
        self.every_user_use_time_per_day.append(userbattledata.get_avg_use_time_per_day())
        self.battle_count += userbattledata.battle_counter
        self.user_count += 1
        for k in userbattledata.hour_counter:
            if k in self.hour_counter:
                self.hour_counter[k] += userbattledata.hour_counter[k]
            else:
                self.hour_counter[k] = userbattledata.hour_counter[k]
        for weekday in userbattledata.weekday_counter:
            if weekday in self.weekday_counter:
                self.weekday_counter[weekday] += userbattledata.weekday_counter[weekday]
            else:
                self.weekday_counter[weekday] = userbattledata.weekday_counter[weekday]
        for userteamtype in userbattledata.team_type_res:
            userteamtypedata = userbattledata.team_type_res[userteamtype]
            for k in userteamtypedata.times_reborn_by_role_sex:
                self.times_reborn_by_role_sex[k] += userteamtypedata.times_reborn_by_role_sex[k]
            for k in userteamtypedata.times_save_by_role_sex:
                self.times_save_by_role_sex[k] += userteamtypedata.times_save_by_role_sex[k]
            # top_kill_distance is a list per team type, so compare its maximum
            if userteamtypedata.top_kill_distance:
                user_max = max(userteamtypedata.top_kill_distance)
                if user_max > self.top_kill_distance:
                    self.top_kill_distance = user_max

    def __str__(self):
        res = []
        res.append('Total users\t%d' % self.user_count)
        res.append('Total battles\t%d' % self.battle_count)
        res.append('Avg. daily play time (min)\t%d' % self.avg_use_time())
        res.append('Longest kill\t%d' % self.top_kill_distance)
        res.append('Male characters\trevived %d times\trevived others %d times' % (self.times_reborn_by_role_sex[0], self.times_save_by_role_sex[0]))
        res.append('Female characters\trevived %d times\trevived others %d times' % (self.times_reborn_by_role_sex[1], self.times_save_by_role_sex[1]))
        res.append('Distribution by hour')
        for hour in range(0, 24):
            res.append('\t%d: %d %.2f%%' % (hour, self.hour_counter[hour], self.hour_counter[hour]/float(self.battle_count)*100))
        res.append('Distribution by weekday')
        for weekday in range(0, 7):
            res.append('\t%d: %d %.2f%%' % (weekday+1, self.weekday_counter[weekday], (self.weekday_counter[weekday]/float(self.battle_count)*100)))
        return '\n'.join(res)

def get_user_battleinfo_rooms(openid):
    user_dir = os.path.join(settings.Res_UserBattleInfo_Dir, openid)
    r = [room for room in os.listdir(user_dir)]
    r = [rr.replace('.txt', '') for rr in r]
    return r

if __name__ == '__main__':
    alluserconter = AllUserCounter()
    folders = os.listdir(settings.Res_UserBattleInfo_Dir)
    i = 0
    for folder in folders:
        i += 1
        # progress: one folder per user
        print('%d / %d %s' % (i, len(folders), folder))
        userbattledata = UserBattleData(folder)
        for room in userbattledata.get_user_battleinfo_rooms():
            userbattledata.update_battle_info_by_room(room)
        alluserconter.add_user_data(userbattledata)
    print("\n" * 3)
    print("---------------------------------------")
    print(alluserconter)
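The hand-written counter updates in the script can be expressed more compactly with `collections.Counter`. The sketch below is only an illustration of the hour and weekday distribution, not the program used for the results. It assumes UTC so that the output is reproducible, whereas the script above uses the machine's local time, and `time_distribution` is a made-up helper name. The timestamps are the dt_event_time values from the sample responses.

```python
import datetime
from collections import Counter


def time_distribution(timestamps, tz=datetime.timezone.utc):
    """Count battles per hour of day and per weekday (0 = Monday)."""
    hours = Counter()
    weekdays = Counter()
    for ts in timestamps:
        dt = datetime.datetime.fromtimestamp(ts, tz)
        hours[dt.hour] += 1
        weekdays[dt.weekday()] += 1
    return hours, weekdays


# dt_event_time values taken from the sample responses in this article.
hours, weekdays = time_distribution([1530953799, 1530941404, 1528648165])
total = sum(hours.values())
for hour in sorted(hours):
    print('%d: %d %.2f%%' % (hour, hours[hour], hours[hour] * 100.0 / total))
for weekday in sorted(weekdays):
    print('weekday %d: %d' % (weekday + 1, weekdays[weekday]))
```

A `Counter` returns 0 for missing keys, so the `if key in dict` branches are no longer needed.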
The results of the analysis
1. Users play for an average of 2 hours per day
According to the distribution chart, most users play for more than 1 hour a day, and the most dedicated players play for more than 8 hours.
Note: I only counted the duration of each match, so the actual time online is longer than this figure.
2. Female characters are revived more often than male characters
Now I finally understand why so many male players pick female characters: it turns out they get an advantage in the game.
3. Female characters revive teammates more often than male characters
That gives you a good reason to bring a girl along when climbing the ranks.
4. Fridays are busiest
I guess everyone is busy handing in assignments and writing weekly reports on Fridays.
5. Gaming peaks at 22:00 (10 p.m.)
There are still so many people playing in the early hours of the morning. Don’t you sleep?
6. The longest kill distance is 639 meters
I looked up the range of the 98K, SKS and AWP, and they are all roughly within 800 meters, so this value is plausible. By that measure, those extreme long-range kills on TikTok were probably staged.
7. “Heal the wounded and save the dying” is the hardest title to earn
As you can see from the distribution, earning the title for healing the wounded and saving the dying is harder than getting ten kills.
Most of the characters awarded this lifesaving title were female, which proves once again that you need to bring a girl to the game. Getting back to the essence of the game: it is a survival game, and nothing is more important than staying alive.
In closing
This crawler could collect so much data mainly because the WeChat game channel lets you view strangers’ battle records. The same method can be used to analyze data from Honor of Kings and other games, so if you are interested, give it a try. Finally, the UMP9 is a great gun, and it feels very good with a 2x scope.