First, shengun Town building

Background

Recently my boss fell in love with "eating chicken" (the mobile game PUBG: Army Attack) and often drags us into team games, so we have had to give up our lunch breaks to keep him company in the desert. Last week, while looking at match records in the game channel of WeChat, it occurred to me that a lot of battle data could be captured this way and then analyzed for patterns.

First, a little showing off: when we queue as a premade squad, our team wins a lot, 51 chicken dinners in nearly 100 games.

After a quick evaluation we decided it was feasible, so let's go.

Step 1 Analyze the data interface

The first step, of course, is to collect the stats. To do that we need to understand what happens behind the page, that is, how the page fetches the battle data.

Use Charles to capture packets

Packet capture setup

On a Mac, Charles is a good choice for capturing a phone's traffic at the protocol layer. The principle is to run a proxy server on the Mac and point the phone's network proxy at the Mac, so that all of the phone's traffic passes through our proxy server. The general process is as follows:

Processing HTTPS encrypted traffic

In practice we found that all WeChat traffic goes over HTTPS, so everything we captured was encrypted and useless to us. After some research, it turns out that installing the Charles root certificate on both the phone and the computer lets Charles decrypt the HTTPS traffic. For details, see:

  • Charles HTTPS packet capture for Mac and iPhone
  • Charles failed to capture HTTPS requests in iOS 11

With the certificate installed and the configuration above in place, we can read the HTTPS request and response data, as shown in the figure below.

  • You can do the same thing with Fiddler on Windows
  • In fact, this is a very typical man-in-the-middle scenario

Data interface

Next, we pick out the interfaces we need from the captured data. After analysis, three interfaces are mainly involved:

  • Interface for obtaining user information
  • Interface for obtaining the user's battle list
  • Interface for obtaining the details of a specific battle

Let's look at them one by one.

1. Interface for obtaining user information

  • request
    API /cgi-bin/gamewap/getpubgmdatacenterindex
    method GET
    parameters openid, plat_id, uin, key, pass_ticket
    cookie keys pass_ticket, uin, pgv_pvid, sd_cookie_crttime, sd_userid
  • response
{
  "user_info": {
    "openid": "oODfo0pjBQkcNuR4XLTQ321xFVws",
    "head_img_url": "http://wx.qlogo.cn/mmhead/Q3auHgzwzM5hSWxxxxxUQPwW9ibxxxx9DlxLTsKWk97oWpDI0rg/96",
    "nick_name": "hope",
    "role_name": "xxxx",
    "zone_area_id": 0,
    "plat_id": 1
  },
  "battle_info": {
    "total_1": 75,
    "total_10": 336,
    "total_game": 745,
    "total_kill": 1669
  },
  "battle_list": [{
    "map_id": 1,
    "room_id": "6575389198189071197",
    "team_id": 57,
    "dt_event_time": 1530953799,
    "rank_in_ds": 3,
    "times_kill": 1,
    "label": "top five",
    "team_type": 1,
    "award_gold": 677,
    "mode": 0
  }],
  "appitem": {
    "AppID": "wx13051697527efc45",
    "IconURL": "https://mmocgame.qpic.cn/wechatgame/mEMdfrX5RU0dZFfNEdCsMJpfsof1HE0TP3cfZiboX0ZPxqh5aZnHjxPFXUGgsXmibe/0",
    "Name": "Desperately Army attack",
    "BriefName": "desperately Army attack",
    "Desc": "the official licensed mobile game desperately",
    "Brief": "gun | 808.2 M",
    "WebURL": "https://game.weixin.qq.com/cgi-bin/h5/static/detail_v2/index.html?wechat_pkgid=detail_v2&appid=wx13051697527efc45&show_bubble=0",
    "DownloadInfo": {
      "DownloadURL": "https://itunes.apple.com/cn/app/id1304987143",
      "DownloadFlag": 5
    },
    "Status": 0,
    "AppInfoFlag": 45,
    "Label": [],
    "AppStorePopUpDialogConfig": {
      "Duration": 1500,
      "Interval": 172800,
      "ServerTimestamp": 1531066098
    },
    "HasEnabledChatGroup": false,
    "AppType": 0,
    "game_tag_list": ["jedi to survive", "the original reduction", "friends open black", "one hundred people against", "large map"],
    "recommend_reason": "the original desperately, wilderness shooting",
    "size_desc": "808.2 M"
  },
  "is_guest": true,
  "is_blocked": false,
  "errcode": 0,
  "errmsg": "ok"
}

  • analysis
  • The openid is the unique identifier of the user.

2. Interface for obtaining the user's battle list

  • request
    API /cgi-bin/gamewap/getpubgmbattlelist
    method GET
    parameters openid, pass_ticket, plat_id, after_time, limit
    cookie keys pass_ticket, uin, pgv_pvid, sd_cookie_crttime, sd_userid
  • response
{
  "errcode": 0,
  "errmsg": "ok",
  "next_after_time": 1528120556,
  "battle_list": [{
    "map_id": 1,
    "room_id": "6575389198111172597",
    "team_id": 57,
    "dt_event_time": 1530953799,
    "rank_in_ds": 3,
    "times_kill": 1,
    "label": "top five",
    "team_type": 1,
    "award_gold": 677,
    "mode": 0
  }, {
    "map_id": 1,
    "room_id": "6575336498940384115",
    "team_id": 11,
    "dt_event_time": 1530941404,
    "rank_in_ds": 5,
    "times_kill": 2,
    "label": "top five",
    "team_type": 1,
    "award_gold": 632,
    "mode": 0
  }],
  "has_next": true
}

  • analysis
  • This interface paginates with after_time. While traversing, use has_next and next_after_time in the response to determine whether there is a next page.
  • The room_id in the list is the unique ID of each battle.
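
The pagination rule above can be sketched as a small generator. This is only an illustration, not the author's code: fetch_page is a hypothetical callable standing in for the real GET request, the default limit of 20 is an assumption, and the canned pages below replace real HTTP responses.

```python
def iter_battles(fetch_page, after_time=0, limit=20):
    """Yield every battle by following has_next / next_after_time.

    fetch_page is a hypothetical callable (after_time, limit) -> dict that
    performs the GET request and returns the decoded JSON response.
    """
    while True:
        page = fetch_page(after_time, limit)
        if page.get("errcode") != 0:
            raise RuntimeError("API error: %s" % page.get("errmsg"))
        for battle in page.get("battle_list", []):
            yield battle
        if not page.get("has_next"):
            break
        # The next request starts where this page ended.
        after_time = page["next_after_time"]


# Canned pages keyed by after_time, used here instead of real HTTP calls.
pages = {
    0: {"errcode": 0, "errmsg": "ok", "has_next": True,
        "next_after_time": 1528120556,
        "battle_list": [{"room_id": "a"}, {"room_id": "b"}]},
    1528120556: {"errcode": 0, "errmsg": "ok", "has_next": False,
                 "battle_list": [{"room_id": "c"}]},
}
room_ids = [b["room_id"] for b in iter_battles(lambda t, n: pages[t])]
print(room_ids)
```

Keeping the HTTP call behind a callable means the paging logic can be checked without touching the network.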

3. Interface for obtaining battle details

  • request
    API /cgi-bin/gamewap/getpubgmbattledetail
    method GET
    parameters openid, pass_ticket, room_id
    cookie keys pass_ticket, uin, pgv_pvid, sd_cookie_crttime, sd_userid
  • response
{
  "errcode": 0,
  "errmsg": "ok",
  "base_info": {
    "nick_name": "pomelo tea",
    "head_img_url": "http://wx.qlogo.cn/mmhead/xxxx/96",
    "dt_event_time": 1528648165,
    "team_type": 4,
    "rank": 1,
    "player_count": 100,
    "role_sex": 1,
    "label": "good luck",
    "openid": "oODfo0s1w5lWjmxxxxxgQkcCljXQ"
  },
  "battle_info": {
    "award_gold": 622,
    "times_kill": 6,
    "times_head_shot": 0,
    "damage": 537,
    "times_assist": 3,
    "survival_duration": 1629,
    "times_save": 0,
    "times_reborn": 0,
    "vehicle_kill": 1,
    "forward_distance": 10140,
    "driving_distance": 5934,
    "dead_poison_circle_no": 6,
    "top_kill_distance": 223,
    "top_kill_distance_weapon_use": 2924130819,
    "be_kill_user": {
      "nick_name": "Asahi",
      "head_img_url": "http://wx.qlogo.cn/mmhead/ibLButGMnqJNFsUtStNEV8tzlH1QpwPiaF9kxxxxx66G3ibjic6Ng2Rcg/96",
      "weapon_use": 20101000001,
      "openid": "oODfo0qrPLExxxxc0QKjFPnPxyI"
    },
    "label": "prosperous"
  },
  "team_info": {
    "user_list": [{
      "nick_name": "ooo",
      "times_kill": 6,
      "assist_count": 3,
      "survival_duration": 1638,
      "award_gold": 632,
      "head_img_url": "http://wx.qlogo.cn/mmhead/Q3auHgzwzM4k4RXdyxavNxxxxUjcX6Tl47MNNV1dZDliazRKRg",
      "openid": "..."
    }, {
      "nick_name": "I eat fried meat",
      "times_kill": 2,
      "assist_count": 2,
      "survival_duration": 1502,
      "award_gold": 583,
      "head_img_url": "http://wx.qlogo.cn/mmhead/sTJptKvBQLKd5SAAjOF0VrwiapUxxxxFffxoDUcrVjYbDf9pNENQ",
      "openid": "oODfo0gIyDxxxxZpUrSrpapZSDT0"
    }]
  },
  "is_guest": true,
  "is_blocked": false
}

  • analysis
  • This interface returns detailed battle information, including kills, headshots, revives, distance traveled, and so on, which is enough for our analysis.
  • The response also contains the openid of each teammate and of the player who killed you, which lets us keep crawling from user to user and reach an unlimited number of players.
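
To make the "spread from user to user" idea concrete, here is a minimal sketch that pulls the teammates' and the killer's openid out of one detail response. The function name related_openids and the sample openid values are made up for illustration; only the field names come from the response shown above.

```python
def related_openids(detail):
    """Collect the openids of teammates and of the killer from one battle detail."""
    found = set()
    for user in detail.get("team_info", {}).get("user_list", []):
        if user.get("openid"):
            found.add(user["openid"])
    # be_kill_user may be missing, for example when the player survived.
    killer = detail.get("battle_info", {}).get("be_kill_user") or {}
    if killer.get("openid"):
        found.add(killer["openid"])
    return found


# Placeholder data with the same shape as the response above.
sample = {
    "battle_info": {"be_kill_user": {"nick_name": "killer", "openid": "openid_k"}},
    "team_info": {"user_list": [
        {"openid": "openid_a"},
        {"openid": "openid_b"},
        {"nick_name": "entry without an openid"},
    ]},
}
new_users = related_openids(sample)
print(sorted(new_users))
```

Each openid found this way can become the entry point of another battle-list request.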

As for values such as pass_ticket in the cookie, they are certainly used for authentication. They did not change across the requests above, so we do not need to work out how they are computed. We only need to extract them from the captured packets and fill them into the code.
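
One convenient way to carry the captured values into code is to paste the raw Cookie header from the capture tool and split it into a dict. The sketch below assumes the usual "name=value; name=value" header layout; parse_cookie_header is a hypothetical helper and the values are placeholders, not real credentials.

```python
def parse_cookie_header(header):
    """Turn 'a=1; b=2' into {'a': '1', 'b': '2'}."""
    cookies = {}
    for part in header.split(";"):
        if "=" not in part:
            continue
        # Split on the first '=' only, because values may contain '=' themselves.
        name, value = part.split("=", 1)
        cookies[name.strip()] = value.strip()
    return cookies


# Placeholder header; paste the real one copied from the captured request.
captured = "pass_ticket=PLACEHOLDER_TICKET; uin=PLACEHOLDER_UIN; pgv_pvid=123"
def_cookies = parse_cookie_header(captured)
print(def_cookies)
```

The resulting dict has the shape that the cookies argument of requests.get expects.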

Step 2 Crawl data

Now that the interfaces have been identified, it's time to grab enough data.

Call the interfaces with Requests

import codecs
import os

import requests
import simplejson

# settings and rediskeys are the project's own modules
url = 'https://game.weixin.qq.com/cgi-bin/gamewap/getpubgmdatacenterindex?openid=%s&plat_id=0&uin=&key=&pass_ticket=%s' % (openid, settings.pass_ticket)
r = requests.get(url=url, cookies=settings.def_cookies, headers=settings.def_headers, timeout=(5.0, 5.0))
tmp = r.json()

# save the response as pretty-printed JSON, one file per user
wfile = os.path.join(settings.res_userInfo_dir, '%s.txt' % (rediskeys.user(openid)))
with codecs.open(wfile, 'w', 'utf-8') as wf:
    wf.write(simplejson.dumps(tmp, indent=2, sort_keys=True, ensure_ascii=False))

The other two interfaces can be written quickly in the same way.

Use Redis to mark information that has been crawled

With the interfaces above, we may find the openid of user B starting from user A, and then find the openid of user A again starting from user B. To avoid collecting the same data repeatedly, we need to record what has already been collected. The core snippet:

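# rediskeys.user_battle_list(openid)
def user_battle_list(openid):
    return 'ubl_%s' % (openid)

# inside a task: skip users whose battle list has already been crawled
if settings.DataRedis.get(rediskeys.user_battle_list(openid)):
    return True

# mark the user as crawled
settings.DataRedis.set(rediskeys.user_battle_list(openid), 1)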
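
A separate get followed by a set leaves a small window in which two concurrent workers can both decide that a user is new. One possible variation, not what the original code does, is to claim the key with a single SET using the NX option, which redis-py exposes as set(name, value, nx=True). In the sketch below, claim is a hypothetical helper and FakeRedis is an in-memory stand-in so the example runs without a Redis server.

```python
def claim(client, key):
    """Return True only for the first caller that claims key."""
    # SET key 1 NX writes only when the key is absent; redis-py returns
    # True on success and None when the key already existed.
    return bool(client.set(key, 1, nx=True))


class FakeRedis(object):
    """In-memory stand-in for a Redis client, for demonstration only."""

    def __init__(self):
        self.data = {}

    def set(self, key, value, nx=False):
        if nx and key in self.data:
            return None
        self.data[key] = value
        return True


client = FakeRedis()
first = claim(client, "ubl_openid_a")   # the key is new, so this worker crawls
second = claim(client, "ubl_openid_a")  # already claimed, so this one skips
print(first, second)
```

With a real client the call stays the same; only the FakeRedis instance is replaced by the project's Redis connection.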

Use Celery to manage queues

Celery is a very useful distributed task queue. This time I only run it on my own computer, so I am not using its distributed features. We create three tasks and three queues:

task_queues = (
    Queue('queue_get_battle_info', exchange=Exchange('priority', type='direct'), routing_key='gbi'),
    Queue('queue_get_battle_list', exchange=Exchange('priority', type='direct'), routing_key='gbl'),
    Queue('queue_get_user_info', exchange=Exchange('priority', type='direct'), routing_key='gui'),
)
 
task_routes = ([
    ('get_battle_info', {'queue': 'queue_get_battle_info'}),
    ('get_battle_list', {'queue': 'queue_get_battle_list'}),
    ('get_user_info', {'queue': 'queue_get_user_info'}),
],)

Then, inside each task, we combine the API requests with the Redis records to implement the complete task logic, for example:

@app.task(name='get_battle_list')
def get_battle_list(openid, plat_id=None, after_time=0, update_time=None):
    # skip users whose battle list has already been crawled
    if settings.DataRedis.get(rediskeys.user_battle_list(openid)):
        return True

    if not plat_id:
        try:
            us = handles.get_user_info_handles(openid)
            plat_id = us['plat_id']
        except Exception as e:
            print 'can not get user plat_id', openid, traceback.format_exc()
            return False

    # fetch the battle list
    battle_list = handles.get_battle_list_handle(openid, plat_id, after_time=0, update_time=None)

    # queue a detail task for every battle that has not been crawled yet
    # (the loop header was lost in the source; room_id iteration is reconstructed)
    for room_id in battle_list:
        if not settings.DataRedis.get(rediskeys.user_battle(openid, room_id)):
            get_battle_info.delay(openid, plat_id, room_id)

    return True

Start crawling

Because the crawler spreads outward from one user to the next, the code needs an entry user, so we manually create a collection task for one user:

from tasks.all import get_battle_list
 
my_openid = 'oODfo0oIErZI2xxx9xPlVyQbRPgY'
my_platid = '0'
 
get_battle_list.delay(my_openid, my_platid, after_time=0, update_time=None)

Once there is an entry point, we start the workers with Celery to begin crawling:

celery -A tasks.all worker -c 5 --queues=queue_get_user_info --loglevel=info -n get_user_info@%h
celery -A tasks.all worker -c 5 --queues=queue_get_battle_list --loglevel=info -n get_battle_list@%h
celery -A tasks.all worker -c 30 --queues=queue_get_battle_info --loglevel=info -n get_battle_info@%h

Now our crawler can run happily. We can also monitor task execution with celery-flower:

celery flower -A tasks.all --broker=redis://:$REDIS_PASS@$REDIS_HOST:$REDIS_PORT/10

With Flower we can see that throughput is quite good. During execution, get_battle_list runs too fast, so get_battle_info tasks pile up even with 30 concurrent workers. The workers therefore need to be stopped in time. We stopped after capturing 200,000 records.

Step 3 Data analysis

Analysis plan

The data for 200,000 battles has been captured, split into JSON files, and stored on my local disk. Next comes some simple analysis. Python is well suited to data analysis and has many libraries for it, such as pandas and NumPy, but I have not studied them, so I wrote a very simple program to do some shallow analysis. If you want to dig deeper and don't want to crawl the data yourself, you can contact me and I will package the data for you.

# coding=utf-8
import os
import json
import datetime
import math
 
from conf import settings
 
 
class UserTeamTypeData:
    def __init__(self, team_type, player_count):
        self.team_type = team_type
        self.player_count = player_count
        self.label = {}
        self.dead_poison_circle_no = {}
        self.count = 0
        self.damage = 0
        self.survival_duration = 0  # survival time
        self.driving_distance = 0
        self.forward_distance = 0
        self.times_assist = 0  # assists
        self.times_head_shot = 0
        self.times_kill = 0
        self.times_reborn = 0  # times revived by a teammate
        self.times_save = 0  # times this player revived a teammate
        self.top_kill_distance = []
        self.top_kill_distance_weapon_use = {}
        self.vehicle_kill = 0  # vehicle kills
        self.award_gold = 0
        self.times_reborn_by_role_sex = {0: 0, 1: 0}  # 0 male, 1 female
        self.times_save_by_role_sex = {0: 0, 1: 0}  # 0 male, 1 female
 
    def update_dead_poison_circle_no(self, dead_poison_circle_no):
        if dead_poison_circle_no in self.dead_poison_circle_no:
            self.dead_poison_circle_no[dead_poison_circle_no] += 1
        else:
            self.dead_poison_circle_no[dead_poison_circle_no] = 1
 
    def update_times_reborn_and_save_by_role_sex(self, role, times_reborn, times_save):
        if role not in self.times_reborn_by_role_sex:
            return
 
        self.times_reborn_by_role_sex[role] += times_reborn
        self.times_save_by_role_sex[role] += times_save
 
    def update_top_kill_distance_weapon_use(self, weaponid):
        if weaponid not in self.top_kill_distance_weapon_use:
            self.top_kill_distance_weapon_use[weaponid] = 1
        else:
            self.top_kill_distance_weapon_use[weaponid] += 1
 
 
class UserBattleData:
 
    def __init__(self, openid):
        self.openid = openid
        self.team_type_res = {}
        self.label = {}
        self.hour_counter = {}
        self.weekday_counter = {}
        self.usetime = 0
        self.day_record = set()
        self.battle_counter = 0
 
    def get_avg_use_time_per_day(self):
        # average minutes played per active day; 0 for users with no battles
        if not self.day_record:
            return 0
        return self.usetime / len(self.day_record)
 
    def update_label(self, lable):
        if lable in self.label:
            self.label[lable] += 1
        else:
            self.label[lable] = 1
 
    def get_team_type_data(self, team_type, player_count):
        player_count = int(math.ceil(float(player_count) / 10))
        team_type_key = '%d_%d' % (team_type, player_count)
 
        if team_type_key not in self.team_type_res:
            userteamtypedata = UserTeamTypeData(team_type, player_count)
            self.team_type_res[team_type_key] = userteamtypedata
        else:
            userteamtypedata = self.team_type_res[team_type_key]
 
        return userteamtypedata
 
    def update_user_time_property(self, dt_event_time):
        dt_event_time = datetime.datetime.fromtimestamp(dt_event_time)
        hour = dt_event_time.hour
        if hour in self.hour_counter:
            self.hour_counter[hour] += 1
        else:
            self.hour_counter[hour] = 1
 
        weekday = dt_event_time.weekday()
        if weekday in self.weekday_counter:
            self.weekday_counter[weekday] += 1
        else:
            self.weekday_counter[weekday] = 1
 
        self.day_record.add(dt_event_time.date())
 
    def update_battle_info_by_room(self, roomid):
        # print '  load ', self.openid, roomid
        file = os.path.join(settings.Res_UserBattleInfo_Dir, self.openid, '%s.txt' % roomid)
 
        with open(file, 'r') as rf:
            battledata = json.load(rf)
 
        self.battle_counter += 1
        base_info = battledata['base_info']
        self.update_user_time_property(base_info['dt_event_time'])
        battle_info = battledata['battle_info']
 
        userteamtypedata = self.get_team_type_data(base_info['team_type'], base_info['player_count'])
        userteamtypedata.count += 1
        userteamtypedata.award_gold += battle_info['award_gold']
        userteamtypedata.damage += battle_info['damage']
        userteamtypedata.update_dead_poison_circle_no(battle_info['dead_poison_circle_no'])
        userteamtypedata.driving_distance += battle_info['driving_distance']
        userteamtypedata.forward_distance += battle_info['forward_distance']
        self.update_label(battle_info['label'])
        userteamtypedata.survival_duration += battle_info['survival_duration']
        self.usetime += battle_info['survival_duration']/60
        userteamtypedata.times_assist += battle_info['times_assist']
        userteamtypedata.times_head_shot += battle_info['times_head_shot']
        userteamtypedata.times_kill += battle_info['times_kill']
        userteamtypedata.times_reborn += battle_info['times_reborn']
        userteamtypedata.times_save += battle_info['times_save']
        userteamtypedata.top_kill_distance.append(battle_info['top_kill_distance'])
        userteamtypedata.update_times_reborn_and_save_by_role_sex(base_info['role_sex'], battle_info['times_reborn'],
                                                                  battle_info['times_save'])
 
    def get_user_battleinfo_rooms(self):
        user_dir = os.path.join(settings.Res_UserBattleInfo_Dir, self.openid)
        r = [room for room in os.listdir(user_dir)]
        r = [rr.replace('.txt', '') for rr in r]
        return r
 
class AllUserCounter:
 
    def __init__(self):
        self.hour_counter = {0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0, 9: 0, 10: 0, 11: 0, 12: 0, 13: 0, 14: 0, 15: 0, 16: 0, 17: 0, 18: 0, 19: 0, 20: 0, 21: 0, 22: 0, 23: 0}
 
        self.weekday_counter = {0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0}
        self.times_reborn_by_role_sex = {0: 0, 1: 0}  # 0 male, 1 female
        self.times_save_by_role_sex = {0: 0, 1: 0}  # 0 male, 1 female
        self.user_count = 0
        self.battle_count = 0
        self.every_user_use_time_per_day = []
        self.top_kill_distance = 0
 
    def avg_use_time(self):
        return sum(self.every_user_use_time_per_day) / len(self.every_user_use_time_per_day)
 
    def add_user_data(self, userbattledata):
        self.every_user_use_time_per_day.append(userbattledata.get_avg_use_time_per_day())
        self.battle_count += userbattledata.battle_counter
        self.user_count += 1
 
        for k in userbattledata.hour_counter:
            if k in self.hour_counter:
                self.hour_counter[k] += userbattledata.hour_counter[k]
            else:
                self.hour_counter[k] = userbattledata.hour_counter[k]
 
        for weekday in userbattledata.weekday_counter:
            if weekday in self.weekday_counter:
                self.weekday_counter[weekday] += userbattledata.weekday_counter[weekday]
            else:
                self.weekday_counter[weekday] = userbattledata.weekday_counter[weekday]
 
        for userteamtype in userbattledata.team_type_res:
            userteamtypedata = userbattledata.team_type_res[userteamtype]
            for k in userteamtypedata.times_reborn_by_role_sex:
                self.times_reborn_by_role_sex[k] += userteamtypedata.times_reborn_by_role_sex[k]
 
            for k in userteamtypedata.times_save_by_role_sex:
                self.times_save_by_role_sex[k] += userteamtypedata.times_save_by_role_sex[k]
 
            if userteamtypedata.top_kill_distance:
                # top_kill_distance is a per-battle list here; keep the overall maximum
                self.top_kill_distance = max(self.top_kill_distance, max(userteamtypedata.top_kill_distance))

    def __str__(self):
        res = []
        res.append('Total users\t%d' % self.user_count)
        res.append('Total battles\t%d' % self.battle_count)
        res.append('Average minutes per day\t%d' % self.avg_use_time())
        res.append('Longest kill\t%d' % self.top_kill_distance)
        res.append('Male characters\trevived %d times\trevived others %d times' % (self.times_reborn_by_role_sex[0], self.times_save_by_role_sex[0]))
        res.append('Female characters\trevived %d times\trevived others %d times' % (self.times_reborn_by_role_sex[1], self.times_save_by_role_sex[1]))

        res.append('Hour distribution')
        for hour in range(0, 24):
            # res.append('\t%d: %d' % (hour, self.hour_counter[hour]))
            res.append('\t%d: %d %.2f%%' % (hour, self.hour_counter[hour], self.hour_counter[hour]/float(self.battle_count)*100))
        res.append('Weekday distribution')
        # res.append(self.weekday_counter.__str__())
        for weekday in range(0, 7):
            res.append('\t%d: %d %.2f%%' % (weekday+1, self.weekday_counter[weekday], (self.weekday_counter[weekday]/float(self.battle_count)*100)))
 
        return '\n'.join(res)
 
 
def get_user_battleinfo_rooms(openid):
    user_dir = os.path.join(settings.Res_UserBattleInfo_Dir, openid)
 
    # files = os.listdir(user_dir)
    r = [room for room in os.listdir(user_dir)]
    r = [rr.replace('.txt', '') for rr in r]
    return r
 
 
if __name__ == '__main__':
    alluserconter = AllUserCounter()
    
    folders = os.listdir(settings.Res_UserBattleInfo_Dir)
    i = 0
    for folder in folders:
        i+=1
        print i, '/' , len(folders), folder
        userbattledata = UserBattleData(folder)
        for room in userbattledata.get_user_battleinfo_rooms():
            userbattledata.update_battle_info_by_room(room)
        alluserconter.add_user_data(userbattledata)
 
    print "\n" * 3
    print "---------------------------------------"
 
    print alluserconter

Analysis results

1. Users play for about 2 hours a day on average

According to the distribution chart, most users play for more than 1 hour a day, and the most dedicated play for more than 8 hours.

Note: I only counted the duration of each match, so the actual time spent online is longer than my figure.
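
As a worked example of this metric, the sketch below sums survival_duration per calendar day and averages over the days on which the user actually played. It is written for Python 3 and is not the script above: the flat record layout, the function name, and the choice of China Standard Time for day boundaries are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

CST = timezone(timedelta(hours=8))  # assumption: days are split in UTC+8


def avg_minutes_per_active_day(battles):
    """Average minutes in matches per day, counting only days with a match."""
    per_day = defaultdict(float)
    for b in battles:
        day = datetime.fromtimestamp(b["dt_event_time"], tz=CST).date()
        per_day[day] += b["survival_duration"] / 60.0
    if not per_day:
        return 0.0
    return sum(per_day.values()) / len(per_day)


# Two matches on one day and one match on another (timestamps from the samples above).
battles = [
    {"dt_event_time": 1530953799, "survival_duration": 1629},
    {"dt_event_time": 1530941404, "survival_duration": 1502},
    {"dt_event_time": 1528648165, "survival_duration": 1800},
]
average = avg_minutes_per_active_day(battles)
print(round(average, 2))
```

Here the first day contributes (1629 + 1502) / 60 minutes and the second day 30 minutes, and the result is the mean of those two daily totals.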

2. Female characters are revived more often than male characters

Now I know why so many men play female characters: it turns out there are perks to be had in the game.

3. Female characters also revive teammates more often than male characters

That gives everyone a good reason to bring a girl along when climbing the ranks.

4. People are busiest on Fridays

My guess is that on Fridays everyone is busy handing in assignments and writing weekly reports.

5. Gaming peaks at 10 p.m.

So many people are still playing in the small hours. Don’t you sleep?
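
The weekday and hour distributions behind findings 4 and 5 come down to bucketing each battle's dt_event_time. A minimal Python 3 sketch follows; the function name is made up, and the explicit UTC+8 timezone is an assumption (the script above relies on the local time of the machine it runs on).

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

CST = timezone(timedelta(hours=8))  # assumption: interpret timestamps in UTC+8


def hour_and_weekday_counts(timestamps, tz=CST):
    """Count battles per hour of day and per weekday (Monday is 0)."""
    hours, weekdays = Counter(), Counter()
    for ts in timestamps:
        t = datetime.fromtimestamp(ts, tz=tz)
        hours[t.hour] += 1
        weekdays[t.weekday()] += 1
    return hours, weekdays


# The three dt_event_time values that appear in the sample responses above.
hours, weekdays = hour_and_weekday_counts([1530953799, 1530941404, 1528648165])
print(sorted(hours.items()))
print(sorted(weekdays.items()))
```

Fixing the timezone explicitly keeps the buckets the same no matter where the analysis runs.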

6. The longest kill distance is 639 meters

I looked up the ranges of the 98K, SKS, and AWP, and they are all roughly within 800 meters, so this value is plausible. It also suggests that those extremely long-range kills on TikTok were staged.

7. “Heal the wounded and save the dying” is the highest honor

As the distribution shows, earning “heal the wounded and save the dying” is harder than getting ten kills.

Most of the characters awarded the lifesaving title were female, which proves once again that you should bring a girl to the game. Coming back to the essence of the game: it is a survival game, and nothing matters more than staying alive.

Closing

This crawler was able to extract so much data mainly because the WeChat game channel lets you view strangers’ records. The same method can be used to analyze data from Honor of Kings and other games, so give it a try if you are interested. Finally, the UMP9 is a great gun, and it feels very good with a 2x scope.

About the author: adenocarcinoma
