Origins

It has been about a month since I wrapped up my last side project, Respage01, a month spent doubting the value of this kind of spare-time research and struggling to find time for it at all. But doubt is doubt; if anything, it pushes me to try to understand the world a little better. The goal of Respage02 is to study the flow of people within a small area. This is information I have always wanted, and the overall "situation" is the most we can realistically capture. With travel modes becoming ever more diverse, I think fully describing people flow is impossible unless the government installs monitoring at every intersection. What we can do is reflect this situation partially, through one or a few modes of travel.

Ideas

I had several ideas for getting at least a partial view of people flow:

  • Collect the distribution data of shared bikes to reflect the movement of people in a small area.
  • Collect real-time distribution data of ride-hailing cars to reflect people flow and traffic conditions across the city.
  • Point a high-definition camera at a street from a fixed angle and extract movement data from the video to measure flow at a single point.
  • Proactively collect mobile device data with a WiFi sniffer.

Ideas only matter if you have the time to study them. I simply chose the most convenient data source, shared bikes: bikes move slowly, so there is no need to chase crawling speed, and useful results are within reach.

The data interface

There are already plenty of articles on collecting shared-bike (Mobike) data. The main idea is to study the API: a simple packet capture of the Mobike WeChat mini program reveals the pattern. The interface below returns information about bikes near a given position. As with the earlier POI data collected from the Baidu Map API, I scan with overlapping patches to get data as complete as possible, because by observation the nearby-bikes interface caps the number of bikes it returns:

URL = "https://mwx.mobike.com/nearby/nearbyBikeInfo?biketype=0" + \
      "&latitude=" + lat + \
      "&longitude=" + lng + \
      "&userid=" + userId + \
      "&citycode=0579"
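As a sketch of how this endpoint might be called and its response consumed, assuming the body is JSON with an `object` list carrying `distX`/`distY` coordinates (field names commonly reported for this API, not verified here — `build_nearby_url` and `parse_bikes` are my own illustrative helpers):

```python
import json
from urllib.parse import urlencode

BASE = "https://mwx.mobike.com/nearby/nearbyBikeInfo"

def build_nearby_url(lat, lng, user_id, citycode="0579", biketype="0"):
    # Assemble the same query string as above, with proper URL encoding.
    params = {"biketype": biketype, "latitude": lat, "longitude": lng,
              "userid": user_id, "citycode": citycode}
    return BASE + "?" + urlencode(params)

def parse_bikes(body):
    # Extract (lng, lat) pairs from a response body.
    # The 'object', 'distX' and 'distY' field names are assumptions.
    data = json.loads(body)
    return [(b["distX"], b["distY"]) for b in data.get("object", [])]

url = build_nearby_url("29.05", "119.64", "demo-user")
sample = '{"object": [{"distId": "a1", "distX": 119.641, "distY": 29.051}]}'
bikes = parse_bikes(sample)
```

The actual GET request (with a mobile User-Agent and the userid captured from the mini program) would then fetch `url` and feed the body to `parse_bikes`.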

Acquisition window

I scan with the same rectangular windows used for the POI data. To speed things up a little, I simply added a second process so two areas are collected at the same time, and to reduce the risk of being blocked I prepared two userids, one per scan area. (For faster and more complete collection you could go further with multiprocessing or coroutines plus proxies, but I don't need that yet.)

# Two rectangular scan areas in Jiangnan
BigRect1 = {
    'left': {
        'x': 119.634998, 'y': 29.046372
    },
    'right': {
        'x': 119.6727628, 'y': 29.077628
    }
}

BigRect2 = {
    'left': {
        'x': 119.628268, 'y': 29.072232
    },
    'right': {
        'x': 119.67208, 'y': 29.098397
    }
}
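The scan itself relies on a `getSmallRect` helper that is not shown in this post. A minimal sketch of what it could look like, assuming a `WindowSize` dict with `xNum`/`yNum` grid counts and a row-by-row walk over the grid (both assumptions, not the original code):

```python
# Assumed grid dimensions: the big rectangle is split into xNum * yNum windows.
WindowSize = {'xNum': 10, 'yNum': 10}

BigRect1 = {
    'left':  {'x': 119.634998, 'y': 29.046372},
    'right': {'x': 119.6727628, 'y': 29.077628},
}

def getSmallRect(bigrect, window, index):
    # Hypothetical sketch: return the centre (lng, lat) of the index-th
    # window, walking the grid row by row from the bottom-left corner.
    stepX = (bigrect['right']['x'] - bigrect['left']['x']) / window['xNum']
    stepY = (bigrect['right']['y'] - bigrect['left']['y']) / window['yNum']
    col = index % window['xNum']
    row = index // window['xNum']
    lng = bigrect['left']['x'] + (col + 0.5) * stepX
    lat = bigrect['left']['y'] + (row + 0.5) * stepY
    return lng, lat

first = getSmallRect(BigRect1, WindowSize, 0)    # bottom-left window
last = getSmallRect(BigRect1, WindowSize, 99)    # top-right window
```

Each window centre then becomes one query point for the nearby-bikes interface, so overlapping windows compensate for the per-request result cap mentioned above.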

Complete collection code

The crawler implements a worker method and starts two collection processes (note that each process gets its own log file, to avoid concurrent writes):


import multiprocessing
import time

def worker(bigrect, userId, FileKey):
    today = time.strftime("%Y_%m_%d_%H")
    for count in range(0, 10):
        logfile = open("./log/" + FileKey + "-" + str(count) + '_' + today + ".log", 'a+', encoding='utf-8')
        file = open("./result/" + FileKey + "-" + str(count) + '_' + today + ".txt", 'a+', encoding='utf-8')
        for index in range(int(WindowSize['xNum'] * WindowSize['yNum'])):
            lng, lat = getSmallRect(bigrect, WindowSize, index)
            requestMBikeApi(lat=lat, lng=lng, index=index, file=file, logfile=logfile, userId=userId)
        time.sleep(1200)

def main():
    userIds = tool.getMBikeUserID()
    p1 = multiprocessing.Process(target=worker, name='p1', args=(BigRect1, userIds[0], 'shareBike01'))
    p2 = multiprocessing.Process(target=worker, name='p2', args=(BigRect2, userIds[1], 'shareBike02'))
    p1.start()
    p2.start()
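`requestMBikeApi` is also not shown. A hypothetical sketch of what it does, building the URL, fetching, and appending one line per bike, could look like this; the `fetch` parameter is my own addition so the parsing can be exercised offline, and the response field names are assumptions:

```python
import io
import json
import time

def requestMBikeApi(lat, lng, index, file, logfile, userId, fetch=None):
    # Hypothetical sketch, not the original implementation. In production,
    # fetch would be an HTTP GET, e.g. requests.get(url, timeout=10).text.
    url = ("https://mwx.mobike.com/nearby/nearbyBikeInfo?biketype=0"
           "&latitude=" + lat + "&longitude=" + lng +
           "&userid=" + userId + "&citycode=0579")
    try:
        body = fetch(url)
        # 'object'/'distX'/'distY' field names are assumptions.
        bikes = json.loads(body).get("object", [])
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        for b in bikes:
            file.write("%s,%d,%s,%s\n" % (stamp, index, b["distX"], b["distY"]))
        logfile.write("%s window %d: %d bikes\n" % (stamp, index, len(bikes)))
    except Exception as e:
        # Log and move on: one failed window should not kill the scan.
        logfile.write("window %d failed: %s\n" % (index, e))

# Offline exercise with a canned response instead of a real request:
out, log = io.StringIO(), io.StringIO()
sample = '{"object": [{"distX": 119.641, "distY": 29.051}]}'
requestMBikeApi("29.05", "119.64", 3, out, log, "demo", fetch=lambda u: sample)
```

Writing one timestamped line per bike keeps the result files trivially parseable later, which matters once the storage and display steps come into play.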

End of part one

The full collection + storage + display pipeline will be implemented in the next post.