The main idea
Objective:
Given a city name as input, crawl the data for all businesses listed in that city's Meituan food section. The data include:
Store name, rating, number of reviews, average price, and address.
The data are then saved to Excel.
Finally, we run a simple analysis on the crawled data.
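The step of saving the crawled fields to Excel can be sketched roughly as follows. This is a minimal sketch, not the project's actual code: the dictionary keys (name, rating, reviews, avg_price, address) and the function names are assumptions made for illustration, and only the openpyxl calls Workbook, active, append and save are relied on.

```python
# Hypothetical key names; the real crawler may store its fields differently.
HEADER = ["Store name", "Rating", "Number of reviews", "Average price", "Address"]
FIELDS = ["name", "rating", "reviews", "avg_price", "address"]


def to_rows(shops):
    """Turn a list of shop dicts into a header row followed by data rows."""
    rows = [HEADER]
    for shop in shops:
        # Missing fields become empty cells instead of raising KeyError.
        rows.append([shop.get(key, "") for key in FIELDS])
    return rows


def save_to_excel(shops, path):
    """Write the rows to an .xlsx file with openpyxl."""
    # Imported here so that to_rows can be used without openpyxl installed.
    from openpyxl import Workbook

    wb = Workbook()
    ws = wb.active
    ws.title = "shops"
    for row in to_rows(shops):
        ws.append(row)
    wb.save(path)
```

Calling save_to_excel(shops, "shanghai.xlsx") would then produce one worksheet with one row per store; the file name here is only an example.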
Getting past anti-crawler measures:
After crawling each page of data, pause for a random period of time before crawling the next page;
Each page uses a different cookie value.
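The two measures above can be sketched as follows. This is an illustrative sketch under stated assumptions: the cookie strings are placeholders, the delay range of 1 to 3 seconds is arbitrary, and fetch_page is a hypothetical callable standing in for whatever actually requests a page.

```python
import random
import time

# Placeholder cookie strings; real values would come from browser sessions.
COOKIE_POOL = ["cookie-a", "cookie-b", "cookie-c"]


def build_headers(page, cookies=COOKIE_POOL):
    """Rotate through the cookie pool so consecutive pages use different cookies."""
    return {
        "User-Agent": "Mozilla/5.0",
        "Cookie": cookies[page % len(cookies)],
    }


def random_pause(min_s=1.0, max_s=3.0, sleep=time.sleep):
    """Sleep for a random duration between min_s and max_s seconds and return it."""
    delay = random.uniform(min_s, max_s)
    sleep(delay)
    return delay


def crawl_pages(fetch_page, num_pages, sleep=time.sleep):
    """Fetch pages one by one, pausing a random time between requests."""
    results = []
    for page in range(num_pages):
        results.append(fetch_page(page, build_headers(page)))
        if page < num_pages - 1:  # no need to wait after the last page
            random_pause(sleep=sleep)
    return results
```

The sleep parameter exists only so the loop can be exercised without real waiting; in normal use the default time.sleep applies.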
How it works:
Open Chrome's developer tools and take a look at the XHR requests…
There is an interface that can be called directly.
For the detailed implementation, see the source code, which is available via the author's profile.
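Once such an interface is found, its JSON response has to be reduced to the fields of interest. The sketch below shows one way that could look. The payload layout and every key name in it are invented for illustration, since the real interface is not described here; the actual structure has to be read off the XHR response in the browser.

```python
import json


def parse_shops(response_text):
    """Extract the fields of interest from a JSON response body.

    The layout {"data": {"shops": [...]}} and the key names are hypothetical.
    """
    payload = json.loads(response_text)
    shops = payload.get("data", {}).get("shops", [])
    return [
        {
            "name": shop.get("name"),
            "rating": shop.get("rating"),
            "reviews": shop.get("reviews"),
            "avg_price": shop.get("avg_price"),
            "address": shop.get("address"),
        }
        for shop in shops
    ]


# Made-up sample payload, used only to show the shape of the result.
sample = (
    '{"data": {"shops": [{"name": "Demo Noodle House", "rating": 4.6, '
    '"reviews": 120, "avg_price": 35, "address": "1 Example Road", "extra": 1}]}}'
)
```

With the requests module, the response body would be available as response.text and could be passed to parse_shops directly.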
Development tools
Python version: 3.5.4
Related modules:
requests module;
win_unicode_console module;
openpyxl module;
And some modules that come with Python.
Environment setup
Install Python and add it to the environment variables, then use pip to install the required modules.
Usage
Run the mt_cate_spider.py file in a CMD window.
Simple analysis
I actually added this part at the last minute while writing this post. The reason is simple: I want to emphasize how important it is to pair crawling with data analysis.
Here we use Excel's data analysis features to take a quick look at the data from the Shanghai area.
First, of course, sort by rating and make a bar chart:
Then sort by number of reviews and make a bar chart:
Then make some other interesting charts:
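The sorting steps above (by rating, then by number of reviews) are done in Excel in this post, but they can also be reproduced in Python before charting. The sketch below is an alternative, not what the post itself does; the key names and the sample records are made up for illustration.

```python
def top_by(shops, key, n=10):
    """Return the n shops with the largest value for `key`, highest first."""
    # Skip records whose value is missing or not numeric.
    valid = [s for s in shops if isinstance(s.get(key), (int, float))]
    return sorted(valid, key=lambda s: s[key], reverse=True)[:n]


# Made-up records, only to show how the function is called.
shops = [
    {"name": "A", "rating": 4.2, "reviews": 300},
    {"name": "B", "rating": 4.8, "reviews": 50},
    {"name": "C", "rating": None, "reviews": 900},
    {"name": "D", "rating": 3.9, "reviews": 120},
]

best_rated = top_by(shops, "rating", n=2)
most_reviewed = top_by(shops, "reviews", n=2)
```

The resulting lists can be written back to a worksheet and charted in Excel as before.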
OK, that's all!