Preface
In this post we use Python to crawl data on A-share companies and run a simple analysis of it. Let's get started.
Development tools
Python version: 3.6.4
Related modules:
requests module;
bs4 module;
lxml module;
pyecharts module;
wordcloud module;
jieba module;
and some modules that come with Python.
Environment setup
Install Python and add it to the environment variables, then use pip to install the required modules.
Data crawling
Target Website:
http://www.askci.com/reports/
The data to be crawled is as follows:
No deep thought is needed here: BeautifulSoup can extract this data directly.
For the complete source code, see the spider.py file in the related files, available from the author's home page or by private message.
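Since the original listing is not reproduced here, the following is only a minimal sketch of the kind of extraction BeautifulSoup makes easy. It assumes the data sits in an ordinary HTML table whose first row holds the column names; the function name `parse_company_rows`, the sample markup, and the column names are all hypothetical and are not taken from spider.py or from the real page.

```python
from bs4 import BeautifulSoup


def parse_company_rows(html):
    """Return one dict per table row, keyed by the table's header cells."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")
    if table is None:
        return []
    rows = table.find_all("tr")
    if not rows:
        return []
    # Assumption: the first row contains the column names.
    headers = [cell.get_text(strip=True) for cell in rows[0].find_all(["th", "td"])]
    records = []
    for row in rows[1:]:
        values = [cell.get_text(strip=True) for cell in row.find_all("td")]
        # Skip malformed rows whose cell count does not match the header.
        if len(values) == len(headers):
            records.append(dict(zip(headers, values)))
    return records


# Hypothetical markup standing in for a downloaded page.
SAMPLE_HTML = """
<table>
  <tr><th>code</th><th>name</th><th>province</th></tr>
  <tr><td>000001</td><td>Example Bank</td><td>Guangdong</td></tr>
  <tr><td>000002</td><td>Example Estate</td><td>Guangdong</td></tr>
</table>
"""

records = parse_company_rows(SAMPLE_HTML)
print(records[0]["name"])
```

In a real run the HTML would come from something like `requests.get(url).text`, and the tag lookups would need adjusting to whatever structure the target page actually uses.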
A screenshot of the script running:
All done~
Data analysis
In the data crawling part, we collected data on a total of 3573 A-share companies. Let's run a quick round of visual analysis on it.
First, let's take a look at the regional distribution of A-share companies:
The provinces with more than 300 A-share companies each are:
- Guangdong
- Beijing
- Zhejiang
- Jiangsu
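A regional breakdown like this comes down to counting records per province and keeping the ones above a threshold. The sketch below shows one way to do that with the standard library; the record layout, the `province` key, and the helper name `provinces_over` are assumptions made for illustration, and the sample rows are made up.

```python
from collections import Counter


def provinces_over(records, threshold, key="province"):
    """Count companies per province and keep provinces above the threshold."""
    counts = Counter(r[key] for r in records if r.get(key))
    return {province: n for province, n in counts.items() if n > threshold}


# Made-up sample rows; the real data set has thousands of records.
sample = [
    {"province": "Guangdong"},
    {"province": "Guangdong"},
    {"province": "Guangdong"},
    {"province": "Beijing"},
    {"province": "Beijing"},
    {"province": "Hainan"},
]

big_provinces = provinces_over(sample, threshold=1)
print(big_provinces)
```

With the full data set, the threshold would be 300, and the per-province counts could be handed to a charting library such as pyecharts.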
Let's take a look at the revenue of A-share companies:
The top 10 companies by main business revenue are:
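A top-10 ranking of this kind can be computed without sorting the whole data set. The following sketch uses `heapq.nlargest`; it assumes each record already carries a numeric `revenue` value (scraped text would first need converting), and the company names and figures are invented placeholders.

```python
import heapq


def top_by_revenue(records, n=10, key="revenue"):
    """Return the n records with the largest value under `key`, largest first."""
    return heapq.nlargest(n, records, key=lambda r: r[key])


# Invented placeholder figures, not real company data.
sample = [
    {"name": "Company A", "revenue": 120.0},
    {"name": "Company B", "revenue": 950.5},
    {"name": "Company C", "revenue": 430.2},
    {"name": "Company D", "revenue": 75.0},
]

top2 = top_by_revenue(sample, n=2)
print([r["name"] for r in top2])
```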
Next, the number of employees at A-share companies:
Then the distribution of listing dates for A-share companies:
The fewest companies went public in 2013 (2), and the most went public in 2017 (438).
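Finding the busiest and quietest listing years is a matter of grouping listing dates by year. This is a small sketch, assuming the dates are stored as strings that begin with a four-digit year (for example `YYYY-MM-DD`); the function name `listings_per_year` and the sample dates are made up for illustration.

```python
from collections import Counter


def listings_per_year(dates):
    """Count listings per year from date strings that start with the year."""
    return Counter(int(d[:4]) for d in dates if d[:4].isdigit())


# Made-up listing dates for illustration only.
sample_dates = [
    "2017-03-01", "2017-05-12", "2017-11-11",
    "2015-06-30", "2015-07-01",
    "2013-01-10",
]

counts = listings_per_year(sample_dates)
busiest = max(counts, key=counts.get)    # year with the most listings
quietest = min(counts, key=counts.get)   # year with the fewest listings
print(busiest, counts[busiest], quietest, counts[quietest])
```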
OK, let's take a look at the top 10 industry types among A-share companies:
Emmmm, very real.
Finally, we turn the main business descriptions of the A-share companies into a word cloud.
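Behind a word cloud is a table of word frequencies. The sketch below builds one from a list of business descriptions; the function name `word_frequencies`, the `tokenize` and `min_length` parameters, and the English sample text are assumptions for illustration and are not taken from analysis.py.

```python
from collections import Counter


def word_frequencies(texts, tokenize=str.split, min_length=2):
    """Count tokens across all texts, ignoring tokens shorter than min_length."""
    counts = Counter()
    for text in texts:
        counts.update(tok for tok in tokenize(text) if len(tok) >= min_length)
    return counts


# Made-up descriptions; the real ones are Chinese and need word segmentation.
sample_texts = [
    "real estate development",
    "real estate leasing",
    "commercial banking",
]

freqs = word_frequencies(sample_texts)
print(freqs.most_common(2))
```

For Chinese text, a segmenter such as `jieba.lcut` could be passed as `tokenize`, and the resulting frequencies could then be given to wordcloud's `WordCloud.generate_from_frequencies`, together with a font that supports Chinese characters.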
I share Python data crawler cases every day. The next article will cover a simple Python analysis of college entrance examination data.
All done! All of the source code referred to in this section is in the analysis.py file in the related files, available from the author's profile or by private message.