Recently I have been reading the book Python for Data Analysis and taking notes as I go.

The work environment

The book recommends EDM and IPython as the data analysis environment. I have only just started using this setup, and I find it much more interactive than the traditional command-line approach.

Usage

    # edm shell
    (edm) bash-3.2$ ipython
    Python 2.7.13 |Enthought, Inc. (x86_64)| (default, Mar  2 2017, 08:20:50)
    Type "copyright", "credits" or "license" for more information.

    IPython 5.3.0 -- An enhanced Interactive Python.

An example of demographic data

I downloaded the US birth data (the yearly yobYYYY.txt name files) from GitHub and followed the code in the book up to the pivot_table step. After checking help, I found that pivot_table had changed in my installed pandas version, and the code ran once I modified the call.
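Before the full session, here is a minimal sketch of the pivot_table call with the keyword names used in the session (index= and columns=). The sample rows are made up for illustration and are not taken from the real data set. Older pandas releases are understood to have spelled these arguments rows= and cols=, which may be the change that help revealed; that is an assumption, not something stated in the book notes.

```python
import pandas as pd

# Made-up sample in the same shape as the combined names table
# (name, sex, births, year); these are NOT real figures.
sample = pd.DataFrame({
    "name": ["Mary", "John", "Mary", "John", "Anna"],
    "sex": ["F", "M", "F", "M", "F"],
    "births": [10, 8, 12, 9, 5],
    "year": [1880, 1880, 1881, 1881, 1881],
})

# index= chooses the row labels and columns= the column labels;
# aggfunc="sum" adds up the births in each (year, sex) cell.
total_births = sample.pivot_table(
    "births", index="year", columns="sex", aggfunc="sum"
)
print(total_births)
```

Each row of the result is one year and each column one sex, which is the same layout the session produces before plotting.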

    In [7]: import pandas as pd

    In [8]: names1880 = pd.read_csv('yob1880.txt',names=['name','sex','births'])

    In [9]: names1880
    Out[9]:
               name sex  births
    0          Mary   F    7065
    1          Anna   F    2604
    2          Emma   F    2003
    3     Elizabeth   F    1939
    4        Minnie   F    1746
    5      Margaret   F    1578
    ...
    1998       York   M       5
    1999  Zachariah   M       5

    [2000 rows x 3 columns]

    In [10]: names1880.groupby('sex').births.sum()
    Out[10]: (total births per sex; the values are garbled in the source)

    In [12]: years = range(1880, 2011)

    In [13]: pieces=[]

    In [14]: columns=['name','sex','births']

    In [15]: for year in years:
        ...:     path='yob%d.txt' % year
        ...:     frame=pd.read_csv(path,names=columns)
        ...:     frame['year']=year
        ...:     pieces.append(frame)
        ...:

    In [16]: names=pd.concat(pieces,ignore_index=True)

    In [17]: names
    Out[17]:
                  name sex  births  year
    0             Mary   F    7065  1880
    1             Anna   F    2604  1880
    2             Emma   F    2003  1880
    3        Elizabeth   F    1939  1880
    4           Minnie   F    1746  1880
    ...
    1690781  Zyquarius   M       5  2010
    1690782      Zyran   M       5  2010
    1690783      Zzyzx   M       5  2010

    [1690784 rows x 4 columns]

    In [25]: total_birth=names.pivot_table('births',index='year',columns='sex',aggfunc=sum)

    In [26]: total_birth.tail()
    Out[26]:
    sex         F        M
    year
    2006  1896468  2050234
    2007  1916888  2069242
    2008  1883645  2032310
    2009  1827643  1973359
    2010  1759010  1898382

    In [27]: total_birth.plot(title="Total births by sex and year")
    Out[27]: <matplotlib.axes._subplots.AxesSubplot at 0x11864af50>

    In [31]: import matplotlib.pyplot as plt

    In [32]: plt.show()
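The read-and-concatenate loop from the session can also be written as a small script. The sketch below is one possible arrangement, not the book's code: the helper name load_names is hypothetical, and the in-memory "files" hold made-up rows so the example runs without the real yobYYYY.txt downloads.

```python
import io

import pandas as pd

COLUMNS = ["name", "sex", "births"]


def load_names(sources):
    """Combine per-year name tables into one DataFrame.

    `sources` maps a year to a path or file-like object laid out like
    the yobYYYY.txt files: name,sex,births with no header row.
    (Hypothetical helper, written for this illustration.)
    """
    pieces = []
    for year, source in sorted(sources.items()):
        frame = pd.read_csv(source, names=COLUMNS)
        frame["year"] = year  # tag every row with the year of its file
        pieces.append(frame)
    # ignore_index=True discards the per-file row numbers
    return pd.concat(pieces, ignore_index=True)


# Made-up stand-ins for two yearly files; NOT real figures.
fake_files = {
    1880: io.StringIO("Mary,F,10\nJohn,M,8\n"),
    1881: io.StringIO("Mary,F,12\nAnna,F,5\nJohn,M,9\n"),
}

names = load_names(fake_files)
births_by_sex = names.groupby("sex")["births"].sum()
print(names)
print(births_by_sex)
```

With the real downloads in the working directory, the mapping could instead be built from the file name pattern used in the session, for example {year: 'yob%d.txt' % year for year in range(1880, 2011)}.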

This article is the author's original work. If you found it helpful, please feel free to leave a tip; your support will encourage me to keep writing.

1. EDM
2. PyData
3. Matplotlib