Today I’m going to show you how to generate interactive charts from a pandas DataFrame with one line of code. We start by introducing the Cufflinks module. Just as Seaborn wraps matplotlib, Cufflinks wraps Plotly, giving it a uniform interface with easy-to-configure parameters that is well suited to plotting DataFrames. We will cover the following chart types:
- Line chart
- Area chart
- Scatter plot
- Bar chart
- Histogram
- Box plot
- Heat map
- 3D scatter chart / 3D bubble chart
- Spread chart
- Pie chart
- Candlestick (K-line) chart
- Combining multiple subplots
Module installation
Installation is a single pip command
pip install cufflinks
Import the module and view its configuration
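Let’s import the module, together with pandas and NumPy, which the examples below use, and check which version is installed
import numpy as np
import pandas as pd
import cufflinks as cf
cf.__version__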
output
'0.17.3'
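If the import fails, it helps to check what is actually installed before going further. The following is a minimal sketch using only the standard library; the helper name `installed_version` is made up for this illustration and is not part of Cufflinks.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Cufflinks sits on top of Plotly and pandas, so report all three.
report = {name: installed_version(name) for name in ("cufflinks", "plotly", "pandas")}
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```

In a Jupyter notebook, calling cf.go_offline() right after the import is the usual way to have iplot() render charts locally.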
The module is currently at version 0.17.3, which is also the latest release. So which charts can this version draw?
cf.help()
output
Use 'cufflinks.help(figure)' to see the list of available parameters for the given figure.
Use 'DataFrame.iplot(kind=figure)' to plot the respective figure
Figures:
bar
box
bubble
bubble3d
candle
choroplet
distplot
.......
From the output above we can see that the general syntax for plotting a chart is df.iplot(kind=chart name). To see the parameters available for a particular chart, for example bar, pass its name to cf.help()
cf.help('bar')
Bar chart
Let’s take a look at bar charts, starting by creating a data set to plot
df2 = pd.DataFrame({'Category': ['A', 'B', 'C', 'D'], 'Values': [95, 56, 70, 85]})
df2
output
Category Values
0 A 95
1 B 56
2 C 70
3 D 85
And then we’ll plot the bar chart
df2.iplot(kind='bar', x='Category', y='Values', xTitle='Category', yTitle='Values', title='Bar chart')
output
Here the x parameter is the name of the variable on the x axis, and the y parameter is the name of the corresponding variable on the y axis. We can export the chart as a PNG and download it, and we can also zoom in on the chart.
Let’s take a look at the following data set
df = pd.DataFrame(np.random.randn(100, 4), columns='A B C D'.split())
df.head()
output
          A         B         C         D
0       ...       ...       ...       ...
1  1.167609  1.528045 -0.498168 -0.221060
2 -1.338883 -0.732692  0.935410  0.338740
3  1.662209  0.269750 -1.026117 -0.858472
4  1.387077 -0.839192 -0.562382 -0.989672
Let’s draw a bar chart of the first ten rows
df.head(10).iplot(kind='bar')
output
We can also draw stacked bar charts
df.head(10).iplot(kind='bar', barmode='stack')
output
Similarly, we can draw the bars horizontally
df.head(10).iplot(kind='barh', barmode='stack')
output
Line chart
Let’s take a look at line charts. We first take the cumulative sum of each column of the df data set above
df3 = df.cumsum()
And then let’s draw a line chart
df3.iplot()
output
Of course, you can also select just a few columns and plot them, as follows
df3[["A", "B"]].iplot()
output
We can also add a straight best-fit line that follows the trend of the series
df3['A'].iplot(bestfit=True, bestfit_colors=['pink'])
output
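To make the two steps above concrete without plotting anything, here is a small standard-library sketch: a cumulative sum of random steps (the same idea as df.cumsum() on one column) and a straight line fitted to it. It assumes an ordinary least-squares fit; the exact method Cufflinks uses for bestfit is not stated here, and `fit_line` is a hypothetical helper written for this illustration.

```python
import random
from itertools import accumulate


def fit_line(ys):
    """Least-squares line through the points (0, ys[0]), (1, ys[1]), ..."""
    n = len(ys)
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


random.seed(0)
steps = [random.gauss(0, 1) for _ in range(100)]
walk = list(accumulate(steps))           # running total of the random steps

slope, intercept = fit_line(walk)
trend = [slope * x + intercept for x in range(len(walk))]
print(f"trend line: y = {slope:.3f} * x + {intercept:.3f}")
```

The trend list holds one fitted value per point of the walk, which is what a best-fit line drawn over the series represents.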
Here are the commonly used parameters of the iplot() method
- kind: chart type; the default is scatter, and other types include bar, box, heatmap, etc.
- theme: layout theme; call cf.getThemes() to see which themes are available
- title: the title of the chart
- xTitle/yTitle: the title of the x or y axis
- colors: the colors used in the chart
- subplots: Boolean, set to True to draw subplots; the default is False
- mode: string, the drawing mode; it can be lines, markers, lines+markers, lines+text, and so on
- size: the size of the markers in a scatter plot
- shape: the layout of the subplot grid when drawing subplots
- bargap: the gap between bars in a bar chart
- barmode: how the bars are arranged: stack, group, or overlay
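Since these parameters recur across charts, one option is to collect them in a dictionary and check the mode string before plotting. The sketch below assumes that mode is handed to Plotly unchanged, where it is a "+"-joined combination of lines, markers and text without spaces (or the single word none). `is_valid_mode` and the option values are made up for this illustration.

```python
VALID_MODE_PARTS = {"lines", "markers", "text"}


def is_valid_mode(mode):
    """Check a scatter `mode` string such as 'lines+markers'."""
    if mode == "none":
        return True
    return all(part in VALID_MODE_PARTS for part in mode.split("+"))


# Options shared by several charts, collected once and reused.
common = dict(theme="white", xTitle="step", yTitle="value",
              mode="lines+markers", size=6)

print(is_valid_mode(common["mode"]))      # True
print(is_valid_mode("lines + markers"))   # False: spaces are not allowed
```

With the df3 DataFrame from the line chart section, the dictionary could then be unpacked into the call, for example df3.iplot(title="Random walks", **common).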
Area chart
Turning a line chart into an area chart is very simple: just set the fill parameter to True, as shown below
df3.iplot(fill=True)
output
Scatter plot
To draw a scatter plot, we need to set mode to markers. The code is as follows
df3.iplot(kind='scatter', x='A', y='B',
          mode='markers', size=10)
output
We can change the size of the markers through the size parameter. For example, we can set size to 20
df3.iplot(kind='scatter', x='A', y='B',
          mode='markers', size=20)
output
Or set the mode to lines+markers (written without spaces), as follows
df3.iplot(kind='scatter', x='A', y='B',
          mode='lines+markers', size=10)
We can also specify the shape of the markers, as shown in the following code
df3.iplot(kind='scatter', x='A', y='B',
          mode='markers', size=20, symbol="x",
          colorscale='paired')
output
Of course we can also set the colors of the markers
df.iplot(kind='scatter', mode='markers',
         symbol='square', colors=['orange', 'purple', 'blue', 'red'],
         size=20)
output
Bubble chart
A bubble chart looks much like a scatter plot. To draw one, change the kind parameter to bubble. Suppose we have the following data
cf.datagen.bubble(prefix='industry').head()
output
          x         y  size    text categories
0  0.332274  1.053811     2  LCN.CG  industry1
1 -0.856835  0.422373    87  ZKY.XC  industry1
2 -0.818344 -0.167020    72  ZSJ.DJ  industry1
3 -0.720254  0.458264    11  ONG.SM  industry1
4 -0.004744  0.644006    40  HUW.DN  industry1
So let’s draw a bubble chart
cf.datagen.bubble(prefix='industry').iplot(kind='bubble', x='x', y='y', size='size',
                                           categories='categories', text='text',
                                           xTitle='Returns', yTitle='Analyst Score',
                                           title='Cufflinks - bubble chart')
output
The difference between a bubble chart and a scatter plot is that every marker in a scatter plot has the same size, whereas in a bubble chart the size of each marker encodes a value from the data (here the size column)
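To illustrate how a data column can drive marker size, here is a sketch that rescales values linearly into a range of marker diameters. This is an assumption made for illustration only: `scale_sizes` is a hypothetical helper, and the scaling Cufflinks applies internally may differ.

```python
def scale_sizes(values, smallest=10, largest=60):
    """Map values linearly onto marker diameters in [smallest, largest]."""
    lo, hi = min(values), max(values)
    if hi == lo:                      # all values equal: use one medium size
        return [(smallest + largest) / 2] * len(values)
    span = largest - smallest
    return [smallest + (v - lo) / (hi - lo) * span for v in values]


sizes = [2, 87, 72, 11, 40]           # the size column from the sample above
for value, diameter in zip(sizes, scale_sizes(sizes)):
    print(f"size={value:3d} -> marker diameter {diameter:.1f}")
```

The smallest value gets the smallest marker and the largest value the largest one, so the third variable can be read off the chart at a glance.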
3D scatter plot
Having covered the bubble chart, let’s move on to the 3D scatter plot. Assume our data looks like this
cf.datagen.scatter3d(2, 150).head()
output
          x         y         z    text categories
0  0.375359 -0.683845 -0.960599   RE.JD  category1
1  0.635806  1.210649  0.319687  INM.LE  category1
2  0.578831  0.103654  1.333646  BSZ.HS  category1
3 -1.128907 -1.189098  1.531494  GJZ.UX  category1
4  0.067668 -1.990996  0.088281  IQZ.KS  category1
Let’s draw the 3D scatter plot. Since it is three-dimensional, there are x, y and z axes. The code is as follows
cf.datagen.scatter3d(2, 150).iplot(kind='scatter3d', x='x', y='y', z='z', size=15,
                                   categories='categories', text='text',
                                   title='Cufflinks - 3D scatter chart',
                                   colors=['yellow', 'purple'], width=1,
                                   margin=(0, 0, 0, 0), opacity=1)
output
3D bubble chart
After the 3D scatter plot comes the 3D bubble chart. Let’s say our data set looks like this
cf.datagen.bubble3d(5, 4).head()
output
          x         y         z  size    text categories
0 -1.888528  0.801430 -0.493671    77  OKC.HL  category1
1 -0.744953 -0.004398 -1.249949    61  GAG.UH  category1
2  0.980846  1.241730 -0.741482    37  LVB.EM  category1
3 -0.230157  0.427072  0.007010    78  NWZ.MG  category1
4  0.025272 -0.424051 -0.602937    76  JDW.AX  category2
Let’s draw a 3D bubble chart
cf.datagen.bubble3d(5, 4).iplot(kind='bubble3d', x='x', y='y', z='z', size='size',
                                text='text', categories='categories',
                                title='Cufflinks - 3D bubble chart',
                                colorscale='set1', width=.9, opacity=0.9)
output
Box plot
Next, let’s look at box plots. A box plot is very helpful for observing the distribution of the data and spotting extreme values
df.iplot(kind="box")
output
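The numbers a box plot summarizes can be computed directly. The sketch below uses the standard library to get the quartiles and flags points outside the common 1.5 x IQR fences. Treat it as an approximation: Plotly computes quartiles with its own method, so the values on the chart may differ slightly, and `box_stats` is a hypothetical helper.

```python
from statistics import quantiles


def box_stats(values):
    """Quartiles plus the points lying outside the 1.5 * IQR fences."""
    q1, median, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = sorted(v for v in values if v < lower or v > upper)
    return {"q1": q1, "median": median, "q3": q3, "outliers": outliers}


data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
print(box_stats(data))
```

Here the value 100 is reported as an outlier, which is the kind of extreme value that shows up as an isolated point beyond the whiskers.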
Heat map
Now for the heat map. Let’s look at the data set first
cf.datagen.heatmap(20, 20).head()
output
           y_0        y_1        y_2  ...       y_17       y_18       y_19
x_0  40.000000  58.195525  55.355233  ...  77.318287  80.187609  78.959951
x_1  37.111934  25.068114  25.730511  ...  27.261941  32.303315  28.550340
x_2  54.881357  54.254479  59.434281  ...  75.894161  74.051203  72.896999
x_3  41.337221  39.319033  37.916613  ...  15.885289  29.404226  26.278611
x_4  42.862472  36.365226  37.959368  ...  24.998608  25.096598  32.413760
So let’s plot the heat map, as follows
cf.datagen.heatmap(20, 20).iplot(kind='heatmap', colorscale='spectral', title='Cufflinks - heat map')
output
Spread chart
A spread chart is basically a combination of a line chart and an area chart: the two series are drawn as lines, and the difference between them is drawn as an area. The code is as follows
df[["A", "B"]].iplot(kind='spread')
output
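The quantity shown in the area part can be sketched in a few lines. This example assumes the spread is simply the first series minus the second, point by point; that definition is an assumption for illustration and is not taken from the Cufflinks source.

```python
import random
from itertools import accumulate

random.seed(42)
a = list(accumulate(random.gauss(0, 1) for _ in range(10)))
b = list(accumulate(random.gauss(0, 1) for _ in range(10)))

# Assumed definition: spread = first series minus second series.
spread = [x - y for x, y in zip(a, b)]

# Locate the step where the two series are furthest apart.
widest = max(range(len(spread)), key=lambda i: abs(spread[i]))
print(f"largest gap at step {widest}: {spread[widest]:+.3f}")
```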
Pie chart
Let’s look at drawing a pie chart. The code is as follows
cf.datagen.pie(n_labels=6, mode="stocks").iplot(
    kind="pie",
    labels="labels",
    values="values")
output
Candlestick (K-line) chart
Cufflinks can also plot candlestick (K-line) charts, so let’s look at the data set here
cf.datagen.ohlc().head()
output
                  open        high         low       close
2015-01-01  100.000000  119.144561   97.305961  106.125985
2015-01-02  106.131897  118.814224   96.740816  115.124342
2015-01-03  116.091647  131.477558  115.801048  126.913591
2015-01-04  128.589287  144.116844  117.837221  136.332657
2015-01-05  134.809052  138.681252  118.273850  120.252828
As the data set above shows, each row has an opening price, a closing price, and a high and a low price. Now let’s plot the K-line chart
cf.datagen.ohlc().iplot(kind="ohlc", xTitle="Date", yTitle="Price", title="K-line chart")
output
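For readers whose data is a plain price series rather than ready-made open/high/low/close columns, the sketch below shows how such bars can be derived by grouping consecutive prices. `to_ohlc` and the sample prices are made up for this illustration; with pandas, resampling a time-indexed series would be the more usual route.

```python
def to_ohlc(prices, period):
    """Group consecutive prices into bars of `period` ticks each."""
    bars = []
    for start in range(0, len(prices), period):
        chunk = prices[start:start + period]
        bars.append({"open": chunk[0], "high": max(chunk),
                     "low": min(chunk), "close": chunk[-1]})
    return bars


ticks = [100, 104, 98, 103, 103, 110, 101, 107]
for bar in to_ohlc(ticks, period=4):
    direction = "up" if bar["close"] >= bar["open"] else "down"
    print(bar, direction)
```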
Histogram
df = pd.DataFrame({'a': np.random.randn(1000) + 1, 'b': np.random.randn(1000),
                   'c': np.random.randn(1000) - 1}, columns=['a', 'b', 'c'])
df.iplot(kind="histogram")
output
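A histogram counts how many values fall into each interval. The following standard-library sketch bins values into equal-width intervals between the minimum and the maximum; it illustrates the idea only, since Plotly chooses its own bin edges, and `bin_counts` is a hypothetical helper.

```python
import random


def bin_counts(values, bins):
    """Count values in `bins` equal-width intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    if hi == lo:                      # all values identical: one occupied bin
        return [len(values)] + [0] * (bins - 1)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value belongs to the last bin rather than a new one.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


random.seed(0)
samples = [random.gauss(1, 1) for _ in range(1000)]   # like column 'a' above
counts = bin_counts(samples, bins=10)
for i, c in enumerate(counts):
    print(f"bin {i:2d} {'#' * (c // 10)}")
```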
Drawing multiple subplots
Now let’s look at drawing multiple subplots. One way is to use the scatter_matrix() method
df = pd.DataFrame(np.random.randn(1000, 4),
                  columns=['a', 'b', 'c', 'd'])
df.scatter_matrix()
output
The other is to use the subplots parameter and set it to True. For example, let’s draw multiple histogram subplots
df_h = cf.datagen.histogram(4)
df_h.iplot(kind='histogram', subplots=True, bins=50)
output
Or draw multiple line chart subplots
df = cf.datagen.lines(4)
df.iplot(subplots=True, subplot_titles=True, legend=True)
output
Finally, we can freely combine several subplots in one figure; the specs parameter controls how they are laid out on the grid
# cf.figures below reads the x and y columns, so generate a DataFrame that has them
df = cf.datagen.bubble(10, 50, mode='stocks')
figs = cf.figures(df, [dict(kind='histogram', keys='x', color='blue'),
                       dict(kind='scatter', mode='markers', x='x', y='y', size=5),
                       dict(kind='scatter', mode='markers', x='x', y='y', size=5, color='teal')],
                  asList=True)
figs.append(cf.datagen.lines(1).figure(bestfit=True, colors=['blue'], bestfit_colors=['red']))
base_layout = cf.tools.get_base_layout(figs)
# Arrange the four figures on a grid of 3 rows and 2 columns
sp = cf.subplots(figs, shape=(3, 2), base_layout=base_layout,
                 vertical_spacing=.25, horizontal_spacing=.04,
                 specs=[[{'rowspan': 2}, {}], [None, {}], [{'colspan': 2}, None]],
                 subplot_titles=['histogram', 'scatter plot 1', 'scatter plot 2', 'line chart + best-fit line'])
sp['layout'].update(showlegend=True)
cf.iplot(sp)
output
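The specs argument can be hard to read at first. The sketch below walks the same nested list and reports which grid cells each subplot covers, following the convention that a dict starts a subplot (optionally with rowspan or colspan) and None marks a cell that is already covered or left empty. `occupied_cells` is a hypothetical helper written for this illustration, not a Cufflinks or Plotly function.

```python
def occupied_cells(specs):
    """List, for each subplot in row-major order, the (row, col) cells it covers."""
    placements = []
    for r, row in enumerate(specs):
        for c, spec in enumerate(row):
            if spec is None:          # covered by a neighbour or left empty
                continue
            rows = spec.get("rowspan", 1)
            cols = spec.get("colspan", 1)
            cells = [(r + dr, c + dc) for dr in range(rows) for dc in range(cols)]
            placements.append(cells)
    return placements


specs = [[{"rowspan": 2}, {}],
         [None, {}],
         [{"colspan": 2}, None]]

for i, cells in enumerate(occupied_cells(specs), start=1):
    print(f"subplot {i}: {cells}")
```

Read this way, the first figure spans the left column of the first two rows, the two scatter plots sit in the right column, and the last figure stretches across the whole bottom row.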