Create 30 dynamic, interactive charts for Pandas with one line of code

Today I'm going to show you how to generate dynamic, interactive charts from a DataFrame with a single line of code. We'll start by introducing the Cufflinks module. Just as Seaborn wraps matplotlib, Cufflinks wraps Plotly, giving it a uniform interface that is easy to configure and flexible enough to plot DataFrame data sets directly. The chart types covered are:

Line chart
Area chart
Scatter plot
Bar chart
Histogram
Box plot
Heat map
3D scatter chart / 3D bubble chart
Trend chart
Pie chart
K-line chart
Combining multiple subplots

Module installation

Installation is a single pip command:

pip install cufflinks

Import the module and view its configuration

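Let's import the module, along with pandas and NumPy for the examples that follow, and check the current version:

import pandas as pd
import numpy as np
import cufflinks as cf

cf.__version__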

output

'0.17.3'
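Depending on the installed version and its configuration, Cufflinks may need to be switched to offline mode before charts render locally in a notebook. The following is a minimal setup sketch rather than a required step. The cufflinks_ready flag is a hypothetical name introduced here, and the broad except is an assumption that guards against Cufflinks being absent or incompatible with the installed pandas, NumPy, or Plotly.

```python
try:
    import cufflinks as cf

    # Render charts locally instead of through Plotly's online service.
    cf.go_offline()
    cufflinks_ready = True
except Exception as exc:  # not installed, or incompatible with other packages
    print(f"Cufflinks is unavailable: {exc}")
    cufflinks_ready = False

print("cufflinks ready:", cufflinks_ready)
```

If the flag comes back False, checking the versions of pandas, NumPy, and Plotly against the installed Cufflinks release is a reasonable first step.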

At the time of writing, the module is at version 0.17.3, which is also the latest release. So which charts can this version draw?

cf.help()

output

Use 'cufflinks.help(figure)' to see the list of available parameters for the given figure.
Use 'DataFrame.iplot(kind=figure)' to plot the respective figure
Figures:
 bar
 box
 bubble
 bubble3d
 candle
 choroplet
 distplot
 .......

From the output above we can see that the general syntax for plotting a chart is df.iplot(kind=<chart name>). To see the parameters available for a particular chart type, for example bar, pass its name to cf.help():

cf.help('bar')

Bar chart

Let's start with bar charts by creating a data set to plot:

df2 = pd.DataFrame({'Category': ['A', 'B', 'C', 'D'], 'Values': [95, 56, 70, 85]})
df2

output

  Category  Values
0        A      95
1        B      56
2        C      70
3        D      85

And then we'll plot the bar chart:

df2.iplot(kind='bar', x='Category', y='Values', xTitle='Category', yTitle='Values', title='Bar chart')

Here the x parameter takes the name of the column plotted on the x-axis, and the y parameter takes the name of the column plotted on the y-axis. The resulting chart can be downloaded as a PNG image.

We can also zoom in on the chart.

Let’s take a look at the following data set

df = pd.DataFrame(np.random.randn(100, 4), columns='A B C D'.split())
df.head()

output

          A         B         C         D
0  1.167609  1.528045 -0.498168 -0.221060
1  1.167609  1.528045 -0.498168 -0.221060
2 -1.338883 -0.732692  0.935410  0.338740
3  1.662209  0.269750 -1.026117 -0.858472
4  1.387077 -0.839192 -0.562382 -0.989672

Let's draw a bar chart of the first ten rows:

df.head(10).iplot(kind='bar')

We can also draw stacked bar charts:

df.head(10).iplot(kind='bar',barmode='stack')

Similarly, we can draw the bars horizontally:

df.head(10).iplot(kind='barh',barmode='stack')

Line chart

Now let's look at line charts. We first take the cumulative sum of each column of the df data set above:

df3 = df.cumsum()
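To see why the cumulative sum is taken before plotting, here is a small pandas-only sketch. Independent random values just jitter around zero, while their running total behaves like a random walk with visible trends, which makes for a more readable line chart. The seed and the use of default_rng are choices made here for reproducibility, not part of the original example.

```python
import numpy as np
import pandas as pd

# Reproducible stand-in for the article's random data set.
rng = np.random.default_rng(42)
df = pd.DataFrame(rng.standard_normal((100, 4)), columns="A B C D".split())

# Running total of each column, computed row by row.
df3 = df.cumsum()

# The last row of the running total equals the plain column sums.
print(df3.iloc[-1])
print(df.sum())
```

Taking df3.diff() recovers the original values (apart from the first row), so no information is lost by plotting the cumulative version.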

And then let’s draw a line chart

df3.iplot()

Of course, you can also select a few columns and plot only those, as follows:

df3[["A", "B"]].iplot()

We can also draw a straight best-fit line that follows the trend of the series:

df3['A'].iplot(bestfit = True,bestfit_colors=['pink'])
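For intuition about what a best-fit line represents, the sketch below fits a straight line to a series by ordinary least squares using NumPy. This only illustrates the idea. Whether Cufflinks computes its bestfit line with the same routine is an assumption, and the variable names are introduced here.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
series = pd.Series(rng.standard_normal(100)).cumsum()

# Fit y = slope * x + intercept, using the row position as x.
x = np.arange(len(series))
slope, intercept = np.polyfit(x, series.to_numpy(), deg=1)

fitted = pd.Series(slope * x + intercept, index=series.index, name="best fit")
residuals = series - fitted

print(f"slope={slope:.4f}, intercept={intercept:.4f}")
```

For a least-squares line with an intercept, the residuals average out to zero, which is a quick sanity check on the fit.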

Here are the most commonly used parameters of the iplot() method:

kind: the chart type. The default is scatter; other types include bar, box, heatmap, etc.
theme: the layout theme. Call cf.getThemes() to see which themes are available
title: the title of the chart
xTitle/yTitle: the title of the x-axis or y-axis
colors: the colors used in the chart
subplots: Boolean, set to True to draw subplots. The default is False
mode: string, the drawing mode: lines, markers, lines+markers, or lines+text
size: the size of the points in a scatter plot
shape: the (rows, columns) layout of the grid when drawing subplots
bargap: the gap between bars in a bar chart
barmode: the bar chart mode: stack, group, or overlay
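As a rough illustration of how these parameters combine, the sketch below collects several of them in a dictionary and passes them to iplot(). The asFigure=True argument asks Cufflinks to return the figure object instead of displaying it. The plot_options name, the titles, and the guarded import are choices made for this example so that it still runs where Cufflinks is missing or incompatible.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df3 = pd.DataFrame(rng.standard_normal((100, 4)), columns=list("ABCD")).cumsum()

# Keyword arguments gathered in one place so they can be reused across charts.
plot_options = dict(
    kind="scatter",          # default chart type
    mode="lines+markers",    # draw both the line and the individual points
    size=6,                  # marker size
    theme="pearl",           # one of the names listed by cf.getThemes()
    title="Cumulative random series",
    xTitle="Step",
    yTitle="Value",
)

try:
    import cufflinks as cf

    cf.go_offline()
    # Return the Plotly figure rather than rendering it.
    fig = df3.iplot(asFigure=True, **plot_options)
except Exception as exc:  # Cufflinks missing or incompatible
    print(f"Skipping the chart: {exc}")
    fig = None
```

In a notebook, dropping asFigure=True displays the chart directly.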

Area chart

Turning a line chart into an area chart is simple: just set the fill parameter to True, as shown below:

df3.iplot(fill = True)

Scatter plot

To draw a scatter plot, we need to set mode to markers. The code is as follows:

df3.iplot(kind='scatter',x='A',y='B',
          mode='markers',size=10)

We can change the size of the points with the size parameter, for example by setting size to 20:

df3.iplot(kind='scatter',x='A',y='B',
          mode='markers',size=20)

Or set the mode to lines+markers, as follows:

df3.iplot(kind='scatter',x='A',y='B',
          mode='lines+markers',size=10)

We can also specify the shape of the markers, as shown in the following code:

df3.iplot(kind='scatter',x='A',y='B',
          mode='markers',size=20,symbol="x",
          colorscale='paired')

Of course, we can also set the colors of the markers:

df.iplot(kind='scatter' ,mode='markers',
         symbol='square',colors=['orange','purple','blue','red'],
         size=20)

Bubble chart

A bubble chart is drawn in much the same way as a scatter plot: just change the kind parameter to bubble. Suppose we have a data set like this:

cf.datagen.bubble(prefix='industry').head()

output

          x         y  size    text categories
0  0.332274  1.053811     2  LCN.CG  industry1
1 -0.856835  0.422373    87  ZKY.XC  industry1
2 -0.818344 -0.167020    72  ZSJ.DJ  industry1
3 -0.720254  0.458264    11  ONG.SM  industry1
4 -0.004744  0.644006    40  HUW.DN  industry1

So let's draw a bubble chart:

cf.datagen.bubble(prefix='industry').iplot(kind='bubble', x='x', y='y', size='size',
                                           categories='categories', text='text',
                                           xTitle='Returns', yTitle='Analyst Score',
                                           title='Cufflinks - Bubble chart')

The difference between a bubble chart and a scatter plot is that every point in a scatter plot is the same size, whereas in a bubble chart the size of each point varies with the value of another column (here, size).

3D scatter plot

Now that we've covered the bubble chart, let's turn to the 3D scatter plot. Suppose our data looks like this:

cf.datagen.scatter3d(2, 150).head()

output

          x         y         z    text categories
0  0.375359 -0.683845 -0.960599   RE.JD  category1
1  0.635806  1.210649  0.319687  INM.LE  category1
2  0.578831  0.103654  1.333646  BSZ.HS  category1
3 -1.128907 -1.189098  1.531494  GJZ.UX  category1
4  0.067668 -1.990996  0.088281  IQZ.KS  category1

Let's draw the 3D scatter plot. Since it is three-dimensional, it has x, y, and z axes. The code is as follows:

cf.datagen.scatter3d(2, 150).iplot(kind='scatter3d', x='x', y='y', z='z', size=15,
                                   categories='categories', text='text',
                                   title='Cufflinks - 3D scatter chart',
                                   colors=['yellow', 'purple'], width=1,
                                   margin=(0, 0, 0, 0), opacity=1)

3D bubble chart

Having covered the 3D scatter plot, we should also cover the 3D bubble chart. Suppose our data set looks like this:

cf.datagen.bubble3d(5, 4).head()

output

          x         y         z  size    text categories
0 -1.888528  0.801430 -0.493671    77  OKC.HL  category1
1 -0.744953 -0.004398 -1.249949    61  GAG.UH  category1
2  0.980846  1.241730 -0.741482    37  LVB.EM  category1
3 -0.230157  0.427072  0.007010    78  NWZ.MG  category1
4  0.025272 -0.424051 -0.602937    76  JDW.AX  category2

Let's draw a 3D bubble chart:

cf.datagen.bubble3d(5, 4).iplot(kind='bubble3d', x='x', y='y', z='z', size='size',
                                text='text', categories='categories',
                                title='Cufflinks - 3D bubble chart',
                                colorscale='set1', width=.9, opacity=0.9)

Box plot

Next, let's look at box plots. A box plot is very helpful for examining the distribution of the data and for spotting extreme values:

df.iplot(kind = "box")
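To make concrete what a box plot summarizes, the sketch below computes the quartiles for each column with pandas and flags points beyond 1.5 times the interquartile range, a common convention for marking extreme values. The exact quartile method used by the plotting library may differ slightly from pandas' default interpolation, so treat the numbers as approximate counterparts of what the chart shows. The names used here are introduced for this example.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((100, 4)), columns=list("ABCD"))

# Quartiles of each column: the box spans q1 to q3, with a line at the median.
q1 = df.quantile(0.25)
median = df.quantile(0.50)
q3 = df.quantile(0.75)
iqr = q3 - q1

# Points outside these fences are conventionally treated as extreme values.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = (df < lower_fence) | (df > upper_fence)

summary = pd.DataFrame(
    {"q1": q1, "median": median, "q3": q3, "n_outliers": outliers.sum()}
)
print(summary)
```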

Heat map

Next is the heat map. Let's look at the data set first:

cf.datagen.heatmap(20, 20).head()

output

           y_0        y_1        y_2  ...       y_17       y_18       y_19
x_0  40.000000  58.195525  55.355233  ...  77.318287  80.187609  78.959951
x_1  37.111934  25.068114  25.730511  ...  27.261941  32.303315  28.550340
x_2  54.881357  54.254479  59.434281  ...  75.894161  74.051203  72.896999
x_3  41.337221  39.319033  37.916613  ...  15.885289  29.404226  26.278611
x_4  42.862472  36.365226  37.959368  ...  24.998608  25.096598  32.413760

So let's plot the heat map, as follows:

cf.datagen.heatmap(20, 20).iplot(kind='heatmap', colorscale='spectral', title='Cufflinks - Heat map')

Trend chart

A trend chart is basically a combination of a line chart and an area chart, coded as follows

df[["A", "B"]].iplot(kind = 'spread')
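As a rough sketch of the quantity behind this chart, the spread between two series is their difference, and the shaded area follows that difference above and below zero. The pandas-only example below computes it by hand. That the spread is taken as the first column minus the second is an assumption about how Cufflinks orders the subtraction, and the variable names are introduced here.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
df = pd.DataFrame(rng.standard_normal((100, 4)), columns=list("ABCD"))

# Difference between the two selected series.
spread = df["A"] - df["B"]

# Split into the parts above and below zero, as an area chart would shade them.
positive = spread.clip(lower=0)
negative = spread.clip(upper=0)

print(spread.describe())
```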

Pie chart

Let's look at drawing a pie chart. The code is as follows:

cf.datagen.pie(n_labels=6, mode = "stocks").iplot(
    kind = "pie",
    labels = "labels",
    values = "values")

K-line chart

Cufflinks can also plot K-line (candlestick) charts. Let's look at the data set first:

cf.datagen.ohlc().head()

output

                  open        high         low       close
2015-01-01  100.000000  119.144561   97.305961  106.125985
2015-01-02  106.131897  118.814224   96.740816  115.124342
2015-01-03  116.091647  131.477558  115.801048  126.913591
2015-01-04  128.589287  144.116844  117.837221  136.332657
2015-01-05  134.809052  138.681252  118.273850  120.252828

As the data set above shows, each row has an opening price, a closing price, and a high and a low price. Now let's plot the K-line chart:

cf.datagen.ohlc().iplot(kind='ohlc', xTitle='Date', yTitle='Price', title='K-line chart')
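The generated data set already comes in open/high/low/close form. With real data you often start from a plain price series instead, so here is a sketch of building the same four columns with pandas by resampling hourly prices into daily bars. The hourly price series is made up for illustration and is not part of the original example.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)

# Hypothetical hourly prices covering five full days.
index = pd.date_range("2015-01-01", periods=5 * 24, freq="60min")
prices = pd.Series(100 + rng.standard_normal(len(index)).cumsum(), index=index)

# One row per day with the first, highest, lowest and last price of that day.
ohlc = prices.resample("D").ohlc()
print(ohlc)
```

The resulting frame has open, high, low, and close columns indexed by date, the same layout as the data set shown above.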

Histogram

df = pd.DataFrame({'a': np.random.randn(1000) + 1, 'b': np.random.randn(1000),
                    'c': np.random.randn(1000) - 1}, columns=['a', 'b', 'c'])
df.iplot(kind = "histogram")
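A histogram differs from the bar charts shown earlier: instead of drawing one bar per row, it groups the values into bins and draws one bar per bin, with the bar height giving the count. The NumPy sketch below shows that binning step on its own. The choice of 20 bins is arbitrary, and the plotting library picks its own bins unless told otherwise.

```python
import numpy as np

rng = np.random.default_rng(3)
values = rng.standard_normal(1000) + 1

# counts[i] is the number of values falling between edges[i] and edges[i + 1].
counts, edges = np.histogram(values, bins=20)

for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"[{left:6.2f}, {right:6.2f}) {count}")
```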

Drawing multiple subplots

Next, let's look at drawing multiple subplots. One way is to use the scatter_matrix() method:

df = pd.DataFrame(np.random.randn(1000, 4),
                  columns=['a', 'b', 'c', 'd'])
df.scatter_matrix()

The other way is to use the subplots parameter and set it to True. For example, let's draw several histograms as subplots:

df_h=cf.datagen.histogram(4)
df_h.iplot(kind='histogram',subplots=True,bins=50)

Or draw several line charts as subplots:

df=cf.datagen.lines(4)
df.iplot(subplots=True,subplot_titles=True,legend=True)

Finally, we can freely combine several subplots into one figure by describing the grid with the specs argument:

# Assumption: df needs 'x' and 'y' columns here, so a generated bubble data set is used
df = cf.datagen.bubble(10, 50, mode='stocks')
figs = cf.figures(df, [dict(kind='histogram', keys='x', color='blue'),
                       dict(kind='scatter', mode='markers', x='x', y='y', size=5),
                       dict(kind='scatter', mode='markers', x='x', y='y', size=5, color='teal')],
                  asList=True)
figs.append(cf.datagen.lines(1).figure(bestfit=True, colors=['blue'], bestfit_colors=['red']))
base_layout = cf.tools.get_base_layout(figs)
# Arrange the subplots on a 3x2 grid
specs = cf.subplots(figs, shape=(3, 2), base_layout=base_layout,
                    vertical_spacing=.25, horizontal_spacing=.04,
                    specs=[[{'rowspan': 2}, {}], [None, {}], [{'colspan': 2}, None]],
                    subplot_titles=['Histogram', 'Scatter plot 1', 'Scatter plot 2', 'Line chart + best-fit line'])
specs['layout'].update(showlegend=True)
cf.iplot(specs)
