Public account: You and the cabin by: Peter Editor: Peter

Playing with Plotly: Histograms (Part 12)

Hello, I’m Peter

The histogram is one of the basic statistical charts. It comes in two forms: the one-dimensional histogram and the two-dimensional histogram (also called a density histogram). This article covers the one-dimensional histogram, built in two ways:

  • With plotly.express
  • With plotly.graph_objects

Plotly series

The first 11 articles in the Plotly visualization series are:

  • Cool! Love advanced visualization artifact Plotly_Express
  • Plotly play scatter chart
  • Plotly plays a pie chart
  • Plotly plays a funnel
  • Plotly play bar chart
  • Plotly play bubble chart
  • Plotly play stock chart
  • Plotly plays the Gantt chart
  • Plotly
  • Plotly play area map
  • Plotly playing the violin

Histogram effect

To quote the definition of a histogram from Baidu Baike:

A histogram, also known as a quality distribution chart, is a statistical chart made up of a series of vertical bars or line segments of varying heights that show how the data are distributed. The horizontal axis generally represents the data category or value range, and the vertical axis represents the distribution, that is, how many observations fall there.

The two axes of a histogram are the statistical samples and a measure of some attribute of those samples, drawn in the form of bars.
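
To make the definition concrete, here is a minimal sketch of the counting that sits behind every bar. The bill amounts, the bin width, and the variable names are made up for illustration and are not taken from the tips data set.

```python
# Hypothetical sample: ten restaurant bills (made-up values for illustration)
bills = [3.5, 7.2, 9.9, 10.0, 12.4, 15.1, 18.8, 21.0, 24.6, 29.9]

bin_start = 0.0   # left edge of the first bin
bin_size = 10.0   # width of every bin
n_bins = 3        # bins cover [0, 10), [10, 20), [20, 30)

counts = [0] * n_bins
for value in bills:
    index = int((value - bin_start) // bin_size)  # which bin the value falls in
    counts[index] += 1

# One line per bin: the interval and how many values landed in it
for i, count in enumerate(counts):
    left = bin_start + i * bin_size
    print(f"[{left:.0f}, {left + bin_size:.0f}): {count}")
```

Each printed count is the height of one bar: the horizontal axis carries the intervals and the vertical axis carries the counts.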

Sample data

The charts in this article are drawn mainly from tips, the restaurant bill data set that ships with Plotly. Its fields are:

  • Total amount spent: total_bill
  • Tip: tip
  • Gender of payer: sex
  • Whether or not the payer smoked: smoker
  • Day of the week: day
  • Meal time (Lunch or Dinner): time
  • Number of diners: size
import plotly.express as px
import numpy as np
import plotly.graph_objects as go

tips = px.data.tips()
tips.head()

Implementation based on plotly.express

Base histogram

fig = px.histogram(tips, x="total_bill")
fig.show()

Setting some histogram elements

fig = px.histogram(
    tips,
    x="tip",
    title='Set histogram elements',  # chart title
    labels={'tip': 'tip'},  # X-axis label
    opacity=0.8,  # bar opacity
    log_y=True,  # logarithmic Y-axis
    color_discrete_sequence=['firebrick']  # bar color
    )

fig.show()

Use the category value of the field as the X-axis

In the examples above, the X-axis data are all numeric. A categorical field can be used on the X-axis as well, in which case each category gets its own bar:

fig = px.histogram(  # The number of occurrences in the histogram is automatically counted
    tips, 
    x="day")

fig.show()
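
For a categorical field, the bar heights are simply the number of rows in each category. The sketch below reproduces that counting with the standard library; the list of day labels is hypothetical and much shorter than the real day column.

```python
from collections import Counter

# Hypothetical day labels (not the real tips data)
days = ["Sun", "Sat", "Sun", "Thur", "Fri", "Sat", "Sat", "Sun"]

day_counts = Counter(days)  # occurrences of each category

# One line per category: the label and its bar height
for day, count in day_counts.items():
    print(day, count)
```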

Custom number of bins

fig = px.histogram(
    tips, 
    x="total_bill", 
    nbins=20)  # nbins sets the maximum number of bins

fig.show()

Choosing a normalization mode

Each histogram can be normalized in one of several ways. The histnorm parameter accepts:

'percent', 'probability', 'density', 'probability density'

fig = px.histogram(
    tips, 
    x="total_bill", 
    histnorm='percent'  # choose the normalization mode
)

fig.show()
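
The four modes differ only in how the raw count of each bin is rescaled. As a rough sketch of the arithmetic, assuming hypothetical counts and a fixed bin width of 5:

```python
counts = [3, 4, 3]  # hypothetical raw counts per bin
bin_size = 5.0      # assumed width of every bin
total = sum(counts)

percent = [100 * c / total for c in counts]                      # bars sum to 100
probability = [c / total for c in counts]                        # bars sum to 1
density = [c / bin_size for c in counts]                         # bar areas sum to the sample size
probability_density = [c / (total * bin_size) for c in counts]   # bar areas sum to 1

print(percent)
print(probability)
print(density)
print(probability_density)
```

With 'density' and 'probability density' it is the area of the bars (height times bin width), not the height, that adds up to the sample size or to 1.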

Grouped histogram

A histogram can be split into groups by the values of a field. The day field, for example, has four distinct values:

fig = px.histogram(
    tips, 
    x="total_bill", 
    color="day")  # the day field has four values

fig.show()

fig = px.histogram(
    tips, 
    x="tip", 
    color="sex")  # the sex field has two values

fig.show()

Using a different aggregate function

By default a histogram aggregates with count. Other aggregate functions can be selected with histfunc; here each tip bin shows the average total_bill:

fig = px.histogram(
    tips,
    x="tip",  # field used to build the bins
    y="total_bill", 
    histfunc='avg')  # average of total_bill in each bin

fig.show()
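
The idea behind histfunc='avg' is to group the y values by the bin their x value falls into and then average each group. A minimal sketch of that grouping, assuming made-up (tip, total_bill) pairs and a bin width of 1:

```python
# Hypothetical (tip, total_bill) pairs, not the real tips data
records = [(1.0, 10.0), (1.5, 14.0), (2.2, 20.0), (2.8, 24.0), (3.1, 30.0)]
bin_size = 1.0  # tip bins: [1, 2), [2, 3), [3, 4)

sums = {}
counts = {}
for tip, bill in records:
    left = (tip // bin_size) * bin_size  # left edge of the bin this tip falls in
    sums[left] = sums.get(left, 0.0) + bill
    counts[left] = counts.get(left, 0) + 1

# Average total_bill per tip bin, keyed by the bin's left edge
averages = {left: sums[left] / counts[left] for left in sums}
print(averages)
```

Replacing the division with the plain sum, or keeping only the counts, gives the 'sum' and 'count' variants of the same grouping.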

Showing the distribution with a marginal plot

Once the basic histogram is drawn, a small auxiliary plot can be added along the edge of the figure to show the distribution of the data in more detail. This is controlled by the marginal parameter:

fig = px.histogram(
    tips, 
    x="tip", 
    color="sex",  # group by color
    marginal="rug",  # optional: 'rug', 'box', 'violin'
    hover_data=tips.columns
)

fig.show()

Implementation based on go.Histogram

Basic histogram

x = tips["total_bill"].tolist()

# Equivalent to px.histogram(tips, x="total_bill"),
# but the x values are passed in directly
fig = go.Figure(data=[go.Histogram(x=x)])

fig.show()

Normalization

Histograms are normalized here in the same way as with plotly.express:

import plotly.graph_objects as go
import numpy as np

x = np.random.randn(1000)
fig = go.Figure(data=[go.Histogram(
    x=x, 
    histnorm='probability density'  # one of 'percent', 'probability', 'density', 'probability density'
)])

fig.show()

Horizontal histogram

Passing the data as the Y-axis values produces a horizontal histogram:

import plotly.graph_objects as go
import numpy as np

y = np.random.randn(1000)  # generate 1000 normally distributed random numbers

fig = go.Figure(data=[go.Histogram(
    y=y,  # pass the data as y instead of x
    histnorm='probability density'  # one of 'percent', 'probability', 'density', 'probability density'
)])

fig.show()

Overlay mode for multiple histograms

import plotly.graph_objects as go

import numpy as np

x0 = np.random.randn(300)
x1 = np.random.randn(300) + 1.5
x2 = np.random.randn(300) - 1.5


fig = go.Figure()
fig.add_trace(go.Histogram(x=x0))
fig.add_trace(go.Histogram(x=x1))
fig.add_trace(go.Histogram(x=x2))

# Set overlay mode
fig.update_layout(barmode='overlay')  # Important parameters
# set transparency
fig.update_traces(opacity=0.8)
fig.show()

Stack mode for multiple histograms

import plotly.graph_objects as go
import numpy as np

# Generate three groups of random data
x0 = np.random.randn(300)
x1 = np.random.randn(300) + 1
x2 = np.random.randn(300) + 1.5


fig = go.Figure()
fig.add_trace(go.Histogram(x=x0))
fig.add_trace(go.Histogram(x=x1))
fig.add_trace(go.Histogram(x=x2))

# Set stack mode
fig.update_layout(barmode='stack')
# set transparency
fig.update_traces(opacity=0.8)
fig.show()

Specifying aggregate functions

Different traces can apply different aggregate functions to the same data. In the following example, one trace uses count and the other uses sum:

import plotly.graph_objects as go

x = ["Xiao Ming", "Little red", "Little red", "Little su", "Xiao Ming", "Little su"]
y = ["85", "100", "132", "110", "95", "120"]

fig = go.Figure()
fig.add_trace(go.Histogram(histfunc="count",  # count the occurrences of each x value
                           y=y, x=x, name="count"))
fig.add_trace(go.Histogram(histfunc="sum",  # sum the y values for each x value
                           y=y, x=x, name="sum"))

fig.show()
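
To check what the two traces should display, the same aggregation can be done by hand. This sketch reuses the names from the example and treats the scores as integers; the dictionary names are chosen here for illustration.

```python
from collections import defaultdict

names = ["Xiao Ming", "Little red", "Little red", "Little su", "Xiao Ming", "Little su"]
scores = [85, 100, 132, 110, 95, 120]

count_by_name = defaultdict(int)  # what histfunc="count" aggregates
sum_by_name = defaultdict(int)    # what histfunc="sum" aggregates

for name, score in zip(names, scores):
    count_by_name[name] += 1
    sum_by_name[name] += score

for name in count_by_name:
    print(name, count_by_name[name], sum_by_name[name])
```

Every name appears twice, so the count bars are all the same height, while the sum bars differ.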

Cumulative histogram

Setting cumulative_enabled=True turns on accumulation, so each bar shows the running total of all bins up to and including it. Accumulation is off by default.

x = list(range(101))

fig = go.Figure(data=[go.Histogram(
    x=x,  # X-axis data
    cumulative_enabled=True  # enable accumulation
)])

fig.show()
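
Accumulation amounts to a running sum over the per-bin counts. A minimal sketch with hypothetical counts:

```python
from itertools import accumulate

counts = [3, 4, 3]  # hypothetical raw counts per bin

# Running total: each entry adds the current bin to everything before it
cumulative = list(accumulate(counts))
print(cumulative)
```

The last cumulative bar always equals the total number of observations.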

Customizing histograms

The following example draws two histograms from the x0 and x1 arrays generated in the earlier examples:

fig = go.Figure()

# Histogram 1
fig.add_trace(go.Histogram(
    x=x0,  # X-axis data
    histnorm='percent',  # normalization mode
    name='Histogram 1',  # trace name shown in the legend
    xbins=dict(  # start, end, and size of the bins on the X-axis
        start=-4.0,
        end=3.0,
        size=0.5
    ),
    marker_color='#0B89B5',  # bar color
    opacity=0.75  # opacity
))

# Histogram 2
fig.add_trace(go.Histogram(
    x=x1,
    histnorm='percent',
    name='Histogram 2',
    xbins=dict(
        start=-3.0,
        end=4,
        size=0.5
    ),
    marker_color='#830A73',
    opacity=0.75
))

fig.update_layout(
    title=dict(text='Personalized histogram settings',  # title text and position; HTML tags can be used in titles
               x=0.5,
               y=0.97
              ), 
    xaxis_title=dict(text='value'),  # X-axis label
    yaxis_title_text='count',  # Y-axis label
    bargap=0.5,  # gap between bars at adjacent positions
    bargroupgap=0.3  # gap between bars at the same position
)

fig.show()