Preface

Data provided: data

What can bubble charts do?

A bubble chart shows the relationship between three variables. Like a scatter plot, it places one variable on the horizontal axis and another on the vertical axis; the third variable is represented by the size of each bubble. Worksheet data arranged in columns (x values in the first column, with the corresponding y values and bubble sizes in adjacent columns) can be plotted as a bubble chart. (From Baidu Baike)

In short, a bubble chart is a scatter plot with an added size dimension, which makes it useful when a third quantity has to be compared across the plotted points.
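
To make the idea concrete before turning to the real data, here is a minimal sketch of a bubble chart. The helper name draw_bubbles, its scale parameter, and the sample numbers are all made up for illustration and do not come from the article's dataset.

```python
import matplotlib.pyplot as plt


def draw_bubbles(xs, ys, sizes, scale=10):
    """Draw a scatter plot whose marker area encodes a third variable."""
    fig, ax = plt.subplots()
    # s is the marker area in points**2, so raw values are multiplied
    # by a constant factor to make the bubbles large enough to see
    areas = [v * scale for v in sizes]
    ax.scatter(xs, ys, s=areas, alpha=0.6)
    ax.set_xlabel("x")
    ax.set_ylabel("y")
    return fig, ax


# Made-up illustrative values (hypothetical, not from the article's data)
xs = [1, 2, 3, 4]
ys = [10, 20, 15, 30]
sizes = [5, 40, 20, 80]
```

Calling draw_bubbles(xs, ys, sizes) followed by plt.show() displays four bubbles whose areas grow with the values in sizes.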

Drawing the bubble chart

Requirement for this data: draw a bubble chart of the monthly average transaction amount, with time on the horizontal axis and the secondary commodity category on the vertical axis.

  1. Load the necessary packages
import pandas as pd
import matplotlib.pyplot as plt
  1. Read the data
path = '../data/data.csv'
# Open the file explicitly; the with-block closes it once the data is read
with open(path) as f:
    data = pd.read_csv(f)
# The bubble chart only needs these three columns, so extract them for convenience
data = data.loc[:, ['Time of payment', 'value', 'Commodity Class II']]
  1. Group the data and average the amount
data['Time of payment'] = pd.to_datetime(data['Time of payment'])
group=data.groupby(['Commodity Class II',data['Time of payment'].dt.month])['value'].mean()

groupby is one of the most frequently used grouping functions in pandas. If you are not familiar with it, you can refer to a blog post that explains the groupby function in detail. The code above computes the average transaction amount for each group.

As the result shows, the two grouping keys we supplied, the secondary category and the month of the payment time, appear in the index as tuples.
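
Since the output is not reproduced here, the following small sketch shows what such a grouped result looks like. The column names mirror the ones used above, but the rows are made-up sample values, not the article's data.

```python
import pandas as pd

# Made-up sample rows (hypothetical); only the column names match the article
sample = pd.DataFrame({
    "Time of payment": pd.to_datetime(
        ["2019-01-05", "2019-01-20", "2019-02-03", "2019-02-10"]
    ),
    "value": [10.0, 30.0, 8.0, 12.0],
    "Commodity Class II": ["Candy", "Candy", "Candy", "Tobacco"],
})

# Group by category and by the month extracted from the payment time,
# then average the transaction amount within each group
group = sample.groupby(
    ["Commodity Class II", sample["Time of payment"].dt.month]
)["value"].mean()

# Each index entry is a (category, month) tuple
print(group.index.tolist())
print(group.tolist())
```

The two January rows for "Candy" collapse into one group with their average, so the result has three entries, each keyed by a (category, month) pair.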

  1. Process tuple data
# Each index entry is a (secondary category, month) tuple; transpose them into two lists
m = list(map(list, zip(*group.index)))

# Separate the three variables
x = m[1]  # Months
y = m[0]  # Secondary categories
z = group.values  # Average transaction amount

Preparing the data this way is probably more cumbersome than it needs to be. If you know a better approach, please share it.
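
One possible simplification, offered as a sketch rather than the definitive answer, is to read each index level directly with MultiIndex.get_level_values instead of transposing the tuples by hand. The helper name split_group and the stand-in series below are hypothetical.

```python
import pandas as pd


def split_group(group):
    """Return (months, categories, means) from a Series indexed by (category, month)."""
    categories = group.index.get_level_values(0).tolist()  # first index level
    months = group.index.get_level_values(1).tolist()      # second index level
    means = group.to_numpy()                                # averaged amounts
    return months, categories, means


# Tiny stand-in for the grouped result (made-up values)
idx = pd.MultiIndex.from_tuples([("Candy", 1), ("Candy", 2), ("Tobacco", 2)])
demo = pd.Series([20.0, 8.0, 12.0], index=idx)

x, y, z = split_group(demo)
```

This yields the same three sequences as the zip-based approach while naming each level explicitly, which may be easier to read.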

  1. Bubble plot
names = ['January', 'February', 'March', 'April']
# x and y are the horizontal and vertical coordinates. s sets the bubble size from the data,
# scaled up by a constant factor. alpha is the transparency. c=x colors each bubble by its month.
plt.scatter(x, y, s=z*10, alpha=0.6, c=x)
plt.xticks([1, 2, 3, 4], names)  # One tick per month; the data covers January to April
plt.xlabel("Month")  # X-axis label
plt.ylabel("Class II")  # Y-axis label
plt.show()

As the chart shows, cigarette products had the largest bubbles in all four months, followed by candied fruit/dried fruit products. Confectionery/chocolate products were higher in January and March and lower in February and April.
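
The fixed multiplier in s=z*10 only works well when the averages happen to fall in a convenient range. Because the s argument of plt.scatter is a marker area in points squared, one alternative is to rescale the values so that the largest one maps to a chosen maximum area. The sketch below assumes all values are positive; the function name scale_areas and the default max_area are hypothetical choices.

```python
import numpy as np


def scale_areas(values, max_area=2000.0):
    """Scale positive values so the largest maps to max_area (points**2).

    Areas stay proportional to the values, so a bubble twice as large
    in area represents a value twice as large.
    """
    values = np.asarray(values, dtype=float)
    return values / values.max() * max_area


areas = scale_areas([5, 10, 20])
```

The result can be passed as s=scale_areas(z) in place of s=z*10.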

Code summary:

import pandas as pd
import matplotlib.pyplot as plt

path = '../data/data.csv'
# Open the file explicitly; the with-block closes it once the data is read
with open(path) as f:
    data = pd.read_csv(f)
data = data.loc[:, ['Time of payment', 'value', 'Commodity Class II']]

data['Time of payment'] = pd.to_datetime(data['Time of payment'])
group = data.groupby(['Commodity Class II', data['Time of payment'].dt.month])['value'].mean()

# Each index entry is a (secondary category, month) tuple; transpose them into two lists
m = list(map(list, zip(*group.index)))

# Separate the three variables
x = m[1]  # Months
y = m[0]  # Secondary categories
z = group.values  # Average transaction amount

names = ['January', 'February', 'March', 'April']
plt.scatter(x, y, s=z*10, alpha=0.6, c=x)
plt.xticks([1, 2, 3, 4], names)  # One tick per month; the data covers January to April
plt.xlabel("Month")  # X-axis label
plt.ylabel("Class II")  # Y-axis label
plt.show()

Other issues

data

Chinese fonts not displaying in Python plots

Data analysis in practice: bar chart

Data analysis in practice: line chart

Data analysis in practice: pie chart

Data analysis in practice: bubble chart

Data analysis in practice: heat map
