WeChat official account: “Python reading money”. If you have any questions or suggestions, please leave a message.
Seaborn is a Python visualization library built on Matplotlib. It provides a high-level interface for drawing attractive statistical graphics. In effect, Seaborn is a higher-level API wrapped around Matplotlib, which makes it much easier to draw and polish plots without a lot of manual tweaking.
Note: All code was run in an IPython Notebook
lmplot (regression plot)
lmplot draws regression plots, which give a quick visual overview of the relationships within the data.
An overview of the lmplot API:
seaborn.lmplot(x, y, data, hue=None, col=None, row=None, palette=None, col_wrap=None, size=5, aspect=1, markers='o', sharex=True, sharey=True, hue_order=None, col_order=None, row_order=None, legend=True, legend_out=True, x_estimator=None, x_bins=None, x_ci='ci', scatter=True, fit_reg=True, ci=95, n_boot=1000, units=None, order=1, logistic=False, lowess=False, robust=False, logx=False, x_partial=None, y_partial=None, truncate=False, x_jitter=None, y_jitter=None, scatter_kws=None, line_kws=None)
As you can see, lmplot takes a lot of parameters. Below we go through some of the most commonly used ones; a few of them assume some knowledge of statistics. Note that this signature comes from an older seaborn release: since version 0.9 the size parameter is called height.
As usual, start by importing the required package:
import seaborn as sns
%matplotlib inline
sns.set(font_scale=1.5,style="white")
The examples below use Seaborn’s built-in tips dataset:
data=sns.load_dataset("tips")
data.head(5)
Let’s start with a basic lmplot call:
sns.lmplot(x="total_bill",y="tip",data=data)
As you can see, lmplot runs a simple (one-variable) linear regression on the selected data and draws the best-fit straight line through the points.
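To make "best-fit line" concrete, here is a minimal sketch of ordinary least squares for a single predictor, written in plain Python. The helper name fit_line and the sample numbers are made up for illustration; seaborn does the fitting internally, and this is not its actual code.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x); the 1/n factors cancel
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The least-squares line always passes through (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical bills and tips (NOT taken from the tips dataset)
bills = [10.0, 20.0, 30.0, 40.0]
tips = [2.0, 3.0, 5.0, 6.0]

slope, intercept = fit_line(bills, tips)
print(f"tip = {slope:.2f} * total_bill + {intercept:.2f}")
```

The line drawn by lmplot with the default order=1 is this kind of fit, applied to the x and y columns you pass in.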
Let’s move on to a demonstration of specific parameters.
col: Splits the data into subplot columns, one for each value of the specified variable
row: Splits the data into subplot rows, one for each value of the specified variable
sns.lmplot(x="total_bill",y="tip",data=data,row="sex",col="smoker")
Given our dataset, the facet labels in the resulting grid show how the two parameters work: each row of subplots corresponds to one value of sex, and each column to one value of smoker.
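Conceptually, row and col partition the data by every combination of the two variables, and each subset gets its own subplot with its own regression. Here is a rough sketch of that partitioning in plain Python. The facet_groups helper is hypothetical and the records are made up to mimic the columns of the tips dataset.

```python
from collections import defaultdict


def facet_groups(records, row, col):
    """Group records by their (row value, col value) pair: one group per subplot."""
    groups = defaultdict(list)
    for rec in records:
        groups[(rec[row], rec[col])].append(rec)
    return dict(groups)


# Made-up records (NOT the real tips data)
records = [
    {"sex": "Male", "smoker": "Yes", "total_bill": 20.0, "tip": 3.0},
    {"sex": "Male", "smoker": "No", "total_bill": 15.0, "tip": 2.5},
    {"sex": "Female", "smoker": "Yes", "total_bill": 12.0, "tip": 2.0},
    {"sex": "Female", "smoker": "No", "total_bill": 25.0, "tip": 4.0},
    {"sex": "Male", "smoker": "No", "total_bill": 30.0, "tip": 4.5},
]

groups = facet_groups(records, row="sex", col="smoker")
for key in sorted(groups):
    sex, smoker = key
    print(f"sex={sex}, smoker={smoker}: {len(groups[key])} record(s)")
```

Two levels of sex times two levels of smoker gives a 2 x 2 grid, which matches the four subplots produced by the call above.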
col_wrap: Sets the maximum number of subplots per row, wrapping the col facets onto additional rows; its value is at most the number of distinct categories in the col variable
sns.lmplot(x="total_bill",y="tip",data=data,col="day",col_wrap=4)
sns.lmplot(x="total_bill",y="tip",data=data,col="day",col_wrap=2)
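The effect of col_wrap on the layout is simple arithmetic. The sketch below uses a hypothetical helper, wrapped_grid_shape, to work out how many rows and columns you end up with; it is a simplification for illustration and not seaborn's own layout code.

```python
import math


def wrapped_grid_shape(n_panels, col_wrap):
    """Return (n_rows, n_cols) when n_panels subplots wrap at col_wrap columns."""
    n_cols = min(col_wrap, n_panels)
    n_rows = math.ceil(n_panels / n_cols)
    return n_rows, n_cols


# The "day" column of the tips dataset has four categories
print(wrapped_grid_shape(4, col_wrap=4))
print(wrapped_grid_shape(4, col_wrap=2))
```

With four days, col_wrap=4 keeps everything on one row, while col_wrap=2 wraps the subplots into a 2 x 2 grid.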
aspect: Controls the aspect ratio (width divided by height) of each subplot
sns.lmplot(x="total_bill",y="tip",data=data,aspect=1)
# Width to height is 1:1, i.e. the subplot is square
sns.lmplot(x="total_bill",y="tip",data=data,aspect=1.5)
# The width is 1.5 times the height, so the horizontal axis is noticeably longer
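According to the seaborn documentation, the width of each subplot is its height multiplied by aspect, both in inches (the height is the size=5 argument in the signature above, renamed height in newer releases). A tiny sketch of that arithmetic, with facet_size being a made-up helper name:

```python
def facet_size(height, aspect):
    """Return (width, height) of one subplot in inches: width = height * aspect."""
    return height * aspect, height


print(facet_size(5, 1))    # square subplot
print(facet_size(5, 1.5))  # wider than it is tall
```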
sharex: Whether the subplots share the same x-axis scale (default True)
sharey: Whether the subplots share the same y-axis scale (default True)
sns.lmplot(x="total_bill",y="tip",data=data,row="sex",col="smoker",sharex=False)
# With sharex=False, each subplot gets its own x-axis scale
hue: Splits the data by the specified categorical variable, drawing each category in its own color with its own regression line
sns.lmplot(x="total_bill",y="tip",data=data,hue="sex",palette="husl")
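Put differently, hue means one regression per category instead of one regression for everything. Here is a minimal plain-Python sketch of that idea. The helpers fit_line and fit_by_group are hypothetical, and the records are made up so that the two groups have clearly different slopes.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_by_group(records, x, y, hue):
    """Fit a separate line for each distinct value of the hue column."""
    grouped = defaultdict(lambda: ([], []))
    for rec in records:
        xs, ys = grouped[rec[hue]]
        xs.append(rec[x])
        ys.append(rec[y])
    return {level: fit_line(xs, ys) for level, (xs, ys) in grouped.items()}


# Made-up records (NOT the real tips data)
records = [
    {"sex": "Male", "total_bill": 10.0, "tip": 1.0},
    {"sex": "Male", "total_bill": 20.0, "tip": 2.0},
    {"sex": "Male", "total_bill": 30.0, "tip": 3.0},
    {"sex": "Female", "total_bill": 10.0, "tip": 2.0},
    {"sex": "Female", "total_bill": 20.0, "tip": 4.0},
    {"sex": "Female", "total_bill": 30.0, "tip": 6.0},
]

fits = fit_by_group(records, x="total_bill", y="tip", hue="sex")
for level, (slope, intercept) in fits.items():
    print(f"{level}: slope={slope:.2f}, intercept={intercept:.2f}")
```

Unlike row and col, all of these lines end up in the same subplot, which makes it easy to compare the slopes directly.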
ci: Size of the confidence interval drawn around the regression line, as a percentage (those of you who have studied statistics will know this one)
# Show a 95% confidence interval
sns.lmplot(x="total_bill",y="tip",data=data,ci=95)
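The seaborn documentation says this interval is estimated with a bootstrap, which is what the n_boot=1000 argument in the signature controls. The sketch below shows the general idea in plain Python under simplifying assumptions: it bootstraps only the slope (seaborn draws a band around the whole line), the helper names are made up, and the data is synthetic with a true slope of 2.

```python
import random


def fit_slope(xs, ys):
    """Return the ordinary least-squares slope."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def bootstrap_slope_ci(xs, ys, ci=95, n_boot=1000, seed=0):
    """Percentile bootstrap interval for the slope."""
    rng = random.Random(seed)
    n = len(xs)
    slopes = []
    while len(slopes) < n_boot:
        # Resample (x, y) pairs with replacement
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if len(set(bx)) < 2:
            continue  # a resample with one distinct x has no defined slope
        slopes.append(fit_slope(bx, by))
    slopes.sort()
    tail = (100 - ci) / 200  # fraction cut off from each tail
    lo = slopes[int(tail * (n_boot - 1))]
    hi = slopes[int((1 - tail) * (n_boot - 1))]
    return lo, hi


# Synthetic data: y = 2x + 1 plus a little uniform noise
data_rng = random.Random(1)
xs = [float(x) for x in range(50)]
ys = [2 * x + 1 + data_rng.uniform(-0.5, 0.5) for x in xs]

lo, hi = bootstrap_slope_ci(xs, ys, ci=95)
print(f"95% bootstrap interval for the slope: [{lo:.3f}, {hi:.3f}]")
```

A larger ci widens the interval, and more noise in the data does the same, which is why the shaded band in the plot is wider where points are sparse.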
x_jitter: Adds random noise to the x values of the plotted points
y_jitter: Adds random noise to the y values of the plotted points
The noise only changes where the points are drawn; it does not affect the fitted regression line. This is helpful when a variable takes only a few discrete values, as size does here.
sns.lmplot(x="size",y="tip",data=data,x_jitter=False)
sns.lmplot(x="size",y="tip",data=data,x_jitter=True)
# The points that were stacked in vertical columns are now scattered randomly,
# but the regression line is unchanged
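The reason the line does not move is the order of operations: per the seaborn documentation, the noise is added to a copy of the data after the regression has been fit. A minimal sketch of that ordering, with made-up helper names and made-up party sizes and tips:

```python
import random


def fit_slope(xs, ys):
    """Return the ordinary least-squares slope."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def jitter(values, size, seed=0):
    """Return a copy of values with uniform noise in [-size, size] added."""
    rng = random.Random(seed)
    return [v + rng.uniform(-size, size) for v in values]


# Made-up party sizes and tips (NOT the real tips data)
sizes = [1, 1, 2, 2, 2, 3, 4, 4]
tips = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 5.0]

slope = fit_slope(sizes, tips)        # step 1: fit on the original values
plotted_x = jitter(sizes, size=0.3)   # step 2: jitter only the drawn positions

print(f"slope from original data: {slope:.3f}")
print("x positions used for drawing:", [round(x, 2) for x in plotted_x])
```

The original sizes list is never modified, so refitting on it gives exactly the same slope no matter how much jitter is applied to the drawn points.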
order: Degree of the polynomial to fit (a value greater than 1 gives polynomial regression)
sns.lmplot(x="total_bill",y="tip",data=data,order=1) # Simple linear regression
sns.lmplot(x="total_bill",y="tip",data=data,order=2) # Quadratic fit: the highest power is 2
sns.lmplot(x="total_bill",y="tip",data=data,order=3) # Cubic fit: the highest power is 3
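To show what changing the order means numerically, here is a small plain-Python sketch of a least-squares polynomial fit that solves the normal equations. The function name fit_polynomial and the data are made up for illustration, and this is not how seaborn implements it internally; in practice you would reach for a numerical library instead.

```python
def fit_polynomial(xs, ys, order):
    """Least-squares coefficients [c0, c1, ..., c_order], lowest power first."""
    m = order + 1
    # Normal equations (V^T V) c = V^T y, where V[i][j] = xs[i] ** j
    a = [[sum(x ** (i + j) for x in xs) for j in range(m)] for i in range(m)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(m)]
    # Gaussian elimination with partial pivoting
    for k in range(m):
        pivot = max(range(k, m), key=lambda r: abs(a[r][k]))
        a[k], a[pivot] = a[pivot], a[k]
        b[k], b[pivot] = b[pivot], b[k]
        for r in range(k + 1, m):
            factor = a[r][k] / a[k][k]
            for c in range(k, m):
                a[r][c] -= factor * a[k][c]
            b[r] -= factor * b[k]
    # Back substitution
    coeffs = [0.0] * m
    for k in range(m - 1, -1, -1):
        tail = sum(a[k][c] * coeffs[c] for c in range(k + 1, m))
        coeffs[k] = (b[k] - tail) / a[k][k]
    return coeffs


# Made-up data that follows y = 1 + 2x + 3x^2 exactly
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1 + 2 * x + 3 * x * x for x in xs]

linear = fit_polynomial(xs, ys, order=1)     # a straight line cannot follow the curve
quadratic = fit_polynomial(xs, ys, order=2)  # recovers the true coefficients
print("order=1:", [round(c, 3) for c in linear])
print("order=2:", [round(c, 3) for c in quadratic])
```

A higher order always fits the plotted points at least as closely, but that is not the same as a better model, so it is worth comparing order=1, 2 and 3 visually as in the calls above.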
Some other parameters require more advanced statistical knowledge and are not covered here. If you are interested, see the official documentation.