This article uses the scatter function from the matplotlib.pyplot module to draw scatter plots. The code below imports the required libraries and uses the make_blobs function to generate data; the first five rows of the resulting DataFrame df are shown afterwards.
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.datasets import make_blobs
import numpy as np
data=make_blobs(n_samples=150,n_features=4,centers=3,random_state=20220203)
df=pd.DataFrame(data[0],columns=['v1','v2','v3','v4'])
df['target']=data[1]
df.head()
  | v1 | v2 | v3 | v4 | target |
---|---|---|---|---|---|
0 | 1.046182 | 1.747420 | 1.680394 | 6.627963 | 2 |
1 | 3.654120 | 2.169111 | 1.494711 | 4.880651 | 2 |
2 | 2.227285 | 0.497507 | 2.115962 | 8.216452 | 2 |
3 | 3.995025 | 1.423546 | 0.993872 | 5.559633 | 2 |
4 | 1.994595 | 1.276570 | 0.905848 | 7.997673 | 2 |
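To make the structure of the generated data explicit, the following sketch unpacks the return value of make_blobs before building the DataFrame. The names X and y are illustrative and do not appear in the original code, which keeps both parts in a single tuple called data.

```python
from sklearn.datasets import make_blobs
import pandas as pd

# make_blobs returns a (features, labels) tuple; X and y are illustrative names.
X, y = make_blobs(n_samples=150, n_features=4, centers=3, random_state=20220203)

df = pd.DataFrame(X, columns=['v1', 'v2', 'v3', 'v4'])
df['target'] = y  # integer cluster label for each row

print(X.shape)                         # (150, 4)
print(sorted(df['target'].unique()))   # the cluster labels present in the data
```

Here data[0] in the original code corresponds to X and data[1] to y.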
First, we explain each parameter so that the plots can be fine-tuned later.
plt.scatter() parameters:
x, y: The input data sequences to plot
s: The marker size; can be a scalar or a vector
c: A color sequence or a single color, in one of four forms:
1. A scalar or a sequence of the same length as x, mapped to colors through the cmap and norm parameters
2. A 2D array of RGB or RGBA values
3. A sequence of colors of the same length as x
4. A string naming a single color
marker: The marker shape; defaults to 'o'
cmap: A colormap name or Colormap instance; used only when c is a sequence of floating-point numbers
norm: Normalizes the floating-point values in c before they are mapped to colors; if omitted, the default linear normalization is used
vmin, vmax: Floating-point bounds used to normalize the color values when no norm is given
alpha: The color transparency
linewidths: The width of the marker edge; a float or a vector
edgecolors: The edge color of the markers; defaults to the face color. It can also be 'none' or a sequence of colors.
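The four forms accepted by c can be compared side by side. The sketch below is only an illustration: it uses a handful of made-up points rather than the article's df, and the variable names are arbitrary.

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up data for illustration, not the article's df.
rng = np.random.default_rng(0)
x, y = rng.random(5), rng.random(5)
values = np.linspace(0.0, 1.0, 5)

fig, axs = plt.subplots(1, 4, figsize=(16, 4))

# 1. Numbers mapped to colors through cmap (and norm)
mapped = axs[0].scatter(x, y, c=values, cmap='viridis')

# 2. A 2D array with one RGB (or RGBA) row per point
rgb = np.column_stack([values, np.zeros(5), 1.0 - values])
by_rgb = axs[1].scatter(x, y, c=rgb)

# 3. A sequence of color names, one per point
by_name = axs[2].scatter(x, y, c=['red', 'green', 'blue', 'orange', 'purple'])

# 4. A single color string applied to every point
single = axs[3].scatter(x, y, c='red')
```

Calling plt.show() afterwards displays the four panels next to each other.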
1.1 Basic plotting
The first example is a basic scatter plot of v1 against v2 with default settings.
plt.scatter(x=df['v1'],y=df['v2'])
plt.xlabel('v1')
plt.ylabel('v2')
plt.show()
1.1.1 Setting point size with a scalar
Setting s to a scalar makes all points the same size.
plt.scatter(x=df['v1'],y=df['v2'],s=100)
plt.xlabel('v1')
plt.ylabel('v2')
plt.show()
1.1.2 Setting point size with a vector
Passing a vector for s lets the plot carry three dimensions of information: horizontal position, vertical position, and point size.
plt.scatter(x=df['v1'],y=df['v2'],s=df['target'] *100+50)
plt.xlabel('v1')
plt.ylabel('v2')
plt.show()
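Matplotlib interprets s as the marker area in points squared, so it often helps to rescale a variable into a fixed size range before passing it in. The sketch below does this with a hypothetical helper, scale_sizes, which is not part of Matplotlib or of the original article; the data is made up.

```python
import numpy as np
import matplotlib.pyplot as plt

def scale_sizes(values, smin=50.0, smax=400.0):
    """Linearly rescale values into [smin, smax] (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    span = values.max() - values.min()
    if span == 0:
        # All values equal: use the smallest size for every point.
        return np.full(values.shape, smin)
    return smin + (values - values.min()) / span * (smax - smin)

rng = np.random.default_rng(1)
x, y = rng.normal(size=30), rng.normal(size=30)
weight = rng.uniform(-5, 5, size=30)  # made-up third variable; may be negative

fig, ax = plt.subplots()
sc = ax.scatter(x, y, s=scale_sizes(weight), alpha=0.6, edgecolors='black')
ax.set_xlabel('x')
ax.set_ylabel('y')
```

Rescaling this way keeps every marker visible even when the underlying variable contains zeros or negative numbers.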
1.1.3 Setting the point shape
First, we pick several common marker types and draw each one in its own subplot:
markers=['.',',','o','v','^','<','>','1','8','s','p','P','*','H','h','+','x','X','D','d'] # Commonly used scatter shapes
fig,axs=plt.subplots(4,5,figsize=(25,20))
for j in range(4):
    for i in range(1,6):
        # Draw one large marker per subplot and label it with its marker code
        axs[j][i-1].scatter(1,1,marker=markers[i-1+j*5],s=1000)
        axs[j][i-1].text(x=1.01,y=1.01,s=str(markers[i-1+j*5]))
plt.show()
Here is an example of changing the marker shape, using the data from this article:
plt.scatter(x=df['v1'],y=df['v2'],s=100,marker='*')
plt.xlabel('v1')
plt.ylabel('v2')
plt.show()
1.1.4 Setting a single color
We make all points red by passing c='red'.
plt.scatter(x=df['v1'],y=df['v2'],s=100,c='red')
plt.xlabel('v1')
plt.ylabel('v2')
plt.show()
1.1.5 Setting transparency
The alpha parameter sets the transparency (alpha=0.4 here). With transparency enabled, it is easier to see where data points overlap.
plt.scatter(x=df['v1'],y=df['v2'],s=100,c='red',alpha=0.4)
plt.xlabel('v1')
plt.ylabel('v2')
plt.show()
1.1.6 Adding an edge color
The edge color is added here purely for appearance.
plt.scatter(x=df['v1'],y=df['v2'],c='red',s=100,alpha=0.6,edgecolor='black')
plt.xlabel('v1')
plt.ylabel('v2')
plt.show()
1.1.7 Setting the edge width
The edge width is likewise set for appearance.
plt.scatter(x=df['v1'],y=df['v2'],c='red',s=100,alpha=0.6,linewidths=2,edgecolor='black')
plt.xlabel('v1')
plt.ylabel('v2')
plt.show()
1.2 Advanced plotting
1.2.1 Using a color vector to represent classes
plt.scatter(x=df['v1'],y=df['v2'],c=df['target'],s=100,alpha=0.6,
linewidths=2,edgecolor='black',cmap='Set1')
plt.colorbar()
plt.xlabel('v1')
plt.ylabel('v2')
plt.show()
1.2.2 Use continuous color vectors
plt.scatter(x=df['v1'],y=df['v2'],c=df['v4'],s=100,alpha=0.7,
linewidths=2,edgecolor='black',cmap='Set1',vmin=min(df['v4']),vmax=max(df['v4']))
plt.colorbar()
plt.xlabel('v1')
plt.ylabel('v2')
plt.show()
plt.scatter(x=df['v1'],y=df['v2'],c=df['v4'],s=100,alpha=0.7,
linewidths=2,edgecolor='black',cmap='hsv',vmin=min(df['v4']),vmax=max(df['v4']))
plt.xlabel('v1')
plt.ylabel('v2')
plt.colorbar()
plt.show()
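How vmin and vmax act on the color values can be seen by applying the normalization directly. This sketch assumes the default linear norm, matplotlib.colors.Normalize, and uses made-up values rather than df['v4'].

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize

values = np.array([2.0, 4.0, 6.0, 10.0])  # made-up color values

# With vmin/vmax set to the data range, values map linearly onto [0, 1].
norm = Normalize(vmin=values.min(), vmax=values.max())
print(norm(values))  # fraction of the way from vmin to vmax for each value

# The colormap then turns each fraction into an RGBA color.
cmap = plt.get_cmap('hsv')
print(cmap(norm(values)).shape)  # one RGBA row per value

# Passing norm= to scatter is the explicit form of passing vmin= and vmax=.
fig, ax = plt.subplots()
sc = ax.scatter(values, values, c=values, cmap=cmap, norm=norm, s=100)
fig.colorbar(sc, ax=ax)
```

Narrowing vmin and vmax below the data range would clip values outside it to the ends of the colormap.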
1.2.3 Bubble chart
plt.scatter(x=df['v1'],y=df['v2'],c=df['target'],s=abs(df['v4']) *80,alpha=0.6,
linewidths=2,edgecolor='black',cmap='Set1')
plt.xlabel('v1')
plt.ylabel('v2')
plt.show()
1.2.4 Legend and labels: method 1
import matplotlib.patches as mpatches
plt.rcParams['font.sans-serif'] = ['fangsong'] # Replace the sans-serif font
plt.rcParams['font.size'] = 12 # Set the font size
plt.rcParams['axes.unicode_minus'] = False # Render minus signs correctly with the replaced font
plt.rcParams["axes.facecolor"] = "cornsilk" # Set the axes background color
plt.figure(figsize=(10,6))
plt.scatter(x=df['v1'],y=df['v2'],c=df['target'],s=abs(df['v4']) *80,alpha=0.6,
linewidths=2,edgecolor='black',cmap='Set1')
plt.xlabel('horizontal')
plt.ylabel('vertical')
plt.title('Bubble chart of fictitious data')
cmap1=plt.get_cmap('Set1')
colors=cmap1(np.arange(0,9,4)) # Set1 entries 0, 4 and 8, matching classes 0, 1 and 2
labels=[mpatches.Patch(color=colors[i],label=i)for i in range(len(colors))]
plt.legend(handles=labels,loc='lower right')
plt.grid()
plt.show()
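The indices 0, 4 and 8 passed to the colormap in the legend code are not arbitrary. Set1 has nine colors, and with the default linear normalization the three class labels 0, 1 and 2 are scaled to 0.0, 0.5 and 1.0, which land on entries 0, 4 and 8. The following sketch checks this, assuming the labels are exactly 0, 1 and 2 as in the generated data.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize

cmap1 = plt.get_cmap('Set1')
classes = np.array([0, 1, 2])  # class labels, as in df['target']

# Colors scatter assigns: labels are normalized to [0, 1], then looked up.
norm = Normalize(vmin=classes.min(), vmax=classes.max())
via_norm = cmap1(norm(classes))

# Colors picked for the legend: integer input indexes the color list directly.
via_index = cmap1(np.arange(0, 9, 4))

print(cmap1.N)                           # number of colors in Set1
print(np.allclose(via_norm, via_index))  # whether both routes give the same colors
```

With a different number of classes or a different colormap, the indices would need to be recomputed in the same way.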
1.2.5 Legend and labels: method 2
plt.figure(figsize=(10,6))
colors=['red','green','blue']
# Draw each class separately so that every call contributes its own legend entry
for class1 in df['target'].value_counts().index:
    x=df[df['target']==class1]['v1']
    y=df[df['target']==class1]['v2']
    plt.scatter(x,y,c=colors[class1],s=100,alpha=0.6,
                linewidths=2,edgecolor='black',label=class1)
plt.grid()
plt.xlabel('horizontal')
plt.ylabel('vertical')
plt.title('Bubble chart of fictitious data')
plt.legend()
plt.show()
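As a further alternative that is not in the original article, Matplotlib 3.1 and later let the object returned by scatter build the legend entries itself through legend_elements(). A minimal sketch, using made-up data with three classes:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(2)
target = np.repeat([0, 1, 2], 20)             # made-up class labels
x = rng.normal(loc=target * 3.0, scale=1.0)   # shift each class along x
y = rng.normal(size=target.size)

fig, ax = plt.subplots(figsize=(10, 6))
sc = ax.scatter(x, y, c=target, cmap='Set1', s=100, alpha=0.6,
                linewidths=2, edgecolors='black')

# One handle and one label per distinct value in c
handles, labels = sc.legend_elements()
ax.legend(handles, labels, title='target', loc='lower right')
ax.grid()
```

This keeps a single scatter call, as in method 1, while avoiding the manual color lookup.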