Matplotlib library

Matplotlib is a plotting library for Python. It provides a full set of MATLAB-like command APIs for generating publication-quality graphics. Matplotlib makes drawing very simple, striking an excellent balance between ease of use and performance.

Line chart

Drawing a curve

As the "Hello World" of plotting, we start by drawing a simple curve, which also gives a brief introduction to how Matplotlib works.

# plot_1.py
import matplotlib.pyplot as plt
x = range(50)
y = [value * 2 for value in x]
plt.plot(x, y)
plt.show()

The above code draws the line y = 2x, where x ranges over the integers 0 to 49, as follows:

The window also contains several toolbar buttons at the top, including:

Button: Function
Save: Saves the drawing as an image in the required format, including PNG, JPG, PDF, SVG and other common formats.
Configure subplots: Adjusts the size, margins and other properties of the image.
Zoom: Zooms the picture in and out to inspect its details. After clicking this button, drag with the left mouse button to zoom in and drag with the right mouse button to zoom out.
Pan: Moves the graph and can be combined with the Zoom button to inspect details of the enlarged image. After clicking this button, you can also drag with the right mouse button to scale the coordinate axes.
Home: Restores the graph to its initial state, undoing any zooming and moving.
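
What the Save button does can also be done from code with plt.savefig(). A minimal sketch, assuming the non-interactive Agg backend and a temporary output directory (both are illustrative choices, not part of the original example):

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen, no window is opened
import matplotlib.pyplot as plt

x = range(50)
y = [value * 2 for value in x]
plt.plot(x, y)

# The file extension selects the output format (png, pdf, svg, ...).
out_path = os.path.join(tempfile.mkdtemp(), "plot_1.png")
plt.savefig(out_path)
plt.close()
```

In an interactive session the backend line can be dropped and plt.show() called after plt.savefig().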

plt.plot(x, y) draws a curve whose points have their x coordinates given in the list x and their y coordinates given in the list y.
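
To make the pairing of the two lists concrete, the following sketch inspects the object returned by plt.plot(). The sample values and the Agg backend are assumptions chosen only so the snippet runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: off-screen backend
import matplotlib.pyplot as plt

x = [0, 1, 2, 3]
y = [0, 2, 4, 6]

lines = plt.plot(x, y)   # returns a list containing one Line2D object
line = lines[0]

# The i-th point of the curve is (x[i], y[i]).
points = list(zip(line.get_xdata(), line.get_ydata()))
plt.close()
```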

Matplotlib focuses only on drawing, so reading input from a file or doing intermediate calculations requires other Python modules. Matplotlib works well alongside them and needs no special tricks. For example, generating a large number of statistical graphs might call for a scientific computing package such as NumPy together with Python's file I/O. Examples are provided in the following sections.

Drawing a curve with the NumPy library

Draw the curve cos(x), with x in the interval [0, 2π]:

# cos_1.py
import math
import matplotlib.pyplot as plt
scale = range(100)
x = [(2 * math.pi * i) / len(scale) for i in scale]
y = [math.cos(i) for i in x]
plt.plot(x, y)
plt.show()

With the NumPy library, the following equivalent code can be used:

# cos_2.py
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(0, 2 * np.pi, 100)  # 100 evenly spaced points on [0, 2*pi]
y = np.cos(x)
plt.plot(x, y)
plt.show()

The graph is as follows:

Tips: NumPy is not required for visualization, but as the example shows, it lets the same plot be written more concisely and computed more efficiently.
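
One detail worth checking when moving from lists to NumPy: np.linspace() includes the end of the interval by default, whereas the list comprehension in cos_1.py stops one step before 2π. The sketch below (variable names are illustrative) shows that passing endpoint=False reproduces the list version:

```python
import math

import numpy as np

scale = range(100)
x_list = [(2 * math.pi * i) / len(scale) for i in scale]
y_list = [math.cos(i) for i in x_list]

# endpoint=False stops one step short of 2*pi, like the list comprehension.
x_arr = np.linspace(0, 2 * np.pi, 100, endpoint=False)
y_arr = np.cos(x_arr)  # one vectorized call instead of a Python-level loop

same = np.allclose(x_list, x_arr) and np.allclose(y_list, y_arr)
```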

NumPy can operate on an entire array at once, making the code more efficient. For example, draw the curve y = x^3 + 5x - 10 on [-10, 10]:

# plot_np.py
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(-10, 10, 800)
y = x ** 3 + 5 * x - 10
plt.plot(x, y)
plt.show()

The resulting graph is as follows:

Drawing multiple curves

We often need to compare several sets of data to find similarities and differences between them. In this case, we draw multiple curves in the same figure. The following example plots the functions y = x, y = x^2, y = log_e(x), and y = sin(x) in the same figure:

# plot_multi_curve.py
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(0.1, 2 * np.pi, 100)  # start above 0 because log(0) is undefined
y_1 = x
y_2 = np.square(x)
y_3 = np.log(x)
y_4 = np.sin(x)
plt.plot(x, y_1)
plt.plot(x, y_2)
plt.plot(x, y_3)
plt.plot(x, y_4)
plt.show()

The above script draws the following graph:

Tips: plt.plot() is called once for each curve, whereas plt.show() is called only once. This deferred rendering mechanism is at the heart of Matplotlib: we can declare what to draw at any point, but the figure is only rendered and displayed when plt.show() is called.

To better illustrate this delayed rendering mechanism, write the following code:

# deferred_rendering.py
import numpy as np
import matplotlib.pyplot as plt
def plot_func(x, y):
    x_s = x[1:] - y[:-1]
    y_s = y[1:] - x[:-1]
    # Second plt.plot() call, made from inside a function
    plt.plot(x[1:], x_s / y_s)
x = np.linspace(-5, 5, 200)
y = np.exp(-x ** 2)
plt.plot(x, y)
plot_func(x, y)
plt.show()

The resulting graph is as follows:

As you can see, although one of the plt.plot() calls is inside the plot_func function, this makes no difference to the rendered graph, because plt.plot() only declares what we want to render and does not perform the rendering itself. This feature can therefore be combined with constructs such as for loops and conditionals to build complex graphs, and to combine different types of statistical graphs in the same figure.
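
As a rough illustration of combining deferred rendering with a loop and a conditional, the sketch below declares several curves before anything is rendered. The choice of functions, the odd-power rule, and the Agg backend are assumptions made for the example:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: off-screen backend so no display is needed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-1, 1, 100)
for power in range(1, 6):
    if power % 2 == 1:            # illustrative rule: keep only odd powers
        plt.plot(x, x ** power)   # declares a curve, renders nothing yet

# The curves for powers 1, 3 and 5 are now waiting on the current axes.
n_curves = len(plt.gca().get_lines())
# Calling plt.show() here would render all of them in a single figure.
plt.close()
```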

Read data files and draw graphs

In many cases, data is stored in a file, so you need to read the file first and then plot its contents. A .txt file can be read with Python's built-in file functions, while libraries such as pandas and NumPy can be used for other files. Suppose there is a data.txt file as follows:

0 1
1 2
2 5
4 17
5 26
6 37

The code for reading the data and drawing is as follows:

# read_txt.py
import matplotlib.pyplot as plt
x, y = [], []
# The with statement closes the file once all lines have been read
with open('data.txt', 'r') as f:
    for line in f:
        values = [float(s) for s in line.split()]
        x.append(values[0])
        y.append(values[1])
plt.plot(x, y)
plt.show()

If the NumPy library is used, the equivalent code can be written as follows:

import matplotlib.pyplot as plt
import numpy as np
data = np.loadtxt('data.txt')
plt.plot(data[:,0], data[:,1])
plt.show()
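
As a variation on the NumPy version, np.loadtxt() can also return each column as a separate array through its unpack parameter. The sketch below recreates the sample file in a temporary directory (an assumption, so that the snippet is self-contained) and reads it back:

```python
import os
import tempfile

import numpy as np

# Recreate the sample file in a temporary directory (illustrative path).
path = os.path.join(tempfile.mkdtemp(), "data.txt")
with open(path, "w") as f:
    f.write("0 1\n1 2\n2 5\n4 17\n5 26\n6 37\n")

# unpack=True transposes the result, so each column becomes its own array.
x, y = np.loadtxt(path, unpack=True)
```

The two arrays can then be passed straight to plt.plot(x, y).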

Scatter plot

When plotting a curve, we assume that the points follow one another in sequence. A scatter plot simply draws the points, with no lines connecting them.

import numpy as np
import matplotlib.pyplot as plt
data = np.random.rand(1000, 2)  # 1000 random (x, y) points in the unit square
plt.scatter(data[:,0], data[:,1])
plt.show()

Tips: The function plt.scatter() is called in exactly the same way as plt.plot(), taking the x and y coordinates of the points as input parameters.
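
The difference from a curve can be checked on the axes themselves: a scatter plot adds a collection of markers but no connecting line. A small sketch, assuming the Agg backend so that it runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: off-screen backend
import matplotlib.pyplot as plt
import numpy as np

data = np.random.rand(1000, 2)
points = plt.scatter(data[:, 0], data[:, 1])

n_lines = len(plt.gca().get_lines())        # scatter adds no Line2D objects
offsets = np.asarray(points.get_offsets())  # one (x, y) row per marker
plt.close()
```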

Bar chart

Bar charts come in many forms. Common types include single-group bar charts, multi-group bar charts, stacked bar charts, and symmetrical bar charts. They are covered in detail in "Python-Matplotlib Drawing Bar Charts".
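
For orientation only, here is a minimal single-group bar chart using plt.bar(). The data values and the Agg backend are assumptions made for this sketch:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: off-screen backend
import matplotlib.pyplot as plt

data = [5, 25, 50, 20]                   # illustrative values
bars = plt.bar(range(len(data)), data)   # one bar per value, at x = 0, 1, 2, 3

heights = [bar.get_height() for bar in bars]
plt.close()
```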

Pie chart

Pie charts can be used to compare relative quantities:

import matplotlib.pyplot as plt
data = [10, 15, 30, 20]
plt.pie(data)
plt.show()

Tips: The plt.pie() function takes a sequence of values as input; Matplotlib automatically calculates the relative area of each value in the pie chart and draws it.
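
The proportions can be verified by hand: each wedge should span 360 × value / total degrees. The sketch below compares that with the wedges Matplotlib creates. It assumes the Agg backend and that the first item returned by plt.pie() is the sequence of wedge patches:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: off-screen backend
import matplotlib.pyplot as plt

data = [10, 15, 30, 20]
wedges = plt.pie(data)[0]   # first item returned: the wedge patches

total = sum(data)
expected = [360 * value / total for value in data]     # angle per value, in degrees
actual = [w.theta2 - w.theta1 for w in wedges]         # angle of each drawn wedge
plt.close()
```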

Histogram

A histogram is a graphical representation of a probability distribution. In fact, a histogram is just a special kind of bar chart: we could generate one with Matplotlib's bar chart function plus some statistical operations. However, histograms are so useful that Matplotlib provides a more convenient function:

import numpy as np
import matplotlib.pyplot as plt
x = np.random.randn(1024)
plt.hist(x, bins=20)
plt.show()

The plt.hist() function takes a list of values as input. The range of the values is divided into bins of equal width (10 by default), and a bar chart is generated with one bar per bin, whose height is the number of values falling in that bin. The number of bins is set by the optional parameter bins.
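
To show what plt.hist() saves us, the sketch below builds the same histogram the long way, counting with np.histogram() and drawing with plt.bar(), and then compares the counts with those returned by plt.hist(). The Agg backend is an assumption so that the snippet runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: off-screen backend
import matplotlib.pyplot as plt
import numpy as np

x = np.random.randn(1024)

# Manual version: count the values per bin, then draw one bar per bin.
counts, edges = np.histogram(x, bins=20)
plt.bar(edges[:-1], counts, width=np.diff(edges), align="edge")

# Convenience version: plt.hist() does the counting and the drawing.
plt.figure()
n, bins, patches = plt.hist(x, bins=20)
plt.close("all")
```

With 20 bins there are 21 bin edges, and the bar heights add up to the number of input values.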

Box plot

Box plots make it easy to compare distributions by displaying the median, quartiles, maximum, and minimum of a set of values.

import numpy as np
import matplotlib.pyplot as plt
data = np.random.randn(200)
plt.boxplot(data)
plt.show()

Tips: The plt.boxplot() function takes a set of values and automatically calculates the mean, median, and other statistics.

Box plot description:

  1. The yellow line is the median of the distribution.
  2. The box spans from the lower quartile Q1 to the upper quartile Q3 and so contains 50% of the data.
  3. The lower whisker extends from the lower quartile down to the smallest value within 1.5(Q3-Q1) of it.
  4. The upper whisker extends from the upper quartile up to the largest value within 1.5(Q3-Q1) of it.
  5. Values beyond the whiskers are drawn as circles.
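
The quantities in this list can be computed by hand with NumPy. The sketch below does so and compares the result with matplotlib.cbook.boxplot_stats(), the helper that returns box plot statistics as dictionaries. The variable names are illustrative:

```python
import numpy as np
from matplotlib import cbook

data = np.random.randn(200)

q1, median, q3 = np.percentile(data, [25, 50, 75])
iqr = q3 - q1

# Whiskers end at the most extreme values still within 1.5 * IQR of the box.
lower_whisker = data[data >= q1 - 1.5 * iqr].min()
upper_whisker = data[data <= q3 + 1.5 * iqr].max()

# Everything beyond the whiskers is drawn as an individual point.
outliers = data[(data < lower_whisker) | (data > upper_whisker)]

stats = cbook.boxplot_stats(data)[0]  # statistics for the single data set
```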

To draw multiple box plots in a single figure, calling plt.boxplot() once per box plot does not work: it would draw all the boxes on top of each other in a jumbled, unreadable shape. To achieve the desired effect, draw all the box plots in a single call to plt.boxplot(), as shown below:

import numpy as np
import matplotlib.pyplot as plt
data = np.random.randn(200, 6)  # 6 columns, one box plot per column
plt.boxplot(data)
plt.show()

Triangular mesh

Mesh plots arise when working with spatial locations. Besides showing distances and neighborhood relationships between points, triangular mesh plots are also a convenient way to represent maps.

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.tri as tri
data = np.random.rand(200, 2)  # 200 random (x, y) points
triangles = tri.Triangulation(data[:,0], data[:,1])
plt.triplot(triangles)
plt.show()

Tips: The code imports the matplotlib.tri module, which provides helper functions for computing a triangular mesh from a set of points.
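
To see what the triangulation actually contains, the sketch below inspects the Triangulation object rather than plotting it. The variable names are illustrative; when no triangles are supplied, the mesh is computed from the points automatically:

```python
import matplotlib.tri as tri
import numpy as np

data = np.random.rand(200, 2)
triangulation = tri.Triangulation(data[:, 0], data[:, 1])

# Each row holds the indices of the three points that form one triangle.
corners = triangulation.triangles
# Each row holds the indices of two points joined by an edge of the mesh.
edges = triangulation.edges
```

The index arrays refer back to rows of data, so data[corners[0]] gives the coordinates of the first triangle's three corners.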