Compiled from the NumPy documentation by Heart of the Machine.

NumPy is a scientific computing library that gives Python high-performance vectors, matrices, and higher-dimensional data structures. It is implemented in C and Fortran, so building expressions from vectors and matrices and evaluating them numerically is fast. NumPy underlies essentially every Python framework and package for numerical computation, including TensorFlow and PyTorch, and learning to express computations in NumPy is the most basic step in building a machine learning model.


The basics

NumPy's main object is the homogeneous multidimensional array: a table of elements of the same type (usually numbers), indexed by a tuple of non-negative integers. In NumPy, dimensions are also called axes.

For example, the coordinates of a point in 3D space, [1, 2, 1], form an array with one axis. That axis has three elements, so we say it has a length of 3. The following array has two axes: the first has a length of 2 and the second a length of 3.

[[1., 0., 0.],
 [0., 1., 2.]]

NumPy's array class is called ndarray, and it is also known by the alias array. Note that numpy.array is not the same as the array.array class in the standard Python library, which handles only one-dimensional arrays and offers less functionality. The most important attributes of an ndarray are:

  • ndarray.ndim: the number of axes (dimensions) of the array.
  • ndarray.shape: the size of the array in each dimension. For a matrix with n rows and m columns, the shape is (n, m).
>>> b = np.array([[1, 2, 3], [4, 5, 6]])
>>> b.shape
(2, 3)

  • ndarray.size: the total number of elements in the array, equal to the product of the elements of shape. For example, the total number of elements in a matrix is the number of rows times the number of columns.
>>> b = np.array([[1, 2, 3], [4, 5, 6]])
>>> b.size
6

  • ndarray.dtype: an object describing the type of the elements in the array. A dtype can be created or specified using standard Python types. NumPy also provides types of its own, such as numpy.int32, numpy.int16, and numpy.float64, where "int" and "float" indicate whether the data is an integer or a floating-point number, and "32", "16", and "64" give the number of bits used to store each element.
  • ndarray.itemsize: the size in bytes of each element of the array. For example, an array of elements of type float64 has an itemsize of 8 (= 64/8).
>>> import numpy as np
>>> a = np.arange(15).reshape(3, 5)
>>> a
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])
>>> a.shape
(3, 5)
>>> a.ndim
2
>>> a.dtype.name
'int64'
>>> a.itemsize
8
>>> a.size
15
>>> type(a)
<class 'numpy.ndarray'>
>>> b = np.array([6, 7, 8])
>>> b
array([6, 7, 8])
>>> type(b)
<class 'numpy.ndarray'>
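
The attributes listed above are related to one another. The following is a minimal sketch of those relations, assuming an explicit int64 dtype so that the numbers do not depend on the platform; the variable names are only illustrative.

```python
import numpy as np

# A 3x5 array with an explicit dtype, so the result is platform independent.
a = np.arange(15, dtype=np.int64).reshape(3, 5)

# ndim is the number of axes, i.e. the length of the shape tuple.
n_axes = len(a.shape)

# size is the product of the entries of shape.
n_elements = a.shape[0] * a.shape[1]

# itemsize is the element width in bytes: 64 bits / 8 = 8 bytes.
# nbytes is the total size of the data: size * itemsize.
total_bytes = a.size * a.itemsize

print(n_axes == a.ndim, n_elements == a.size, total_bytes == a.nbytes)
```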


Create an array

NumPy has several ways to create arrays. For example, you can create an array from a regular Python list or tuple using the array function; the type of the resulting array is deduced from the types of the elements in the sequence.

>>> import numpy as np
>>> a = np.array([2, 3, 4])
>>> a
array([2, 3, 4])
>>> a.dtype
dtype('int64')
>>> b = np.array([1.2, 3.5, 5.1])
>>> b.dtype
dtype('float64')

A frequent error is calling array with multiple numeric arguments, rather than passing a single list of numbers (written with "[]") as the argument.

>>> a = np.array(1, 2, 3, 4)    # WRONG
>>> a = np.array([1, 2, 3, 4])  # RIGHT

array transforms sequences of sequences into two-dimensional arrays, sequences of sequences of sequences into three-dimensional arrays, and so on.

>>> b = np.array([(1.5,2,3), (4,5,6)])
>>> b
array([[ 1.5,  2. ,  3. ],
       [ 4. ,  5. ,  6. ]])

The type of the array can also be specified at creation time:

>>> c = np.array([[1, 2], [3, 4]], dtype=complex)
>>> c
array([[1.+0.j, 2.+0.j],
       [3.+0.j, 4.+0.j]])

Often the elements of an array are not known in advance, but its size is. NumPy therefore provides functions that create arrays with initial placeholder content, which minimizes the need to grow arrays, an expensive operation.

The function zeros creates an array full of zeros, ones creates an array full of ones, and empty creates an array whose initial content is arbitrary and depends on the state of memory. By default, the dtype of the created array is float64.

>>> np.zeros((3, 4))
array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])
>>> np.ones((2, 3, 4), dtype=np.int16)    # dtype can also be specified
array([[[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]],

       [[1, 1, 1, 1],
        [1, 1, 1, 1],
        [1, 1, 1, 1]]], dtype=int16)
>>> np.empty((2, 3))                      # uninitialized, output may vary
array([[3.73603959e-262, 6.02658058e-154, 6.55490914e-260],
       [5.30498948e-313, 3.14673309e-307, 1.00000000e+000]])
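
As a small sketch of why placeholder arrays are useful, the code below allocates the result once and fills it in place instead of growing an array step by step. It also uses zeros_like, one of the "_like" helpers, which copies the shape and dtype of an existing array. The names squares and template are hypothetical.

```python
import numpy as np

n = 5

# Allocate the result once, then fill it in place.
squares = np.zeros(n, dtype=np.int64)
for k in range(n):
    squares[k] = k * k

# zeros_like takes its shape and dtype from an existing array.
template = np.ones((2, 3), dtype=np.int16)
zeros_copy = np.zeros_like(template)

print(squares)
print(zeros_copy.shape, zeros_copy.dtype)
```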

To create sequences of numbers, NumPy provides the arange function, which is analogous to Python's built-in range but returns an array.

>>> np.arange(10, 30, 5)
array([10, 15, 20, 25])
>>> np.arange(0, 2, 0.3)    # it accepts float arguments
array([0. , 0.3, 0.6, 0.9, 1.2, 1.5, 1.8])

When arange is used with floating-point arguments, it is generally not possible to predict the number of elements obtained, because of finite floating-point precision. In that case it is better to use the linspace function, which takes the number of elements we want as an argument instead of the step.

>>> from numpy import pi
>>> np.linspace(0, 2, 9)                 # 9 numbers from 0 to 2
array([0.  , 0.25, 0.5 , 0.75, 1.  , 1.25, 1.5 , 1.75, 2.  ])
>>> x = np.linspace(0, 2 * pi, 100)      # useful to evaluate function at lots of points
>>> f = np.sin(x)
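
The difference between the two functions can be sketched as follows. With arange the element count follows from the step, so it is only known after the division has been rounded, whereas linspace is given the count directly and includes both endpoints by default. The variable names are illustrative.

```python
import numpy as np

# With a float step, the number of elements follows from (stop - start) / step,
# which is subject to floating-point rounding.
stepped = np.arange(0.0, 1.0, 0.1)

# linspace takes the number of elements and includes both endpoints by default.
spaced = np.linspace(0.0, 1.0, 11)

print(len(stepped), len(spaced))
print(spaced[0], spaced[-1])
```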

See also: array, zeros, zeros_like, ones, ones_like, empty, empty_like, arange, linspace, numpy.random.rand, numpy.random.randn, fromfunction, fromfile. These functions also create arrays and are worth exploring when you have time.


Printing arrays

When you print an array, NumPy displays it much like a nested list, with the following layout:

  • The last axis is printed from left to right
  • The penultimate axis prints from top to bottom
  • The remaining axes are printed from top to bottom, and each block is separated by a blank line

As shown below, one-dimensional arrays are printed as rows, two-dimensional arrays as matrices, and three-dimensional arrays as lists of matrices.

>>> a = np.arange(6)                         # 1d array
>>> print(a)
[0 1 2 3 4 5]
>>>
>>> b = np.arange(12).reshape(4, 3)          # 2d array
>>> print(b)
[[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]]
>>>
>>> c = np.arange(24).reshape(2, 3, 4)       # 3d array
>>> print(c)
[[[ 0  1  2  3]
  [ 4  5  6  7]
  [ 8  9 10 11]]

 [[12 13 14 15]
  [16 17 18 19]
  [20 21 22 23]]]

The reshape function used above specifies the number of rows and columns of the array and arranges all the elements according to the requested dimensions; see the later section on shape for details. When an array is too large to print, NumPy automatically skips its central part and prints only the corners.

>>> print(np.arange(10000))
[   0    1    2 ... 9997 9998 9999]
>>>
>>> print(np.arange(10000).reshape(100, 100))
[[   0    1    2 ...   97   98   99]
 [ 100  101  102 ...  197  198  199]
 [ 200  201  202 ...  297  298  299]
 ...
 [9700 9701 9702 ... 9797 9798 9799]
 [9800 9801 9802 ... 9897 9898 9899]
 [9900 9901 9902 ... 9997 9998 9999]]

If you want NumPy to print the entire array, you can change the output setting with set_printoptions.

>>> import sys
>>> np.set_printoptions(threshold=sys.maxsize)    # sys module should be imported


Basic operations

Arithmetic operators on arrays apply elementwise, and the result is a new array. Subtraction, addition, squaring, the product of corresponding elements, and logical comparisons are all elementwise operations, as shown below.

>>> a = np.array([20, 30, 40, 50])
>>> b = np.arange(4)
>>> b
array([0, 1, 2, 3])
>>> c = a - b
>>> c
array([20, 29, 38, 47])
>>> b**2
array([0, 1, 4, 9])
>>> 10 * np.sin(a)
array([ 9.12945251, -9.88031624,  7.4511316 , -2.62374854])
>>> a < 35
array([ True,  True, False, False])

Unlike in many scientific computing languages, the product operator * operates elementwise on NumPy arrays. The matrix product can be computed with the dot function or method.

>>> A = np.array([[1, 1],
...               [0, 1]])
>>> B = np.array([[2, 0],
...               [3, 4]])
>>> A * B                       # elementwise product
array([[2, 0],
       [0, 4]])
>>> A.dot(B)                    # matrix product
array([[5, 4],
       [3, 4]])
>>> np.dot(A, B)                # another matrix product
array([[5, 4],
       [3, 4]])
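
In addition to dot, Python 3.5 and later provide the @ operator, which NumPy arrays support for the matrix product. The short sketch below, which reuses the matrices from the example above, contrasts it with the elementwise product.

```python
import numpy as np

A = np.array([[1, 1],
              [0, 1]])
B = np.array([[2, 0],
              [3, 4]])

elementwise = A * B      # multiplies matching entries
matrix_product = A @ B   # row-by-column matrix product, same result as A.dot(B)

print(elementwise)
print(matrix_product)
```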

Some operations, such as += and *=, modify an existing array in place rather than creating a new one.

>>> a = np.ones((2, 3), dtype=int)
>>> b = np.random.random((2, 3))
>>> a *= 3
>>> a
array([[3, 3, 3],
       [3, 3, 3]])
>>> b += a
>>> b
array([[3.417022  , 3.72032449, 3.00011437],
       [3.30233257, 3.14675589, 3.09233859]])
>>> a += b                  # b is not automatically converted to integer type
Traceback (most recent call last):
  ...
TypeError: Cannot cast ufunc add output from dtype('float64') to dtype('int64') with casting rule 'same_kind'

When you operate on arrays of different data types, the type of the resulting array generally corresponds to the more general or precise one (a behavior known as upcasting).

>>> a = np.ones(3, dtype=np.int32)
>>> b = np.linspace(0,pi,3)
>>> b.dtype.name
'float64'
>>> c = a + b
>>> c
array([1.        , 2.57079633, 4.14159265])
>>> c.dtype.name
'float64'
>>> d = np.exp(c * 1j)
>>> d
array([ 0.54030231+0.84147098j, -0.84147098+0.54030231j,
       -0.54030231-0.84147098j])
>>> d.dtype.name
'complex128'
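
The upcasting rule can also be queried without running the operation. The sketch below uses np.result_type to ask which dtype a mixed operation would produce, and astype for an explicit conversion in the other direction; the variable names are illustrative.

```python
import numpy as np

ints = np.ones(3, dtype=np.int32)
floats = np.linspace(0.0, 1.0, 3)        # float64 by default

# result_type reports the dtype NumPy would choose for a mixed operation.
common = np.result_type(ints, floats)

# A mixed sum is upcast to the more general type.
mixed = ints + floats

# Converting back is never implicit; astype truncates toward zero here.
back_to_int = mixed.astype(np.int32)

print(common, mixed.dtype, back_to_int)
```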

Many unary operations, such as computing the sum of all the elements in an array, are implemented as methods of the ndarray class.

>>> a = np.random.random((2, 3))
>>> a
array([[0.18626021, 0.34556073, 0.39676747],
       [0.53881673, 0.41919451, 0.6852195 ]])
>>> a.sum()
2.5718191614547998
>>> a.min()
0.1862602113776709
>>> a.max()
0.6852195003967595

By default, these operations treat the array as a flat sequence of numbers, regardless of its shape. By specifying the axis parameter, however, you can apply an operation along a given axis. With axis=0 the operation runs down each column; for example, b.sum(axis=0) adds up the elements of each column of b, giving one sum per column.

>>> b = np.arange(12).reshape(3, 4)
>>> b
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
>>>
>>> b.sum(axis=0)                            # sum of each column
array([12, 15, 18, 21])
>>>
>>> b.min(axis=1)                            # min of each row
array([0, 4, 8])
>>>
>>> b.cumsum(axis=1)                         # cumulative sum along each row
array([[ 0,  1,  3,  6],
       [ 4,  9, 15, 22],
       [ 8, 17, 27, 38]])
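
One way to read the axis parameter is that it names the axis that disappears from the result. The following sketch shows the resulting shapes, together with the keepdims option, which keeps the reduced axis with length 1.

```python
import numpy as np

b = np.arange(12).reshape(3, 4)

col_sums = b.sum(axis=0)   # collapses axis 0 (rows): one value per column
row_sums = b.sum(axis=1)   # collapses axis 1 (columns): one value per row

# keepdims=True keeps the reduced axis with length 1.
row_sums_2d = b.sum(axis=1, keepdims=True)

print(col_sums.shape, row_sums.shape, row_sums_2d.shape)
```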


Indexing, slicing, and iterating

One-dimensional arrays can be indexed, sliced, and iterated over, much like Python lists and tuples. Note that a[0:6:2] selects every second element from index 0 up to, but not including, index 6.

>>> a = np.arange(10)**3
>>> a
array([  0,   1,   8,  27,  64, 125, 216, 343, 512, 729])
>>> a[2]
8
>>> a[2:5]
array([ 8, 27, 64])
>>> a[:6:2] = -1000    # equivalent to a[0:6:2] = -1000; from start to position 6, exclusive, set every 2nd element to -1000
>>> a
array([-1000,     1, -1000,    27, -1000,   125,   216,   343,   512,   729])
>>> a[ : :-1]                                 # reversed a
array([  729,   512,   343,   216,   125, -1000,    27, -1000,     1, -1000])
>>> for i in a:
...     print(i**(1/3.))
...
nan
1.0
nan
3.0
nan
5.0
6.0
7.0
8.0
9.0

Multidimensional arrays can have one index per axis. These indices are given as a tuple separated by commas:

>>> def f(x, y):
...     return 10 * x + y
...
>>> b = np.fromfunction(f, (5, 4), dtype=int)
>>> b
array([[ 0,  1,  2,  3],
       [10, 11, 12, 13],
       [20, 21, 22, 23],
       [30, 31, 32, 33],
       [40, 41, 42, 43]])
>>> b[2, 3]
23
>>> b[0:5, 1]                       # each row in the second column of b
array([ 1, 11, 21, 31, 41])
>>> b[ : ,1]                        # equivalent to the previous example
array([ 1, 11, 21, 31, 41])
>>> b[1:3, : ]                      # each column in the second and third row of b
array([[10, 11, 12, 13],
       [20, 21, 22, 23]])

When fewer indices are provided than the number of axes, the missing indices are treated as complete slices, taking all elements along those axes.

>>> b[-1]                                  # the last row. Equivalent to b[-1,:]
array([40, 41, 42, 43])

As above, because the second index is omitted, b[i] returns the i-th row. We can also write the omitted dimensions explicitly with ':'; for example, b[i] is equivalent to b[i, :]. In addition, NumPy allows dots (...) to stand for as many colons as are needed to build a complete index tuple.

For example, if x is a 5-dimensional array:

  • x[1, 2, ...] is equivalent to x[1, 2, :, :, :],
  • x[..., 3] is equivalent to x[:, :, :, :, 3], and
  • x[4, ..., 5, :] is equivalent to x[4, :, :, 5, :].
>>> c = np.array([[[  0,   1,   2],      # a 3D array (two stacked 2D arrays)
...                [ 10,  12,  13]],
...               [[100, 101, 102],
...                [110, 112, 113]]])
>>> c.shape
(2, 2, 3)
>>> c[1, ...]                            # same as c[1, :, :] or c[1]
array([[100, 101, 102],
       [110, 112, 113]])
>>> c[...,2]                                   # same as c[:,:,2]
array([[  2,  13],
       [102, 113]])
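
The three equivalences for a 5-dimensional array can be checked directly. This is a small sketch, assuming an arbitrary shape of (5, 6, 7, 8, 9) chosen only so that every index used in the list above is in range.

```python
import numpy as np

# A 5-dimensional array; the shape is an arbitrary choice for illustration.
x = np.arange(5 * 6 * 7 * 8 * 9).reshape(5, 6, 7, 8, 9)

# The dots expand to as many ':' as are needed to complete the index tuple.
same_1 = np.array_equal(x[1, 2, ...], x[1, 2, :, :, :])
same_2 = np.array_equal(x[..., 3], x[:, :, :, :, 3])
same_3 = np.array_equal(x[4, ..., 5, :], x[4, :, :, 5, :])

print(same_1, same_2, same_3)
print(x[1, 2, ...].shape, x[..., 3].shape, x[4, ..., 5, :].shape)
```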

Iterating over a multidimensional array is done with respect to the first axis; each pass of the loop below yields one b[i]:

>>> for row in b:
...     print(row)
...
[0 1 2 3]
[10 11 12 13]
[20 21 22 23]
[30 31 32 33]
[40 41 42 43]

However, to operate on each element of the array, you can use the flat attribute, an iterator over all the elements of the array, as shown below.

>>> for element in b.flat:
...     print(element)
...
0
1
2
3
10
11
12
13
20
21
22
23
30
31
32
33
40
41
42
43


Shape transformation

Change the shape of the array

The shape of an array is given by the number of elements along each axis. It is typically represented as a tuple of integers, each giving the number of elements in the corresponding dimension.

>>> a = np.floor(10 * np.random.random((3, 4)))
>>> a
array([[2., 8., 0., 6.],
       [4., 5., 1., 1.],
       [8., 9., 3., 6.]])
>>> a.shape
(3, 4)

The shape of an array can be changed in several ways. For example, each of the following three commands returns a new array with a modified shape, and none of them changes the original array. The reshape method is used a lot in practice, because we often need to change the dimensions of an array to perform different operations.

>>> a.ravel()  # returns the array, flattened
array([2., 8., 0., 6., 4., 5., 1., 1., 8., 9., 3., 6.])
>>> a.reshape(6, 2)  # returns the array with a modified shape
array([[ 2.,  8.],
       [ 0.,  6.],
       [ 4.,  5.],
       [ 1.,  1.],
       [ 8.,  9.],
       [ 3.,  6.]])
>>> a.T  # returns the array, transposed
array([[ 2.,  4.,  8.],
       [ 8.,  5.,  9.],
       [ 0.,  1.,  3.],
       [ 6.,  1.,  6.]])
>>> a.T.shape
(4, 3)
>>> a.shape
(3, 4)

Both ravel() and flatten() collapse a multidimensional array to one dimension. flatten() always returns a new array, so changes to it do not affect the original array, whereas ravel() returns a view whenever possible, so changes to it can affect the original array.
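
The difference between the two can be observed by writing into each result. The sketch below assumes a freshly created, contiguous array, which is the case where ravel() is able to return a view; np.shares_memory reports whether two arrays overlap in memory.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

flat_copy = a.flatten()   # always a copy
flat_view = a.ravel()     # a view here, because a is contiguous

flat_copy[0] = 100        # does not touch a
flat_view[1] = 200        # writes through to a

print(a)
print(np.shares_memory(a, flat_copy), np.shares_memory(a, flat_view))
```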

In the transpose of a matrix, the rows and columns are swapped, so each element is reflected across the main diagonal. Additionally, reshape returns a new array with the modified shape, whereas the resize method, shown below, modifies the array itself.

>>> a
array([[2., 8., 0., 6.],
       [4., 5., 1., 1.],
       [8., 9., 3., 6.]])
>>> a.resize((2, 6))
>>> a
array([[2., 8., 0., 6., 4., 5.],
       [1., 1., 8., 9., 3., 6.]])

If a dimension is given as -1 in a reshaping operation, its size is calculated automatically. As shown below, a has 12 elements in total, so once three rows are requested, -1 works out that four columns are needed to hold all the elements.

>>> a.reshape(3,-1)
array([[ 2.,  8.,  0.,  6.],
       [ 4.,  5.,  1.,  1.],
       [ 8.,  9.,  3.,  6.]])


Stacking arrays

Arrays can be stacked along different axes. As shown below, vstack stacks two arrays along the first axis (vertically), while hstack stacks them along the second axis (horizontally).

>>> a = np.floor(10 * np.random.random((2, 2)))
>>> a
array([[8., 8.],
       [0., 0.]])
>>> b = np.floor(10 * np.random.random((2, 2)))
>>> b
array([[1., 8.],
       [0., 4.]])
>>> np.vstack((a, b))
array([[8., 8.],
       [0., 0.],
       [1., 8.],
       [0., 4.]])
>>> np.hstack((a, b))
array([[8., 8., 1., 8.],
       [0., 0., 0., 4.]])

The column_stack function stacks one-dimensional arrays as columns into a two-dimensional array. It is equivalent to hstack only for two-dimensional arrays.

>>> from numpy import newaxis
>>> np.column_stack((a,b))     # with 2D arrays
array([[ 8.,  8.,  1.,  8.],
       [ 0.,  0.,  0.,  4.]])
>>> a = np.array([4.,2.])
>>> b = np.array([3.,8.])
>>> np.column_stack((a,b))     # returns a 2D array
array([[ 4., 3.],
       [ 2., 8.]])
>>> np.hstack((a,b))           # the result is different
array([ 4., 2., 3., 8.])
>>> a[:,newaxis]               # this allows to have a 2D columns vector
array([[ 4.],
       [ 2.]])
>>> np.column_stack((a[:,newaxis],b[:,newaxis]))
array([[ 4.,  3.],
       [ 2.,  8.]])
>>> np.hstack((a[:,newaxis],b[:,newaxis]))   # the result is the same
array([[ 4.,  3.],
       [ 2.,  8.]])

Similarly, the row_stack function is equivalent to vstack. In general, for arrays with more than two dimensions, hstack stacks along the second axis and vstack along the first axis, while concatenate goes one step further and joins arrays along any given axis, which requires all other axes to have equal length. concatenate is used in many deep learning models, for example to stack weight matrices or the feature maps in DenseNet.
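
A minimal sketch of concatenate follows. The shapes are made up for illustration: the two 3D arrays agree on axes 0 and 2, so they can be joined along axis 1. The second half checks that, for 2D inputs, vstack and hstack match concatenate along axis 0 and axis 1.

```python
import numpy as np

a = np.zeros((2, 3, 4))
b = np.ones((2, 5, 4))

# Axes 0 and 2 match, so the arrays can be joined along axis 1.
joined = np.concatenate((a, b), axis=1)
print(joined.shape)

# For 2D inputs, vstack and hstack are concatenate along axis 0 and axis 1.
p = np.array([[8., 8.], [0., 0.]])
q = np.array([[1., 8.], [0., 4.]])
same_v = np.array_equal(np.vstack((p, q)), np.concatenate((p, q), axis=0))
same_h = np.array_equal(np.hstack((p, q)), np.concatenate((p, q), axis=1))
print(same_v, same_h)
```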

In complex cases, r_ and c_ are useful for creating arrays by stacking numbers along one axis. They also allow the use of range literals (":").

>>> np.r_[1:4, 0, 4]
array([1, 2, 3, 0, 4])

When used with arrays as arguments, r_ and c_ behave like vstack and hstack by default, but, like concatenate, they accept an optional argument giving the axis along which to stack.
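
The sketch below illustrates this behavior for two 2D arrays. The optional axis is passed as a leading string integer, here '-1' for the last axis; the array values are reused from the stacking example above.

```python
import numpy as np

a = np.array([[8., 8.], [0., 0.]])
b = np.array([[1., 8.], [0., 4.]])

rows = np.r_[a, b]         # joins along the first axis, like vstack
cols = np.c_[a, b]         # joins along the last axis, like hstack for 2D input
last = np.r_['-1', a, b]   # a leading string integer selects the axis

print(rows.shape, cols.shape, last.shape)
```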


Splitting arrays

hsplit splits an array along its horizontal axis, either by specifying the number of equally shaped arrays to return or by specifying the columns after which the division should occur:

>>> a = np.floor(10 * np.random.random((2, 12)))
>>> a
array([[9., 5., 6., 3., 6., 8., 0., 7., 9., 7., 2., 7.],
       [1., 4., 9., 2., 2., 1., 0., 6., 2., 2., 4., 0.]])
>>> np.hsplit(a, 3)    # Split a into 3
[array([[9., 5., 6., 3.],
       [1., 4., 9., 2.]]), array([[6., 8., 0., 7.],
       [2., 1., 0., 6.]]), array([[9., 7., 2., 7.],
       [2., 2., 4., 0.]])]
>>> np.hsplit(a, (3, 4))    # Split a after the third and the fourth column
[array([[9., 5., 6.],
       [1., 4., 9.]]), array([[3.],
       [2.]]), array([[6., 8., 0., 7., 9., 7., 2., 7.],
       [2., 1., 0., 6., 2., 2., 4., 0.]])]

vsplit splits along the vertical axis, and array_split allows you to specify the axis along which to split.
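
The following sketch shows both functions on a small 4x6 array chosen for illustration. It also shows that array_split accepts a number of parts that does not divide the axis evenly, in which case the leading parts get one extra element.

```python
import numpy as np

a = np.arange(24).reshape(4, 6)

top, bottom = np.vsplit(a, 2)                # two blocks of 2 rows each
left, middle, right = np.hsplit(a, (3, 4))   # cut before columns 3 and 4

# array_split takes the axis explicitly and tolerates uneven divisions.
uneven = np.array_split(a, 4, axis=1)        # 6 columns split into 4 parts

print(top.shape, bottom.shape)
print(left.shape, middle.shape, right.shape)
print([part.shape for part in uneven])
```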

Copies and views

When operating on and manipulating arrays, it is often hard for beginners to tell whether data has been copied into a new array or modified in place. This matters for every subsequent operation, so sometimes we need to copy the contents into new memory rather than just point a new variable at the old memory. There are three cases: no copy at all, a view (shallow copy), and a deep copy.

No copy at all

Simple assignments do not copy array objects or their data. After assigning a to b, modifying b also modifies a, because both names refer to the same object.

>>> a = np.arange(12)
>>> b = a            # no new object is created
>>> b is a           # a and b are two names for the same ndarray object
True
>>> b.shape = 3,4    # changes the shape of a
>>> a.shape
(3, 4)

Python passes mutable objects as references, so a function call does not change the object's identity, and no data is copied.

>>> def f(x):
...     print(id(x))
...
>>> id(a)                           # id is a unique identifier of an object
148293216
>>> f(a)
148293216

View or shallow copy

Different array objects can share the same data. The view method creates a new array object that looks at the same data. Below, c and a are different objects, and changing the shape of one does not change the shape of the other. But the two arrays share all their elements, so changing an element in one array also changes the corresponding element in the other.

>>> c = a.view()
>>> c is a
False
>>> c.base is a                        # c is a view of the data owned by a
True
>>> c.flags.owndata
False
>>>
>>> c.shape = 2, 6                     # a's shape doesn't change
>>> a.shape
(3, 4)
>>> c[0,4] = 1234                      # a's data changes
>>> a
array([[   0,    1,    2,    3],
       [1234,    5,    6,    7],
       [   8,    9,   10,   11]])

Slicing an array returns a view of it. If we slice array a to obtain the subarray s, then s is a view of a, and modifying elements of s also modifies the corresponding elements of a.

>>> s = a[ : , 1:3]     # spaces added for clarity; could also be written "s = a[:,1:3]"
>>> s[:] = 10           # s[:] is a view of s. Note the difference between s=10 and s[:]=10
>>> a
array([[   0,   10,   10,    3],
       [1234,   10,   10,    7],
       [   8,   10,   10,   11]])


Deep copy

The copy method makes a complete copy of the array and its data. The result is a distinct array object that shares no data with the original.

>>> d = a.copy()                          # a new array object with new data is created
>>> d is a
False
>>> d.base is a                           # d doesn't share anything with a
False
>>> d[0, 0] = 9999
>>> a
array([[   0,   10,   10,    3],
       [1234,   10,   10,    7],
       [   8,   10,   10,   11]])


Understand NumPy in depth

Broadcast mechanism

A very important feature of NumPy is broadcasting, which lets operations work on arrays whose shapes do not match exactly. NumPy implicitly stretches a dimension of size 1 in one operand to match the corresponding dimension of the other, so that the shapes become compatible. For example, it is legal to add an array of shape (3, 2) to an array of shape (3, 1); NumPy automatically extends the second array to the shape of the first.

To decide whether two shapes are compatible, NumPy compares their dimensions one by one, starting from the last. Two dimensions are compatible if they are equal or if one (or both) of them is 1, and the comparison continues up to the leading dimension. If these conditions are not met, NumPy raises an error.

A broadcast operation is shown below:

>>> a = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0]).reshape(3, 2)
>>> b = np.array([3.0])
>>> a * b
array([[ 3.,  6.],
       [ 9., 12.],
       [15., 18.]])
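
The compatibility rule can be tried out on the shapes mentioned above. This sketch adds a (3, 1) column to a (3, 2) array, then attempts an incompatible pair, (3, 2) with (3,), whose trailing dimensions are 2 and 3. The values are arbitrary and the names are illustrative.

```python
import numpy as np

a = np.arange(6.0).reshape(3, 2)               # shape (3, 2)
col = np.array([[10.0], [20.0], [30.0]])       # shape (3, 1)

# The size-1 axis of col is stretched to length 2.
total = a + col
print(total)

# Shapes (3, 2) and (3,) fail: the trailing dimensions are 2 and 3.
try:
    a + np.ones(3)
    compatible = True
except ValueError:
    compatible = False
print(compatible)
```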


Advanced indexing

NumPy offers more indexing facilities than regular Python sequences. In addition to indexing by integers and slices, as seen earlier, arrays can be indexed by arrays of integers and arrays of Booleans.


Indexing with arrays of indices

We can pick elements of array a using the index arrays i and j; the result has the same shape as the index array.

>>> a = np.arange(12)**2                       # the first 12 square numbers
>>> i = np.array([1, 1, 3, 8, 5])              # an array of indices
>>> a[i]                                       # the elements of a at the positions i
array([ 1,  1,  9, 64, 25])

>>> j = np.array( [ [ 3, 4], [ 9, 7 ] ] )      # a bidimensional array of indices
>>> a[j]                                       # the same shape as j
array([[ 9, 16],
       [81, 49]])

When the indexed array is multidimensional, a single array of indices refers to its first dimension. The following code shows this by converting an image of labels into a color image: each element of the array image is the index of a color in a simple palette.

>>> palette = np.array([[0, 0, 0],             # black
...                     [255, 0, 0],           # red
...                     [0, 255, 0],           # green
...                     [0, 0, 255],           # blue
...                     [255, 255, 255]])      # white
>>> image = np.array([[0, 1, 2, 0],            # each value corresponds to a color in the palette
...                   [0, 3, 4, 0]])
>>> palette[image]                             # the (2, 4, 3) color image
array([[[  0,   0,   0],
        [255,   0,   0],
        [  0, 255,   0],
        [  0,   0,   0]],
       [[  0,   0,   0],
        [  0,   0, 255],
        [255, 255, 255],
        [  0,   0,   0]]])

We can also give index arrays for more than one dimension; the index arrays for each dimension must have the same shape. Below, the arrays i and j index the first and second dimensions of a, respectively: a[i, j] pairs up corresponding elements of i and j and uses each pair as the row and column of an element of a.

>>> a = np.arange(12).reshape(3, 4)
>>> a
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
>>> i = np.array([[0, 1],                      # indices for the first dim of a
...               [1, 2]])
>>> j = np.array([[2, 1],                      # indices for the second dim
...               [3, 3]])
>>>
>>> a[i, j]                                    # i and j must have equal shape
array([[ 2,  5],
       [ 7, 11]])
>>>
>>> a[i,2]
array([[ 2,  6],
       [ 6, 10]])
>>>
>>> a[:,j]                                     # i.e., a[ : , j]
array([[[ 2,  1],
        [ 3,  3]],
       [[ 6,  5],
        [ 7,  7]],
       [[10,  9],
        [11, 11]]])

Similarly, we can put i and j in a tuple and then index with the tuple:

>>> l = (i, j)
>>> a[l]                                       # equivalent to a[i, j]
array([[ 2,  5],
       [ 7, 11]])

However, we cannot do this by putting i and j into an array, because that array would be interpreted as indexing the first dimension of a.

>>> s = np.array([i, j])
>>> a[s]                                       # not what we want
Traceback (most recent call last):
  ...
IndexError: index 3 is out of bounds for axis 0 with size 3
>>>
>>> a[tuple(s)]                                # same as a[i, j]
array([[ 2,  5],
       [ 7, 11]])

Another common use of indexing with arrays is finding the maximum value of time-dependent series:

>>> time = np.linspace(20, 145, 5)                 # time scale
>>> data = np.sin(np.arange(20)).reshape(5, 4)     # 4 time-dependent series
>>> data
array([[ 0.        ,  0.84147098,  0.90929743,  0.14112001],
       [-0.7568025 , -0.95892427, -0.2794155 ,  0.6569866 ],
       [ 0.98935825,  0.41211849, -0.54402111, -0.99999021],
       [-0.53657292,  0.42016704,  0.99060736,  0.65028784],
       [-0.28790332, -0.96139749, -0.75098725,  0.14987721]])
>>>
>>> ind = data.argmax(axis=0)                      # index of the maxima for each series
>>> ind
array([2, 0, 3, 1])
>>>
>>> time_max = time[ind]                       # times corresponding to the maxima
>>>
>>> data_max = data[ind, range(data.shape[1])] # => data[ind[0],0], data[ind[1],1]...
>>>
>>> time_max
array([ 82.5 ,  20.  , 113.75,  51.25])
>>> data_max
array([0.98935825, 0.84147098, 0.99060736, 0.6569866 ])
>>>
>>> np.all(data_max == data.max(axis=0))
True

You can also use indexing with arrays as a target to assign to:

>>> a = np.arange(5)
>>> a
array([0, 1, 2, 3, 4])
>>> a[[1, 3, 4]] = 0
>>> a
array([0, 0, 2, 0, 0])

However, when the list of indices contains repetitions, the assignment is done several times, leaving behind the last value.

>>> a = np.arange(5)
>>> a[[0, 0, 2]] = [1, 2, 3]
>>> a
array([2, 1, 3, 3, 4])

This is reasonable enough, but watch out if you use Python's += construct, as it may not do what you expect:

>>> a = np.arange(5)
>>> a[[0, 0, 2]] += 1
>>> a
array([1, 1, 3, 3, 4])

Even though 0 occurs twice in the list of indices, the 0th element is incremented only once. This is because Python requires a += 1 to be equivalent to a = a + 1.
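
When the increment should be applied once per occurrence of an index, one option outside the original tutorial is the at method of NumPy's universal functions, for example np.add.at, which performs the operation unbuffered. The sketch below contrasts the two behaviors; the names buffered and unbuffered are illustrative.

```python
import numpy as np

idx = [0, 0, 2]

buffered = np.arange(5)
buffered[idx] += 1              # index 0 is incremented only once

unbuffered = np.arange(5)
np.add.at(unbuffered, idx, 1)   # index 0 is incremented once per occurrence

print(buffered)
print(unbuffered)
```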


Indexing with Boolean arrays

When we index an array with arrays of integers, we provide a list of indices to pick. With Boolean indices the approach is different: we explicitly choose which elements of the array we want and which we do not.

The most natural way to use Boolean indexing is with a Boolean array that has the same shape as the original array. Below, an element of b is True only where the corresponding value of a is greater than 4, and the resulting Boolean array is then used as the index.

>>> a = np.arange(12).reshape(3, 4)
>>> b = a > 4
>>> b                                          # b is a boolean with a's shape
array([[False, False, False, False],
       [False,  True,  True,  True],
       [ True,  True,  True,  True]])
>>> a[b]                                       # 1d array with the selected elements
array([ 5,  6,  7,  8,  9, 10, 11])

This property is very useful in assignments. For example, the ReLU activation function passes through only activation values greater than 0 and sets the rest to 0, so it can be implemented with the same kind of masked assignment.

>>> a[b] = 0                                   # All elements of 'a' higher than 4 become 0
>>> a
array([[0, 1, 2, 3],
       [4, 0, 0, 0],
       [0, 0, 0, 0]])
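
As a sketch of the ReLU idea mentioned above, the hypothetical helper relu below copies its input and zeroes the negative entries with a Boolean mask. The function name and test values are assumptions made for illustration; np.maximum(x, 0) gives the same result in one call.

```python
import numpy as np

def relu(x):
    """Return a copy of x with negative entries replaced by 0."""
    out = x.copy()
    out[out < 0] = 0   # the Boolean mask selects the negative entries
    return out

z = np.array([[-2.0, 0.5], [3.0, -0.1]])
activated = relu(z)
print(activated)
```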

The second way to use Boolean indexing is closer to integer indexing: for each dimension of the array, we give a one-dimensional Boolean array that selects the slices we want:

>>> a = np.arange(12).reshape(3, 4)
>>> b1 = np.array([False, True, True])         # first dim selection
>>> b2 = np.array([True,False,True,False])       # second dim selection
>>>
>>> a[b1,:]                                   # selecting rows
array([[ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
>>>
>>> a[b1]                                     # same thing
array([[ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
>>>
>>> a[:,b2]                                   # selecting columns
array([[ 0,  2],
       [ 4,  6],
       [ 8, 10]])
>>>
>>> a[b1,b2]                                  # a weird thing to do
array([ 4, 10])

Note that the length of each one-dimensional Boolean array must match the length of the axis it indexes. In the example above, b1 has length 3 and b2 has length 4, matching the first and second dimensions of a, respectively.


Linear algebra

Simple array operations

The following shows only simple matrix operations; more detailed methods can be looked up in the API documentation when needed. The basic operations shown are the transpose, the inverse, the identity matrix, matrix multiplication, the trace, solving a linear system, and finding eigenvalues and eigenvectors:

>>> import numpy as np
>>> a = np.array([[1.0, 2.0], [3.0, 4.0]])
>>> print(a)
[[1. 2.]
 [3. 4.]]
>>> a.transpose()
array([[1., 3.],
       [2., 4.]])
>>> np.linalg.inv(a)
array([[-2. ,  1. ],
       [ 1.5, -0.5]])
>>> u = np.eye(2)    # unit 2x2 matrix; "eye" represents "I"
>>> u
array([[1., 0.],
       [0., 1.]])
>>> j = np.array([[0.0, -1.0], [1.0, 0.0]])
>>> np.dot(j, j)    # matrix product
array([[-1.,  0.],
       [ 0., -1.]])

>>> np.trace(u)  # trace
2.0
>>> y = np.array([[5.], [7.]])
>>> np.linalg.solve(a, y)
array([[-3.],
       [ 4.]])
>>> np.linalg.eig(j)
(array([0.+1.j, 0.-1.j]), array([[0.70710678+0.j        , 0.70710678-0.j        ],
       [0.        -0.70710678j, 0.        +0.70710678j]]))

Parameters:
    square matrix
Returns:
    The eigenvalues, each repeated according to its multiplicity.
    The normalized (unit "length") eigenvectors, such that the
    column ``v[:,i]`` is the eigenvector corresponding to the
    eigenvalue ``w[i]`` .
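
The results above can be checked numerically rather than by eye. The sketch below uses np.allclose, which compares within a floating-point tolerance, and assumes j is the rotation matrix [[0, -1], [1, 0]], the matrix whose square is minus the identity.

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])
y = np.array([[5.0], [7.0]])

# solve returns x with a @ x == y, up to rounding.
x = np.linalg.solve(a, y)
residual_ok = np.allclose(a @ x, y)

# A matrix times its inverse is the identity, up to rounding.
inverse_ok = np.allclose(a @ np.linalg.inv(a), np.eye(2))

# Each column v[:, k] satisfies j @ v[:, k] == w[k] * v[:, k].
j = np.array([[0.0, -1.0], [1.0, 0.0]])
w, v = np.linalg.eig(j)
eig_ok = all(np.allclose(j @ v[:, k], w[k] * v[:, k]) for k in range(2))

print(residual_ok, inverse_ok, eig_ok)
```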


Original document link: docs.scipy.org/doc/numpy/u…