Article series address

  • NumPy array
  • NumPy Tutorial (2): Data types
  • NumPy Tutorial (3): ndarray internals and advanced iteration

The internal mechanism of ndarray objects

In the previous section, we covered the use of ndarray in detail. This chapter starts with the internal mechanism of ndarray, which makes the rest of the content easier to understand.

1. The composition of ndarray

Unlike a plain array, an ndarray contains not only the data itself but also descriptive information about that data. An ndarray consists of the following parts:

  • Data pointer: a pointer to the actual data.
  • Data type (dtype): describes how each element is stored, including the number of bytes per element.
  • Shape: a tuple representing the shape of the array.
  • Strides: a tuple giving, for each dimension, the number of bytes to step in memory to move to the next element along that dimension.

In NumPy, data is stored in one contiguous block of memory with a uniform element size. NumPy stores multi-dimensional arrays internally as one-dimensional arrays. As long as we know the number of bytes per element (dtype) and the number of elements in each dimension (shape), we can quickly locate any element in any dimension.
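
The lookup described above can be sketched in a few lines. This is only an illustration: the helper name byte_offset is hypothetical, and the array is pinned to a 4-byte integer type so the numbers are the same on every platform.

```python
import numpy as np

# A C-contiguous 2 x 3 x 4 array of 4-byte integers holding 0..23.
a = np.arange(24, dtype=np.int32).reshape(2, 3, 4)

def byte_offset(index, strides):
    """Byte offset of an element: sum of index[k] * strides[k] (hypothetical helper)."""
    return sum(i * s for i, s in zip(index, strides))

raw = a.tobytes()                      # the flat, one-dimensional data block
idx = (1, 2, 3)
off = byte_offset(idx, a.strides)      # 1*48 + 2*16 + 3*4 = 92

# Read one element straight out of the flat block at that offset.
value = np.frombuffer(raw, dtype=a.dtype, count=1, offset=off)[0]
print(off, value, a[idx])              # 92 23 23
```

The offset only depends on the index and the strides, which is why no search through the data is needed.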

dtype and shape have been described in detail in previous articles, so the example here focuses on strides.

Example:

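import numpy as np

ls = [[[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]],
      [[13, 14, 15, 16], [17, 18, 19, 20], [21, 22, 23, 24]]]
# np.int32 pins the 4-byte int used in this example; plain dtype=int is 8 bytes on many platforms
a = np.array(ls, dtype=np.int32)
print(a)
print(a.strides)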

Output:

[[[ 1  2  3  4]
  [ 5  6  7  8]
  [ 9 10 11 12]]

 [[13 14 15 16]
  [17 18 19 20]
  [21 22 23 24]]]
(48, 16, 4)

In the example above, we define a three-dimensional array whose elements are 4-byte integers. Along the first dimension, moving from element 1 to element 13 skips 12 elements, which is 48 bytes. Along the second dimension, moving from element 1 to element 5 skips 4 elements, which is 16 bytes. Along the third dimension, moving from element 1 to element 2 skips 1 element, which is 4 bytes. So the strides are (48, 16, 4). On platforms where int is 8 bytes, the same array has strides (96, 32, 8).
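
The same arithmetic can be written as a small function. This is a sketch for row-major (C-order) arrays only, and the name c_strides is hypothetical, not part of NumPy.

```python
import numpy as np

def c_strides(shape, itemsize):
    """Strides of a C-contiguous array: the last axis steps by one element,
    each earlier axis steps over everything to its right (hypothetical helper)."""
    strides = []
    step = itemsize
    for n in reversed(shape):
        strides.append(step)
        step *= n
    return tuple(reversed(strides))

a = np.zeros((2, 3, 4), dtype=np.int32)
computed = c_strides(a.shape, a.itemsize)
print(computed, a.strides)   # (48, 16, 4) (48, 16, 4)
```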

Ordinary iteration

Ordinary iteration over an ndarray works just like iteration in Python and other languages: an n-dimensional array needs n nested for loops.

Example:

import numpy as np

ls = [[1, 2], [3, 4], [5, 6]]
a = np.array(ls, dtype=int)
for row in a:
    for cell in row:
        print(cell)

Output:

1
2
3
4
5
6

In the above example, each row is still a numpy.ndarray, and each cell is a numpy.int32 (numpy.int64 on platforms where the default integer is 8 bytes).
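
A quick check of those types, as a sketch. The dtype is fixed to np.int32 here (an assumption made so the result does not depend on the platform default).

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]], dtype=np.int32)

first_row = a[0]           # iterating the outer level yields rows like this one
first_cell = first_row[0]  # iterating a row yields NumPy scalars

print(type(first_row).__name__, type(first_cell).__name__)   # ndarray int32
```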

nditer multidimensional iterator

NumPy provides an efficient multi-dimensional iterator object, nditer, for iterating over arrays. With ordinary iteration, an n-dimensional array needs n nested for loops. With the nditer iterator, a single for loop can traverse the entire array. Since an ndarray is contiguous in memory, it is equivalent to a one-dimensional array, and traversing a one-dimensional array only requires one for loop.
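
To make the comparison concrete, the sketch below walks the same small array both ways and checks that the two traversals visit the same values in the same order. The array and variable names are illustrative only.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

# Ordinary iteration: one loop per dimension.
nested = []
for row in a:
    for cell in row:
        nested.append(int(cell))

# nditer: a single loop over every element.
single = [int(x) for x in np.nditer(a)]

print(nested)   # [0, 1, 2, 3, 4, 5]
print(single)   # [0, 1, 2, 3, 4, 5]
```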

1. Basic examples

Example 1:

import numpy as np

ls = [[[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]],
      [[13, 14, 15, 16], [17, 18, 19, 20], [21, 22, 23, 24]]]
a = np.array(ls, dtype=int)
for x in np.nditer(a):
    print(x, end=",")

Output:

1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,

2. The order argument: specifies the order in which elements are accessed

When you create an ndarray, you can use the order argument to specify whether its elements are laid out by row or by column. Consider the following example:

Example 2:

import numpy as np

ls = [[[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]],
      [[13, 14, 15, 16], [17, 18, 19, 20], [21, 22, 23, 24]]]
a = np.array(ls, dtype=int, order='F')
for x in np.nditer(a):
    print(x, end=",")

Output:

1,13,5,17,9,21,2,14,6,18,10,22,3,15,7,19,11,23,4,16,8,20,12,24,

By default, nditer accesses elements in the order in which they are laid out in memory (order='K').

Example 3: nditer can also be told to traverse in a specific order.

import numpy as np

ls = [[[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]],
      [[13, 14, 15, 16], [17, 18, 19, 20], [21, 22, 23, 24]]]
a = np.array(ls, dtype=int, order='F')
for x in np.nditer(a, order='C'):
    print(x, end=",")

Output:

1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,

For row-major order (order='C') and column-major order (order='F'), see en.wikipedia.org/wiki/Row-_a… . Example 1 uses row-major order and Example 2 uses column-major order. If you think of the ndarray as a tree, row-major order resembles a depth-first traversal and column-major order resembles a breadth-first one. NumPy supports both row-major and column-major order mainly to improve performance in matrix operations, where sequential memory access can be much faster than non-sequential access. (Matrix operations will be covered in a later chapter.)
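
The difference between the two layouts shows up directly in the strides. The sketch below builds the same values in both layouts, assuming a 4-byte integer type so the numbers are fixed, and checks that nditer follows memory order by default but can be forced back to row-major order.

```python
import numpy as np

c = np.arange(1, 25, dtype=np.int32).reshape(2, 3, 4)   # row-major layout
f = np.asfortranarray(c)                                # same values, column-major layout

print(c.strides)   # (48, 16, 4): the last axis is contiguous
print(f.strides)   # (4, 8, 24): the first axis is contiguous

# Default order='K' follows memory, which for f is column-major order.
memory_order = [int(x) for x in np.nditer(f)]
# order='C' forces row-major traversal regardless of the layout.
row_major = [int(x) for x in np.nditer(f, order='C')]

print(memory_order[:6])   # [1, 13, 5, 17, 9, 21]
print(row_major[:6])      # [1, 2, 3, 4, 5, 6]
```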

3. The op_flags parameter: change the value of elements during iteration

By default, nditer treats the array being iterated over as a read-only object. To modify the array elements while iterating, op_flags must be set to readwrite or writeonly.

Example 4:

import numpy as np

a = np.arange(5)
for x in np.nditer(a, op_flags=['readwrite']):
    x[...] = 2 * x
print(a)

Output:

[0 2 4 6 8]
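
The read-only default can be seen by attempting the same assignment without op_flags. The sketch below also uses nditer as a context manager, which is the form the NumPy documentation recommends for writable operands. The variable raised is only there for illustration.

```python
import numpy as np

a = np.arange(5)

# Without op_flags the operand is read-only, so the assignment fails.
raised = False
try:
    for x in np.nditer(a):
        x[...] = 2 * x
except ValueError as exc:
    raised = True
    print("read-only by default:", exc)

# With readwrite the elements are doubled in place.
with np.nditer(a, op_flags=['readwrite']) as it:
    for x in it:
        x[...] = 2 * x

print(a)   # [0 2 4 6 8]
```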

4. The flags parameter

The flags parameter takes a list or tuple of flag names, for example flags=['f_index'] or flags=['external_loop'].

(1) Use an external loop: external_loop

This moves the innermost one-dimensional loop out of the iterator and into the calling code, so that NumPy's vectorized operations can work more efficiently on larger amounts of data.

Simply put, when flags=['external_loop'] is specified, each iteration returns a one-dimensional array instead of a single element. Specifically, if the memory order of the ndarray matches the traversal order, all elements are returned together as one one-dimensional array; if the two orders differ, each iteration returns one one-dimensional chunk of the traversal.

Example 5:

import numpy as np

ls = [[[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]],
      [[13, 14, 15, 16], [17, 18, 19, 20], [21, 22, 23, 24]]]
a = np.array(ls, dtype=int, order='C')
for x in np.nditer(a, flags=['external_loop'], order='C'):
    print(x)

Output:

[ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24]

Example 6:

b = np.array(ls, dtype=int, order='F')
for x in np.nditer(b, flags=['external_loop'], order='C'):
    print(x,)

Output:

[1 2 3 4]
[5 6 7 8]
[ 9 10 11 12]
[13 14 15 16]
[17 18 19 20]
[21 22 23 24]
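
One way to see why the external loop helps is to count how many chunks the iterator hands back and then apply a vectorized operation to each chunk. The sketch below does this for both layouts; the chunk lists and the running total are illustrative names, not part of NumPy.

```python
import numpy as np

a = np.arange(1, 25, dtype=np.int32).reshape(2, 3, 4)   # row-major layout
b = np.asfortranarray(a)                                # column-major layout

# Layout and traversal order agree: everything arrives as one chunk.
chunks_a = [x.copy() for x in np.nditer(a, flags=['external_loop'], order='C')]
# Layout and traversal order differ: one chunk per innermost row.
chunks_b = [x.copy() for x in np.nditer(b, flags=['external_loop'], order='C')]

print(len(chunks_a), chunks_a[0].shape)   # 1 (24,)
print(len(chunks_b), chunks_b[0].shape)   # 6 (4,)

# Each chunk is a 1-D array, so vectorized operations apply to it directly.
total = sum(int(chunk.sum()) for chunk in chunks_b)
print(total)                              # 300
```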

(2) Track an index: c_index, f_index, multi_index

Example 7:

import numpy as np

a = np.arange(6).reshape(2, 3)
it = np.nditer(a, flags=['f_index'])

while not it.finished:
    print("%d <%d>" % (it[0], it.index))
    it.iternext()

Output:

0 <0>
1 <2>
2 <4>
3 <1>
4 <3>
5 <5>

The index appears in this order because we chose the column-major index (f_index). See the picture below for an intuitive feel:

The order in which elements are traversed is determined by the order argument. The row-major index (c_index) and the column-major index (f_index), whichever is specified, do not affect the order in which elements are returned. They only report the position each element would have if the array were flattened in row-major or column-major order.
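
For a two-dimensional array, the column-major index of the element at (i, j) is i + j * rows. The sketch below collects the f_index values reported by nditer and compares them with that formula. The names f_indices and expected are illustrative.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
rows, cols = a.shape

# Indices reported by nditer while traversing in the default (row-major) order.
it = np.nditer(a, flags=['f_index'])
f_indices = []
while not it.finished:
    f_indices.append(it.index)
    it.iternext()

# Column-major position of (i, j), visited row by row.
expected = [i + j * rows for i in range(rows) for j in range(cols)]

print(f_indices)   # [0, 2, 4, 1, 3, 5]
print(expected)    # [0, 2, 4, 1, 3, 5]
```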

Example 8:

import numpy as np

a = np.arange(6).reshape(2, 3)
it = np.nditer(a, flags=['multi_index'])

while not it.finished:
    print("%d <%s>" % (it[0], it.multi_index))
    it.iternext()

Output:

0 <(0, 0)>
1 <(0, 1)>
2 <(0, 2)>
3 <(1, 0)>
4 <(1, 1)>
5 <(1, 2)>

5. Iterate over multiple arrays simultaneously

When iterating over several arrays at once, the first thing that comes to mind is the zip function. With nditer, zip is not needed.

Example 9:

import numpy as np

a = np.array([1, 2, 3], dtype=int, order='C')
b = np.array([11, 12, 13], dtype=int, order='C')
for x, y in np.nditer([a, b]):
    print(x, y)

Output:

1 11
2 12
3 13
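
The difference from zip becomes visible with two-dimensional arrays: zip pairs up rows, while nditer still pairs up individual elements. A small sketch with made-up arrays:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[11, 12], [13, 14]])

# zip walks the first axis only, so each pair is a pair of rows.
zip_pairs = [(x.tolist(), y.tolist()) for x, y in zip(a, b)]
# nditer walks every element of both arrays in step.
nditer_pairs = [(int(x), int(y)) for x, y in np.nditer([a, b])]

print(zip_pairs)      # [([1, 2], [11, 12]), ([3, 4], [13, 14])]
print(nditer_pairs)   # [(1, 11), (2, 12), (3, 13), (4, 14)]
```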

Other functions

1. The flatten function

The flatten function returns a one-dimensional ndarray expanded from a multidimensional ndarray. Syntax:

ndarray.flatten(order='C')

Example:

import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]], dtype=int, order='C')
b = a.flatten()
print(b)
print(type(b))

Output:

[1 2 3 4 5 6]
<class 'numpy.ndarray'>
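
Two details of flatten are worth checking: the order argument changes the order of the flattened result, and the result is a copy, so editing it leaves the original array untouched. A minimal sketch:

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

row_major = a.flatten()            # default order='C'
col_major = a.flatten(order='F')   # column by column

print(row_major)   # [1 2 3 4 5 6]
print(col_major)   # [1 4 2 5 3 6]

# flatten returns a copy: changing it does not change a.
row_major[0] = 100
print(a[0, 0])     # 1
```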

2. flat

flat returns an iterator that visits each element of the array.

import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]], dtype=int, order='C')
for b in a.flat:
    print(b)
print(type(a.flat))

Output:

1
2
3
4
5
6
<class 'numpy.flatiter'>
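
Besides looping, the flat iterator also supports indexing with a one-dimensional position. Unlike flatten, it refers to the original data, so assigning through it changes the array. A short sketch:

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

fifth = a.flat[4]   # position 4 in row-major order is a[1, 1]
print(fifth)        # 5

# Assigning through flat writes into the original array.
a.flat[4] = 50
print(a.tolist())   # [[1, 2, 3], [4, 50, 6]]
```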
