Introduction to the

An ordinary NumPy array holds elements of a single type. A structured array is an array whose elements each group together several values that may have different types.

Today we’ll take a closer look at structured arrays in NumPy.

Field in a structured array

Because each element of a structured array groups values of different types, each named component of an element is called a field.

Each field has three parts: a name, which is a string; a data type, which can be any valid dtype; and an optional title.

Let’s look at an example of building a dtype from fields:

In [165]: np.dtype([('name', 'U10'), ('age', 'i4'), ('weight', 'f4')])
Out[165]: dtype([('name', '<U10'), ('age', '<i4'), ('weight', '<f4')])

We can build a new array using the dtype above:

In [166]: x = np.array([('Rex', 9, 81.0), ('Fido', 3, 27.0)],
     ...:     dtype=[('name', 'U10'), ('age', 'i4'), ('weight', 'f4')])

In [167]: x
Out[167]:
array([('Rex', 9, 81.), ('Fido', 3, 27.)],
      dtype=[('name', '<U10'), ('age', '<i4'), ('weight', '<f4')])

x is a one-dimensional array in which each element contains three fields (name, age, and weight), each with its own data type.

A row of data can be accessed via index:

In [168]: x[1]
Out[168]: ('Fido', 3, 27.)

You can also access a column of data by name:

In [170]: x['name']
Out[170]: array(['Rex', 'Fido'], dtype='<U10')

It is also possible to assign a single value to every element of a column:

In [171]: x['age']
Out[171]: array([9, 3], dtype=int32)

In [172]: x['age'] = 10

In [173]: x
Out[173]:
array([('Rex', 10, 81.), ('Fido', 10, 27.)],
      dtype=[('name', '<U10'), ('age', '<i4'), ('weight', '<f4')])

Structured data type

The above example gives us a basic idea of structured data types. A structured data type is a collection of fields.

Create structured data types

Structured data types are created from base types in the following ways:

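Create from a list of tuples

Each tuple has the form (fieldname, datatype, shape), where shape is optional. fieldname is the name of the field, datatype is any object that can be converted to a dtype, and shape is a tuple of integers giving the shape of a subarray.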

In [174]: np.dtype([('x', 'f4'), ('y', np.float32), ('z', 'f4', (2, 2))])

If fieldname is an empty string, the field is given a default name of the form f#, where # is the index of the field, counting from 0:

In [177]: np.dtype([('x', 'f4'), ('', 'i4'), ('z', 'i8')])
Out[177]: dtype([('x', '<f4'), ('f1', '<i4'), ('z', '<i8')])
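
The tuple form can be explored with a minimal sketch like the one below. It assumes NumPy is imported as np; the variable names dt, arr, and dt_default are only illustrative. It shows how a subarray shape extends the shape of the field, and which name an empty field name receives.

```python
import numpy as np

# Field 'z' holds a 2x2 float32 subarray in every element.
dt = np.dtype([('x', 'f4'), ('y', np.float32), ('z', 'f4', (2, 2))])
arr = np.zeros(3, dtype=dt)

# The subarray dimensions are appended to the array's own shape.
print(arr['x'].shape)
print(arr['z'].shape)

# An empty field name is replaced by 'f' plus the field's position.
dt_default = np.dtype([('x', 'f4'), ('', 'i4'), ('z', 'i8')])
print(dt_default.names)
```
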
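Create from a comma-separated string

A structured dtype can also be created from a string of comma-separated dtype specifications. The fields are given the default names f0, f1, and so on: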

In [178]: np.dtype('i8, f4, S3')
Out[178]: dtype([('f0', '<i8'), ('f1', '<f4'), ('f2', 'S3')])

In [179]: np.dtype('3int8, float32, (2, 3)float64')
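
To see what the string form with shapes produces, a small sketch along these lines may help. It assumes NumPy is imported as np, and the loop is just one way to inspect the result.

```python
import numpy as np

dt = np.dtype('3int8, float32, (2, 3)float64')

# Each field gets a default name; a leading count or shape
# turns the field into a subarray of the base type.
for name in dt.names:
    field = dt[name]
    print(name, field.base, field.shape)
```
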
Create from a dictionary

The dictionary has the form {'names': ..., 'formats': ..., 'offsets': ..., 'titles': ..., 'itemsize': ...}.

The names and formats keys are required. They give the list of field names and the list of field dtypes, and the two lists must have the same length.

The other keys are optional: offsets gives the byte offset of each field, titles gives the field titles, and itemsize is the total size of the dtype in bytes.

In [180]: np.dtype({'names': ['col1', 'col2'], 'formats': ['i4', 'f4']})
Out[180]: dtype([('col1', '<i4'), ('col2', '<f4')])

In [181]: np.dtype({'names': ['col1', 'col2'],
     ...:           'formats': ['i4', 'f4'],
     ...:           'offsets': [0, 4],
     ...:           'itemsize': 12})
Out[181]: dtype({'names': ['col1', 'col2'], 'formats': ['<i4', '<f4'], 'offsets': [0, 4], 'itemsize': 12})
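
The effect of offsets and itemsize can be checked with a short sketch such as the following. It assumes NumPy is imported as np; used and padding are hypothetical helper variables introduced only for this illustration.

```python
import numpy as np

dt = np.dtype({'names': ['col1', 'col2'],
               'formats': ['i4', 'f4'],
               'offsets': [0, 4],
               'itemsize': 12})

# Bytes actually occupied by the fields versus the declared item size.
used = sum(dt[name].itemsize for name in dt.names)
padding = dt.itemsize - used
print("field bytes:", used)
print("trailing padding:", padding)
```

Here the two 4-byte fields fill the first 8 bytes, and the explicit itemsize leaves the remaining bytes of each element unused.
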
Working with structured data types

The field names and field details of a structured dtype are available through its names and fields attributes:

>>> d = np.dtype([('x', 'i8'), ('y', 'f4')])
>>> d.names
('x', 'y')
>>> d.fields
mappingproxy({'x': (dtype('int64'), 0), 'y': (dtype('float32'), 8)})
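
As a rough sketch of how names and fields fit together, the two attributes can be combined to list each field's dtype and byte offset in declaration order. NumPy is assumed to be imported as np, and layout is an illustrative name.

```python
import numpy as np

d = np.dtype([('x', 'i8'), ('y', 'f4')])

# names keeps the declaration order; fields maps each name to
# a tuple whose first two items are the dtype and the byte offset.
layout = [(name, d.fields[name][0], d.fields[name][1]) for name in d.names]
for name, field_dtype, offset in layout:
    print(f"{name}: dtype={field_dtype}, offset={offset}")
```
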
Offsets and Alignment

A structured dtype combines several data types in one item. By default NumPy packs the fields one after another with no padding, so the fields are not necessarily aligned in memory.

We can look at the offset of each field in the following example:

>>> def print_offsets(d):
...     print("offsets:", [d.fields[name][1] for name in d.names])
...     print("itemsize:", d.itemsize)
>>> print_offsets(np.dtype('u1, u1, i4, u1, i8, u2'))
offsets: [0, 1, 2, 6, 7, 15]
itemsize: 17

If align=True is specified when the dtype is created, padding is added so that the fields are laid out the way a C compiler would typically lay out the equivalent C struct.

Aligned fields can be faster to process in some cases, at the cost of a larger item size. Let’s look at the same dtype with alignment:

>>> print_offsets(np.dtype('u1, u1, i4, u1, i8, u2', align=True))
offsets: [0, 1, 4, 8, 16, 24]
itemsize: 32
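
The packed and aligned layouts can be compared side by side with a sketch like this one. It assumes NumPy is imported as np; the helper offsets is hypothetical, and the exact aligned offsets depend on the platform's C alignment rules, so no output is asserted for them here.

```python
import numpy as np

spec = 'u1, u1, i4, u1, i8, u2'
packed = np.dtype(spec)
aligned = np.dtype(spec, align=True)

def offsets(d):
    """Return the byte offset of every field, in declaration order."""
    return [d.fields[name][1] for name in d.names]

# isalignedstruct reports whether the dtype was built with align=True.
print(offsets(packed), packed.itemsize, packed.isalignedstruct)
print(offsets(aligned), aligned.itemsize, aligned.isalignedstruct)
```
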
Field Titles

Each field can have a title in addition to its name.

There are two ways to specify the title. The first is to pass a (title, name) tuple in place of the field name:

In [182]: np.dtype([(('my title', 'name'), 'f4')])
Out[182]: dtype([(('my title', 'name'), '<f4')])

The second way is the dictionary form, where each value is a (datatype, offset, title) tuple:

In [183]: np.dtype({'name': ('i4', 0, 'my title')})
Out[183]: dtype([(('my title', 'name'), '<i4')])

Take a look at the structure of fields, here for the float32 dtype from the first example, stored in d:

In [187]: d.fields
Out[187]:
mappingproxy({'my title': (dtype('float32'), 0, 'my title'),
              'name': (dtype('float32'), 0, 'my title')})
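
Because the title appears as a key in fields, it can also be used to index an array. The sketch below illustrates this under the assumption that NumPy is imported as np; arr is an illustrative variable.

```python
import numpy as np

d = np.dtype([(('my title', 'name'), 'f4')])
arr = np.zeros(2, dtype=d)

# Write through the field name, read back through the title.
arr['name'] = [1.5, 2.5]
print(arr['my title'])

# Titles show up in d.fields but not in d.names.
print(d.names)
print(sorted(d.fields.keys()))
```
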
Structured array

After creating structured arrays from structured data types, we can manipulate structured arrays.

Assignment

Values can be assigned to a structured array from tuples:

>>> x = np.array([(1, 2, 3), (4, 5, 6)], dtype='i8, f4, f8')
>>> x[1] = (7, 8, 9)
>>> x
array([(1, 2., 3.), (7, 8., 9.)],
     dtype=[('f0', '<i8'), ('f1', '<f4'), ('f2', '<f8')])

Values can also be assigned from scalars, in which case the scalar is assigned to every field:

>>> x = np.zeros(2, dtype='i8, f4, ?, S1')
>>> x[:] = 3
>>> x
array([(3, 3., True, b'3'), (3, 3., True, b'3')],
      dtype=[('f0', '<i8'), ('f1', '<f4'), ('f2', '?'), ('f3', 'S1')])
>>> x[:] = np.arange(2)
>>> x
array([(0, 0., False, b'0'), (1, 1., True, b'1')],
      dtype=[('f0', '<i8'), ('f1', '<f4'), ('f2', '?'), ('f3', 'S1')])

A structured array can also be assigned to an unstructured array, but only if the structured dtype has exactly one field:

>>> twofield = np.zeros(2, dtype=[('A', 'i4'), ('B', 'i4')])
>>> onefield = np.zeros(2, dtype=[('A', 'i4')])
>>> nostruct = np.zeros(2, dtype='i4')
>>> nostruct[:] = twofield
Traceback (most recent call last):
TypeError: Cannot cast array data from dtype([('A', '<i4'), ('B', '<i4')]) to dtype('int32') according to the rule 'unsafe'

Structured arrays can also be assigned to each other. Fields are matched by position, not by name:

>>> a = np.zeros(3, dtype=[('a', 'i8'), ('b', 'f4'), ('c', 'S3')])
>>> b = np.ones(3, dtype=[('x', 'f4'), ('y', 'S3'), ('z', 'O')])
>>> b[:] = a
>>> b
array([(0., b'0.0', b''), (0., b'0.0', b''), (0., b'0.0', b'')],
      dtype=[('x', '<f4'), ('y', 'S3'), ('z', 'O')])
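
The positional behavior of assignment between structured arrays can be made visible with a sketch like the following, in which the destination deliberately uses the same field names in the opposite order. NumPy is assumed to be imported as np; src and dst are illustrative names.

```python
import numpy as np

src = np.array([(1, 2.5), (3, 4.5)], dtype=[('a', 'i8'), ('b', 'f8')])

# Same field names as src, but declared in the opposite order.
dst = np.zeros(2, dtype=[('b', 'f8'), ('a', 'f8')])

# The first field of src goes to the first field of dst, and so on,
# regardless of what the fields are called.
dst[:] = src
print(dst['b'])  # receives the values of src['a']
print(dst['a'])  # receives the values of src['b']
```
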
Accessing structured arrays

Field names can be used to access and modify a column of data:

>>> x = np.array([(1, 2), (3, 4)], dtype=[('foo', 'i8'), ('bar', 'f4')])
>>> x['foo']
array([1, 3])
>>> x['foo'] = 10
>>> x
array([(10, 2.), (10, 4.)],
      dtype=[('foo', '<i8'), ('bar', '<f4')])

The value returned is a view of the original array. They share memory space, so modifying the view also modifies the original data.
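
The view behavior can be checked with a short sketch such as this one, which contrasts a field view with an explicit copy. It assumes NumPy is imported as np; foo_view and foo_copy are illustrative names.

```python
import numpy as np

x = np.array([(1, 2), (3, 4)], dtype=[('foo', 'i8'), ('bar', 'f4')])

# Indexing by field name gives a view onto x's memory.
foo_view = x['foo']
foo_view[0] = 99

# An explicit copy is independent of x.
foo_copy = x['foo'].copy()
foo_copy[1] = -1

print(np.shares_memory(foo_view, x))
print(np.shares_memory(foo_copy, x))
print(x['foo'])
```
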

Fields that are multi-dimensional arrays

In [188]: np.zeros((2, 2), dtype=[('a', np.int32), ('b', np.float64, (3, 3))])
Out[188]:
array([[(0, [[0., 0., 0.], [0., 0., 0.], [0., 0., 0.]]),
        (0, [[0., 0., 0.], [0., 0., 0.], [0., 0., 0.]])],
       [(0, [[0., 0., 0.], [0., 0., 0.], [0., 0., 0.]]),
        (0, [[0., 0., 0.], [0., 0., 0.], [0., 0., 0.]])]],
      dtype=[('a', '<i4'), ('b', '<f8', (3, 3))])

This constructs a 2 by 2 array in which each element has an int32 field a and a field b that holds a 3 by 3 float64 subarray.

We can check the shape of each field like this:

>>> x = np.zeros((2, 2), dtype=[('a', np.int32), ('b', np.float64, (3, 3))])
>>> x['a'].shape
(2, 2)
>>> x['b'].shape
(2, 2, 3, 3)

In addition to single-column access, we can also access multiple columns of data at once:

>>> a = np.zeros(3, dtype=[('a', 'i4'), ('b', 'i4'), ('c', 'f4')])
>>> a[['a', 'c']]
array([(0, 0.), (0, 0.), (0, 0.)],
     dtype={'names': ['a', 'c'], 'formats': ['<i4', '<f4'], 'offsets': [0, 8], 'itemsize': 12})

Several columns can be assigned at once:

>>> a[['a', 'c']] = (2, 3)
>>> a
array([(2, 0, 3.), (2, 0, 3.), (2, 0, 3.)],
      dtype=[('a', '<i4'), ('b', '<i4'), ('c', '<f4')])

The same mechanism makes it easy to swap the data of two columns:

>>> a[['a', 'c']] = a[['c', 'a']]
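
The swap can be followed step by step in a sketch like the one below. It assumes NumPy is imported as np and relies on multi-field assignment matching fields by position, as with any assignment between structured arrays.

```python
import numpy as np

a = np.zeros(3, dtype=[('a', 'i4'), ('b', 'i4'), ('c', 'f4')])
a[['a', 'c']] = (2, 3)
print(a)

# The right-hand side lists the fields in the opposite order, so
# 'a' receives the old 'c' values and 'c' receives the old 'a' values.
a[['a', 'c']] = a[['c', 'a']]
print(a)
```

Field b is not part of either index, so it is left unchanged.
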
Record Arrays

The fields of a structured array can only be accessed by indexing with the field name, which can be inconvenient. NumPy therefore provides an ndarray subclass, numpy.recarray, whose fields can also be accessed as attributes.

Let’s look at a few examples:

>>> recordarr = np.rec.array([(1, 2., 'Hello'), (2, 3., "World")],
...                    dtype=[('foo', 'i4'), ('bar', 'f4'), ('baz', 'S10')])
>>> recordarr.bar
array([2., 3.], dtype=float32)
>>> recordarr[1:2]
rec.array([(2, 3., b'World')],
      dtype=[('foo', '<i4'), ('bar', '<f4'), ('baz', 'S10')])
>>> recordarr[1:2].foo
array([2], dtype=int32)
>>> recordarr.foo[1:2]
array([2], dtype=int32)
>>> recordarr[1].baz
b'World'

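A minimal sketch, assuming NumPy is imported as np, can confirm that attribute access and index access on a record array refer to the same data, and that writing through an attribute changes the array.

```python
import numpy as np

recordarr = np.rec.array([(1, 2., 'Hello'), (2, 3., 'World')],
                         dtype=[('foo', 'i4'), ('bar', 'f4'), ('baz', 'S10')])

# Attribute access and field-name indexing return the same values.
print(np.array_equal(recordarr.bar, recordarr['bar']))

# Writing through the attribute modifies the record array itself.
recordarr.foo[0] = 10
print(recordarr['foo'])
```
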
np.rec.array returns a record array. Besides calling np.rec.array, you can also obtain a record array as a view of an existing structured array:

In [190]: arr = np.array([(1, 2., 'Hello'), (2, 3., "World")],
     ...:                dtype=[('foo', 'i4'), ('bar', 'f4'), ('baz', 'S10')])

In [191]: arr
Out[191]:
array([(1, 2., b'Hello'), (2, 3., b'World')],
      dtype=[('foo', '<i4'), ('bar', '<f4'), ('baz', 'S10')])

In [192]: arr.view(dtype=np.dtype((np.record, arr.dtype)),
     ...:          type=np.recarray)
Out[192]:
rec.array([(1, 2., b'Hello'), (2, 3., b'World')],
          dtype=[('foo', '<i4'), ('bar', '<f4'), ('baz', 'S10')])

For a record array, the dtype is automatically converted to np.record:

In [200]: recordarr.dtype
Out[200]: dtype((numpy.record, [('foo', '<i4'), ('bar', '<f4'), ('baz', 'S10')]))

To convert back to a plain np.ndarray:

In [202]: recordarr.view(recordarr.dtype.fields or recordarr.dtype, np.ndarray)
Out[202]:
array([(1, 2., b'Hello'), (2, 3., b'World')],
      dtype=[('foo', '<i4'), ('bar', '<f4'), ('baz', 'S10')])

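The round trip between a structured array and a record array can be sketched as follows. NumPy is assumed to be imported as np; rec and back are illustrative names, and arr.view(np.recarray) is used here as a shorter form of the view call, with the dtype left out.

```python
import numpy as np

arr = np.array([(1, 2., 'Hello'), (2, 3., 'World')],
               dtype=[('foo', 'i4'), ('bar', 'f4'), ('baz', 'S10')])

# View the structured array as a record array (no data is copied).
rec = arr.view(np.recarray)

# View it back as a plain ndarray with an ordinary structured dtype.
back = rec.view(rec.dtype.fields or rec.dtype, np.ndarray)

# All three objects share memory, so a write through one is seen by the others.
rec.foo[0] = 42
print(type(rec).__name__, type(back).__name__)
print(arr['foo'], back['foo'])
```
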
Accessing a field of a record array, by index or by attribute, returns a numpy.recarray if the field has a structured type, and a plain numpy.ndarray otherwise:

>>> recordarr = np.rec.array([('Hello', (1, 2)), ("World", (3, 4))],
...                 dtype=[('foo', 'S6'), ('bar', [('A', int), ('B', int)])])
>>> type(recordarr.foo)
<class 'numpy.ndarray'>
>>> type(recordarr.bar)
<class 'numpy.recarray'>
