The efficiency of Python arrays

If we need a list that contains only numbers, an array is more efficient than a list. Arrays support all the operations of mutable sequences, such as removing an element (.pop), inserting an element (.insert), and appending multiple values from another sequence at once (.extend). In addition, arrays provide fast methods for reading from files (.fromfile) and writing to files (.tofile).
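The mutable sequence methods mentioned above can be tried on a small array. This is only a minimal sketch; the variable names and values are made up for illustration.

```python
from array import array

numbers = array('i', [10, 20, 30])

numbers.insert(1, 15)       # insert 15 at index 1
numbers.extend([40, 50])    # append several values at once
last = numbers.pop()        # remove and return the last element

print(numbers.tolist())     # [10, 15, 20, 30, 40]
print(last)                 # 50
```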

Creating an array requires a type code, as in array('d'); the type code identifies the underlying C data type. The Python implementation we generally use is written in C, which is why it is also called CPython.

Python defines the following type codes:

Type code	C type	Python type	Minimum size in bytes	Notes
'b'	signed char	int	1
'B'	unsigned char	int	1
'u'	Py_UNICODE	Unicode character	2	(1)
'h'	signed short	int	2
'H'	unsigned short	int	2
'i'	signed int	int	2
'I'	unsigned int	int	2
'l'	signed long	int	4
'L'	unsigned long	int	4
'q'	signed long long	int	8
'Q'	unsigned long long	int	8
'f'	float	float	4
'd'	double	float	8

Note (1): The 'u' type code corresponds to Python's obsolete Unicode character type (Py_UNICODE, which is wchar_t). Depending on the platform, it may be 16 or 32 bits wide.
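The sizes in the table are minimums, and the actual size depends on the platform. A small sketch like the following, which uses the itemsize attribute of an array, shows the real sizes on the machine it runs on. The 'u' code is left out here because it is obsolete.

```python
from array import array

# Numeric type codes from the table ('u' is skipped because it is obsolete).
codes = 'bBhHiIlLqQfd'

# itemsize is the actual number of bytes per element on this platform.
sizes = {code: array(code).itemsize for code in codes}

for code, size in sizes.items():
    print(code, size)
```

On many common platforms 'i' turns out to be 4 bytes rather than the 2-byte minimum.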

For example, the type code 'b' represents a signed char, so array('b') creates an array whose elements are one-byte integers ranging from -128 to 127. This saves a lot of space when the sequence is long and holds many numbers.

An array is typed, so it cannot hold values that do not match its type code.
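A short sketch of what happens when a value does not fit the type code. The array name and the values are chosen only for illustration.

```python
from array import array

small = array('b', [-128, 0, 127])    # every value fits in one signed byte

errors = []
try:
    small.append(128)                 # out of range for a signed char
except OverflowError as exc:
    errors.append(type(exc).__name__)

try:
    small.append(1.5)                 # a float is rejected by an integer array
except TypeError as exc:
    errors.append(type(exc).__name__)

print(errors)                         # ['OverflowError', 'TypeError']
print(len(small))                     # 3, nothing was appended
```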

Luciano Ramalho gives an example to illustrate the efficiency of arrays: an array of 10 million random floating-point numbers is created, written to a file, and then read back.

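import logging
from array import array
from random import random

# Show INFO messages; this format string is an assumption, not part of the original example
logging.basicConfig(level=logging.INFO, format='%(levelname)s - %(message)s')

# Build an array of 10 million doubles from a generator expression
floats = array('d', (random() for i in range(10 ** 7)))
logging.info('floats[-1] -> %s', floats[-1])

# Write the array to a binary file
with open('floats.bin', 'wb') as fp:
    floats.tofile(fp)

# Read 10 million doubles back into a new, empty array
floats2 = array('d')
with open('floats.bin', 'rb') as fp:
    floats2.fromfile(fp, 10 ** 7)

logging.info('floats2[-1] -> %s', floats2[-1])
logging.info('floats2==floats -> %s', floats2 == floats)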

Running results:

INFO - floats[-1] -> 0.9160358679542017
INFO - floats2[-1] -> 0.9160358679542017
INFO - floats2==floats -> True

The code performance is analyzed through the cProfile module, and the following results are output:

INFO - 192 function calls (180 primitive calls) in 0.098 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.061    0.061    0.061    0.061 {method 'fromfile' of 'array.array' objects}
        1    0.030    0.030    0.030    0.030 {method 'tofile' of 'array.array' objects}
        2    0.007    0.003    0.003 {built-in method io.open}
   ...
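The article does not show how the profile was collected. One possible way, sketched here with the standard cProfile and pstats modules, is to wrap the write and read steps in a function and profile that call. The function name write_and_read, the temporary file location, and the smaller element count are assumptions made to keep the sketch short.

```python
import cProfile
import io
import os
import pstats
import tempfile
from array import array
from random import random


def write_and_read(n, path):
    """Write n random doubles to path, read them back, and compare."""
    floats = array('d', (random() for _ in range(n)))
    with open(path, 'wb') as fp:
        floats.tofile(fp)
    floats2 = array('d')
    with open(path, 'rb') as fp:
        floats2.fromfile(fp, n)
    return floats == floats2


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'floats.bin')
    profiler = cProfile.Profile()
    profiler.enable()
    same = write_and_read(10 ** 5, path)   # assumption: smaller than 10 ** 7
    profiler.disable()

# Collect the report in a string, sorted by cumulative time.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats('cumulative').print_stats()
report_text = report.getvalue()
print(report_text)
```

Unlike the output above, this report also includes the time spent generating the random numbers, because that step is inside the profiled function.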

As you can see, writing the 10 million floating-point numbers to a file and reading them back takes about 0.1 seconds in total (0.098 seconds in the profile above). The resulting file is 80,000,000 bytes, that is 8 bytes per double, or about 76 MiB.
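The size of the binary file follows directly from the number of elements multiplied by the item size, since tofile writes the raw values with no overhead. A minimal sketch, using a much smaller array and a temporary file as stand-ins:

```python
import os
import tempfile
from array import array

n = 1000                                  # small stand-in for 10 ** 7
floats = array('d', [0.5] * n)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'floats.bin')
    with open(path, 'wb') as fp:
        floats.tofile(fp)
    size_on_disk = os.path.getsize(path)

expected = n * floats.itemsize            # elements times bytes per element
full_size = 10 ** 7 * floats.itemsize     # size for 10 million doubles

print(size_on_disk, expected)
print(full_size)
```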

1. First, a generator expression creates an iterable (** is the power operator), and a double-precision floating-point array (type code 'd') is built from it.
2. Index -1 retrieves the last element of the array.
3. 'wb' opens the file in binary write mode: w is short for write and b is short for binary, a number system that uses only 0 and 1.
4. When creating an array, you can initialize it with values or create an empty array, such as array('d').
5. The second argument of the fromfile() method specifies the number of items to read from the file.
6. You can see that the array read from the file is exactly the same as the array that was saved.
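The item count passed to fromfile() can be illustrated with a tiny file. In this sketch the file location and values are made up. Per the array module documentation, asking for more items than the file holds raises EOFError, while the items that were available are still added.

```python
import os
import tempfile
from array import array

saved = array('d', [1.0, 2.0, 3.0, 4.0])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'floats.bin')
    with open(path, 'wb') as fp:
        saved.tofile(fp)

    # Read only the first two items.
    first_two = array('d')
    with open(path, 'rb') as fp:
        first_two.fromfile(fp, 2)

    # Ask for more items than the file contains.
    too_many = array('d')
    hit_eof = False
    with open(path, 'rb') as fp:
        try:
            too_many.fromfile(fp, 10)
        except EOFError:
            hit_eof = True

print(first_two.tolist())    # [1.0, 2.0]
print(hit_eof)               # True
print(too_many.tolist())     # [1.0, 2.0, 3.0, 4.0]
```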

Because array.tofile writes the raw values to a binary file, it is much faster than writing the numbers to a text file one per line. According to statistics, the binary write can be nearly seven times faster.
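A rough way to compare the two approaches yourself is sketched below. The element count, file names, and the one-number-per-line text format are assumptions, and the measured ratio will vary by machine, so no particular speedup is claimed here.

```python
import os
import tempfile
import time
from array import array
from random import random

n = 10 ** 5                       # assumption: smaller than 10 ** 7 to keep it quick
floats = array('d', (random() for _ in range(n)))

with tempfile.TemporaryDirectory() as tmp:
    bin_path = os.path.join(tmp, 'floats.bin')
    txt_path = os.path.join(tmp, 'floats.txt')

    # Binary write: the raw bytes of the array go straight to the file.
    start = time.perf_counter()
    with open(bin_path, 'wb') as fp:
        floats.tofile(fp)
    binary_seconds = time.perf_counter() - start

    # Text write: each float is converted to a string, one per line.
    start = time.perf_counter()
    with open(txt_path, 'w') as fp:
        for x in floats:
            fp.write(repr(x) + '\n')
    text_seconds = time.perf_counter() - start

    binary_size = os.path.getsize(bin_path)
    text_size = os.path.getsize(txt_path)

print(f'binary: {binary_seconds:.4f} s, {binary_size} bytes')
print(f'text:   {text_seconds:.4f} s, {text_size} bytes')
```

The text file is also larger, because each number needs many characters instead of a fixed 8 bytes.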
