The nature of Python sequence types

In this blog, we’ll explore Python’s various “sequence” classes, and the nature of the three commonly used built-in data structures — lists, tuples, and strings.

If you haven’t noticed, these classes all have one obvious common feature: they can be used to store multiple data elements. The most important feature is that each class supports subscript (index) access to the elements of the sequence, such as the syntax Seq[I]. Each of the above classes is represented by a simple data structure called an array.

But readers familiar with Python may know that there are some differences between these three data structures: tuples and strings, for example, cannot be modified, and lists can.

An array structure in computer memory

In computer architecture, we know that computer main memory consists of bits of information, which are usually grouped into larger units that depend on the precise architecture of the system. A typical unit is a byte, equivalent to eight bits.

Computer systems have a huge number of bytes of storage, so how do we find out which bytes of our information is stored? The answer is the familiar storage address. Based on the storage address, any byte in main memory can be accessed effectively. In effect, each byte of storage is associated with a unique binary number as its address. In the following figure, each byte is assigned a storage address:

In general,A programming languageRecords the relationship between an identifier and the address stored by its associated value. For example, when we declare an identifierMay be associated with a value in memory, and the identifierIt might be associated with some other value. A set of related variables can be stored, one after another, in a contiguous area of computer memory. We call this approachAn array of.

Consider an example in Python where a text string HELLO is stored as an ordered list of characters, assuming that each Unicode character of the string requires two bytes of storage. The bottom number is the index value of the string.

As you can see, arrays can store multiple values without having to construct multiple variables with specific indexes to specify each item within them, and are used in almost all programming languages (such as C, Java, C#, C++), but Python has the advantage. If you’re familiar with Python building lists, you probably know that you don’t need to define arrays or lists in advance. Instead, in Python, lists are dynamic, and we can constantly add as many data elements as we want to them. Next, let’s take a look at Python lists (which can be quickly skimmed or skipped for those already familiar).

Python list

Python list operations

  • Syntax for creating a list:

[ele1, ele2, ele3, ele4, ...]

  • Syntax for creating tuples:

(ele1, ele2, ele3, ele4, ...)

Tuples are more efficient than lists because tuples are immutable, so there is no need to create dynamic arrays with free space.

We start by creating a list in Python’s IDE, and then take a look at the built-in operations in the list part. We create a list called test_List, then modify (insert or delete) elements, reverse or clear the list, as follows:

>>> test_list = []	Create an empty list named TEST_list
>>> test_list.append("Hello")
>>> test_list.append("World")
>>> test_list
['Hello'.'World']
>>> test_list = ["Hello"."Array".2019."easy learning"."DataStructure"]	# reassign test_list
>>> len(test_list)	Find the length of the list
5
>>> test_list[2] = 1024	# Change the list element
>>> test_list
['Hello'.'Array'.1024.'easy learning'.'DataStructure'] > > >>>> test_list.insert(1."I love")	Insert an element at the specified position in the list
>>> test_list
['Hello'.'I love'.'Array'.1024.'easy learning'.'DataStructure']
>>> test_list.append(2020)	Add an element to the end of the list
>>> test_list
['Hello'.'I love'.'Array'.1024.'easy learning'.'DataStructure'.2020] > > >>>> test_list.pop(1)	# delete the element at the specified position
'I love'
>>> test_list.remove(2020)	Delete the specified element
>>> 
>>> test_list.index('Hello')	Find the index value of an element
0
>>> test_list.index('hello')	If an element is not in the list, ValueError is returned
Traceback (most recent call last):
  File "<pyshell#11>", line 1.in <module>
    test_list.index('hello')
ValueError: 'hello' is not in list
>>> 
>>> test_list.reverse()	# Reverse the entire list
>>> test_list
['DataStructure'.'easy learning'.2019.'Array'.'Hello']
>>> test_list.clear()	# Clear the list
>>> test_list
[]
Copy the code

If we look at the code above, we can see that the operations related to list — add, delete, change and check — are quite powerful. There are also some built-in methods that are not shown here, which are left to the reader to discover and experience.

The basics behind memory allocation for Python lists

So let’s look at this extra spatial demonstration through coding practices and the relationship between the actual size of an array held in memory and a given size.

Head to Jupyter Notebook for practice. Or use whatever editor or development environment you choose. Copy the code written below.

Importing the sys module makes it easy to use the gestsizeof function
import sys

# set n
n = 20
# set empty list
list = []
for i in range(n):
    a = len(list)
    Getsizeof is called to give the true number of bytes of an object stored in Python
    b = sys.getsizeof(list)
    print('Length:{0:3d}; Size of bytes:{1:4d}'.format(a, b))
    # Increase length by one
    list.append(n)
Copy the code

Run the code and see the following output:

Length: 1.


When Length:10, the number of bytes jumps from 64 to 192, saving references to 16 objects,


It jumps until Length: 17, so 264 bytes should theoretically hold 25 objects


But because we set n=20 in our code, then the program stops.

We can see that Python’s built-in List class is smart enough to know that it will give them extra sizes when extra space is needed to allocate data, so how exactly is this implemented?

Well, the answer is a dynamic array. A list() class, enclosed in brackets [], instructs the various methods on a book or document: append, insert, pop… After a tour of various ides, yes I think I have learned.

But in fact, the principle behind it is really not simple, for example, I take an example: A[-1] how to achieve this operation? How to implement the list slice function? How do I write pop() to delete the rightmost element of the list by default (popleft simply deletes the leftmost element)? . These features are great to use, but really hard to implement on your own (I’m still learning, too, so please shush!). If we can learn and understand it, it will certainly enhance our understanding of arrays as structures.

The dynamic array

What is a dynamic array

A dynamic array is a contiguous region of memory that grows dynamically as new data is inserted. In static arrays, we need to specify the size at allocation time. When we define an array, the computer has already allocated memory for us to store it, and we can’t actually expand the array because its size is fixed. For example, if we allocate an array of size 10, we cannot insert more than 10 items.

But dynamic arrays automatically resize when needed. This is a bit like the Python list we use to store any number of items without specifying the size at allocation time.

So the key to implementing a dynamic array implementation is — how do you extend an array? When list1 is full and there are new elements to add to the list, we do the following steps to overcome the size limitation:

  1. Allocate a new array list2 with a larger capacity
  2. Set list2[I] = list1[I] (I =0,1,2… , n-1), where n is the current number of the project
  3. Set list1 = list2, that is, list2 is referring to our new list as a new array.
  4. Then, we simply insert (add) the new element to our list list1.

The next question to consider is, how big should the new array be? In general, the size of the new array is twice the size of the old one. We will implement the concept of dynamic arrays programmatically in Python and create a simple code that is much less powerful than Python.

Python code to implement dynamic arrays

In Python, we use the ctypes built-in library to create our own dynamic array classes. Since the Ctypes module provides support for raw arrays, you can learn about Ctypes in the official documentation for faster learning about arrays. For Python’s public and private methods, we use double underscores **__** before method names to keep them hidden as follows:

# -*- coding: utf-8 -*-
# @Time : 2019-11-01 17:10
# @Author : yuzhou_1su
# @ContactMe : https://blog.csdn.net/yuzhou_1shu
# @File : DynamicArray.py
# @Software : PyCharm

import ctypes


class DynamicArray:
    """A dynamic array class akin to a simplified Python list."""

    def __init__(self):
        """Create an empty array."""
        self.n = 0             # count actual elements
        self.capacity = 1      # default array capacity
        self.A = self._make_array(self.capacity)      # low-level array

    def is_empty(self):
        """ Return True if array is empty"""
        return self.n == 0

    def __len__(self):
        """Return numbers of elements stored in the array."""
        return self.n

    def __getitem__(self, i):
        """Return element at index i."""
        if not 0 <= i < self.n:
            # Check it i index is in bounds of array
            raise ValueError('invalid index')
        return self.A[i]

    def append(self, obj):
        """Add object to end of the array."""
        if self.n == self.capacity:
            # Double capacity if not enough room
            self._resize(2 * self.capacity)
        self.A[self.n] = obj    # Set self.n index to obj
        self.n += 1

    def _resize(self, c):
        """Resize internal array to capacity c."""
        B = self._make_array(c)     # New bigger array
        for k in range(self.n):    # Reference all existing values
            B[k] = self.A[k]
        self.A = B          # Call A the new bigger array
        self.capacity = c   # Reset the capacity

    @staticmethod
    def _make_array(c):
        """Return new array with capacity c."""
        return (c * ctypes.py_object)()

    def insert(self, k, value):
        """Insert value at position k."""
        if self.n == self.capacity:
            self._resize(2 * self.capacity)
        for j in range(self.n, k, - 1):
            self.A[j] = self.A[j- 1]
        self.A[k] = value
        self.n += 1

    def pop(self, index=0):
        """Remove item at index (default first)."""
        if index >= self.n or index < 0:
            raise ValueError('invalid index')
        for i in range(index, self.n- 1):
            self.A[i] = self.A[i+1]
        self.A[self.n - 1] = None
        self.n -= 1

    def remove(self, value):
        """Remove the first occurrence of a value in the array."""
        for k in range(self.n):
            if self.A[k] == value:
                for j in range(k, self.n - 1):
                    self.A[j] = self.A[j+1]
                self.A[self.n - 1] = None
                self.n -= 1
                return
        raise ValueError('value not found')

    def _print(self):
        """Print the array."""
        for i in range(self.n):
            print(self.A[i], end=' ')
        print()

Copy the code

Test dynamic array Python code

Above we have implemented a dynamic array class, I believe are very excited, next let’s test, see if it can succeed? In the same file, write the following test code:

def main(a):
    # Instantiate
    mylist = DynamicArray()

    # Append new element
    mylist.append(10)
    mylist.append(9)
    mylist.append(8)
    # Insert new element in given position
    mylist.insert(1.1024)
    mylist.insert(2.2019)
    # Check length
    print('The array length is: ', mylist.__len__())
    # Print the array
    print('Print the array: ')
    mylist._print()
    # Index
    print('The element at index 1 is :', mylist[1])
    # Remove element
    print('Remove 2019 in array:')
    mylist.remove(2019)
    mylist._print()
    # Pop element in given position
    print('Pop pos 2 in array:')
    # mylist.pop()
    mylist.pop(2)
    mylist._print()


if __name__ == '__main__':
    main()
Copy the code

The test results

Exciting moment. Here are the results. Please combine the test code with the structure of the array to understand, if there are omissions, welcome to point out.

The array length is:  5Print the array:10 1024 2019 9 8 
The element at index 1 is : 1024
Remove 2019 in array:
10 1024 9 8 
Pop pos 2 in array:
10 1024 8 
Copy the code

conclusion

From this introduction, we know that arrays have static and dynamic types. In this blog, we’ll focus on what a dynamic array is and implement it in Python code. Hopefully, you’ll learn about arrays in a complicated way. To sum up the statement, in fact, the more simple the operation, the realization principle behind may be very complex.