A data structure is a collection of data elements that have certain relationships with each other
I. Storage modes of data structures
Sequential storage structure: the logical relationships between data elements are represented by their relative positions in memory.
Chained storage structure: a pointer is added to each data element to hold the address of another element; this pointer represents the logical relationships between data elements.
II. Logical structure and physical structure
Logical structure: the interrelationships between data elements
Four basic types
Set: the data elements in the structure have no relationship other than belonging to the same set
Linear structure: There is a one-to-one relationship between data elements in a structure
Tree structure: There is a one-to-many relationship between data elements in the structure
Graph structure or mesh structure: There is a many-to-many relationship between data elements in a structure
III. Operations on data structures
The main operations on data structures include: create, destroy, insert, delete, modify, search (lookup), access, and sort
IV. Linear tables
1. Linear structure
The linear structure is one of the most common and simplest data structures
Its basic feature is that the data elements in a linear table are ordered and finite
In this structure:
① There is a unique data element called “the first”;
② There is a unique data element called “the last”;
③ Except for the first element, every element has a unique direct predecessor;
④ Except for the last element, every element has a unique direct successor
2. Sequential storage of linear tables
An array is used to describe a sequential table: logically adjacent elements are stored in physically adjacent cells
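As a minimal sketch (class and method names are illustrative), a sequential table can be modeled in Python by a fixed-capacity array in which insertion and deletion shift elements to keep them contiguous:

class SeqList:
    def __init__(self, capacity=10):
        self.data = [None] * capacity   # contiguous storage cells
        self.length = 0                 # number of elements actually stored

    def insert(self, index, value):
        # Insert value at position index, shifting later elements one cell to the right
        if self.length >= len(self.data) or not 0 <= index <= self.length:
            raise IndexError("list full or index out of range")
        for k in range(self.length, index, -1):
            self.data[k] = self.data[k - 1]
        self.data[index] = value
        self.length += 1

    def delete(self, index):
        # Delete the element at index, shifting later elements one cell to the left
        if not 0 <= index < self.length:
            raise IndexError("index out of range")
        value = self.data[index]
        for k in range(index, self.length - 1):
            self.data[k] = self.data[k + 1]
        self.length -= 1
        return value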
3. Chain storage of linear tables
A group of arbitrary storage cells is used to store the data elements of a linear list; such a structure is called a linked list
The storage cells occupied by the nodes of a linked list may be contiguous, non-contiguous, or even scattered anywhere in memory
To represent the logical relationship between nodes correctly, each node stores, besides its own value, the address (or position) of its direct successor; this address is called a pointer or link. These two parts together form the node structure of a linked list
A linked list is a linear list of n nodes linked together in their logical order through the pointer field of each node. A linked list whose nodes contain only one pointer field is called a singly linked list
For convenience of operation, a head pointer is always set up pointing to the first node of the list (a dummy head node may also be attached before the first node)
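A minimal sketch of a singly linked list with a head pointer (class and method names are illustrative):

class Node:
    def __init__(self, value, next=None):
        self.value = value   # data field
        self.next = next     # pointer (link) to the direct successor

class SinglyLinkedList:
    def __init__(self):
        self.head = None     # head pointer to the first node

    def prepend(self, value):
        # Insert a new node before the current first node
        self.head = Node(value, self.head)

    def traverse(self):
        # Visit the nodes in their logical (linked) order
        node = self.head
        while node is not None:
            yield node.value
            node = node.next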
4. Bidirectional linked lists
A double (bidirectional) linked list has two pointer fields in each node: prior, which points to the node’s direct predecessor, and next, which points to its direct successor. The list therefore contains two chains running in opposite directions, which is why it is called a double linked list
As with singly linked lists, a head node can be attached to a bidirectional linked list to simplify some operations
Linking the head and the tail also produces a circular list, called a bidirectional circular linked list
Bidirectional linked lists are introduced to overcome the drawback of singly linked lists that a node’s direct predecessor can only be reached by traversing again from the head
5. Node and type definition of bidirectional linked list
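A minimal Python sketch of the node of a bidirectional linked list, with the prior and next pointer fields described above, plus an insertion that maintains both chains (names are illustrative):

class DNode:
    def __init__(self, value):
        self.value = value   # data field
        self.prior = None    # pointer to the direct predecessor
        self.next = None     # pointer to the direct successor

def insert_after(node, value):
    # Insert a new node right after `node`, fixing both chains
    new = DNode(value)
    new.prior = node
    new.next = node.next
    if node.next is not None:
        node.next.prior = new
    node.next = new
    return new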
V. Stacks and queues
Stacks and queues are two widely used data structures. Both are derived from the linear table: they are “operation-constrained” linear tables
Stacks are implemented in a variety of ways on computers:
Hard stack: Implemented using certain register sets or similar hardware in the CPU or using special areas of memory, this type of stack has limited capacity but is very fast
Soft stack: implemented mainly in memory; it can be realized either statically or dynamically
1. Basic concept of stack
Stack: a linear table in which insertion and deletion are restricted to one end of the table. It is also called a LIFO (Last In First Out) or FILO (First In Last Out) table
Top: the end of the stack where insertion and deletion are allowed, also known as the tail of the table. A top pointer (top) refers to the top element
Bottom: the fixed end, also known as the head of the table
Empty stack: a stack that contains no elements
2. Sequential storage representation of stacks
The sequential storage structure of a stack is called a sequential stack. As with a sequential linear table, a one-dimensional array stores the elements; depending on whether the array can be enlarged as needed, sequential stacks are divided into static sequential stacks and dynamic sequential stacks
Static sequential stacks are simple to implement, but cannot increase stack storage space as needed
The dynamic sequential stack increases the storage space of the stack as needed, but the implementation is somewhat complicated
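A minimal sketch of a dynamic sequential stack backed by a Python list, whose storage grows as needed (names are illustrative):

class SeqStack:
    def __init__(self):
        self.items = []           # dynamic storage: grows as elements are pushed

    def is_empty(self):
        return len(self.items) == 0

    def push(self, value):
        self.items.append(value)  # insert at the top

    def pop(self):
        if self.is_empty():
            raise IndexError("pop from an empty stack")
        return self.items.pop()   # delete from the top

    def peek(self):
        if self.is_empty():
            raise IndexError("peek on an empty stack")
        return self.items[-1]     # the element the top pointer refers to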
3. Chain storage representation of stacks
The chain storage structure of a stack is called a linked stack. It is a singly linked list with restricted operations: insertion and deletion take place only at the head of the list. There is therefore no need to attach a dummy head node as with an ordinary singly linked list; the top pointer is simply the head pointer of the list
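A minimal sketch of a linked stack in which the top pointer is at the same time the head pointer of the list (names are illustrative):

class StackNode:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

class LinkedStack:
    def __init__(self):
        self.top = None                          # top pointer, also the head pointer of the list

    def push(self, value):
        self.top = StackNode(value, self.top)    # insert at the head

    def pop(self):
        if self.top is None:
            raise IndexError("pop from an empty stack")
        value = self.top.value
        self.top = self.top.next                 # delete at the head
        return value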
4. Applications of stacks
Because of its inherent last-in, first-out property, the stack is a common tool and data structure in program design. Converting an infix expression to a postfix expression is a classic example of a stack application. Typical uses include:
Postfix expression evaluation (a sketch follows this list)
Conversion of an infix expression to a postfix expression
Implementation of recursive calls with a stack
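A minimal sketch of postfix expression evaluation with a stack, assuming space-separated integer operands and the binary operators + - * / (the function name is illustrative):

def eval_postfix(expression):
    stack = []
    for token in expression.split():
        if token in ("+", "-", "*", "/"):
            right = stack.pop()           # operands come off in reverse order
            left = stack.pop()
            if token == "+":
                stack.append(left + right)
            elif token == "-":
                stack.append(left - right)
            elif token == "*":
                stack.append(left * right)
            else:
                stack.append(left / right)
        else:
            stack.append(int(token))      # operands are pushed as they appear
    return stack.pop()

print(eval_postfix("3 4 + 5 *"))          # (3 + 4) * 5 = 35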
5. Queues
Queue: a linear list with restricted operations; insertion is allowed only at one end of the table and deletion only at the other, so it is a First In First Out (FIFO) table
Front: The end that allows deletion is called the front
Rear: The end where insertion is allowed is called the rear
A queue with no elements is called an empty queue. After the elements a1, a2, …, an are inserted into an empty queue, a1 is the front element and an is the rear element. Obviously, the elements can only leave the queue in the order a1, a2, …, an; that is, the queue changes on a first-in, first-out basis
6. Circular queues
To make full use of the vector space and overcome the phenomenon of “false overflow”, the vector space allocated to the queue is treated as a ring whose end joins its beginning; such a queue is called a circular queue
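A minimal sketch of a circular queue over a fixed-size array; one slot is deliberately kept empty so that a full queue can be told apart from an empty one (names are illustrative):

class CircularQueue:
    def __init__(self, capacity):
        self.data = [None] * (capacity + 1)
        self.front = 0                      # index of the front element
        self.rear = 0                       # index one past the rear element

    def is_empty(self):
        return self.front == self.rear

    def is_full(self):
        return (self.rear + 1) % len(self.data) == self.front

    def enqueue(self, value):
        if self.is_full():
            raise OverflowError("queue is full")
        self.data[self.rear] = value
        self.rear = (self.rear + 1) % len(self.data)    # wrap around the ring

    def dequeue(self):
        if self.is_empty():
            raise IndexError("dequeue from an empty queue")
        value = self.data[self.front]
        self.front = (self.front + 1) % len(self.data)  # wrap around the ring
        return value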
7. Chain representation and implementation of queues
The chain storage structure of a queue is called a chain queue: a singly linked list in which deletion is allowed only at the head and insertion only at the tail
Two kinds of components are needed: data element nodes, plus a front (head) pointer and a rear (tail) pointer for the queue
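A minimal sketch of a chain queue with front and rear pointers (names are illustrative):

class QNode:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

class LinkedQueue:
    def __init__(self):
        self.front = None    # points to the node that will be deleted next
        self.rear = None     # points to the node inserted most recently

    def enqueue(self, value):
        node = QNode(value)
        if self.rear is None:            # empty queue: both pointers go to the new node
            self.front = self.rear = node
        else:
            self.rear.next = node
            self.rear = node

    def dequeue(self):
        if self.front is None:
            raise IndexError("dequeue from an empty queue")
        value = self.front.value
        self.front = self.front.next
        if self.front is None:           # the queue became empty: reset the rear pointer too
            self.rear = None
        return value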
VI. Trees
The tree is a very important nonlinear structure. Intuitively, a tree is a hierarchical structure defined by branching relationships
Trees are also widely used in computing: for example, compilers use trees to represent the syntactic structure of source programs, database systems organize information with trees, and trees are used to describe the execution process when analyzing the behavior of algorithms
1. Tree definition
A tree is a finite set T of n (n ≥ 0) nodes. If n = 0 it is called an empty tree; otherwise there is exactly one special node called the root of the tree
If n > 1, the remaining nodes are divided into m (m > 0) disjoint subsets T1, T2, …, Tm, each of which is itself a tree and is called a subtree of the root
This is a recursive definition: a tree is defined in terms of trees, and a tree with only one node consists of just the root
2. Basic terminology for trees
Node: A data element and its branches pointing to its children
Degree of a node and degree of a tree: the number of subtrees a node owns is called the degree of the node, and the maximum node degree in the tree is called the degree of the tree
Leaf node and non-leaf node
Nodes in the tree with degree 0 are called leaf nodes (or terminal nodes). Correspondingly, nodes whose degree is not 0 are called non-leaf nodes (non-terminal nodes or branch nodes). Branch nodes other than the root are also called internal nodes
Child, parent, and sibling
The root of a node’s subtree is called that node’s child, and that node is, in turn, the parent of its child
Forest: a set of m (m ≥ 0) mutually disjoint trees. Obviously, if the root node of a tree is deleted, the remaining subtrees constitute a forest
3. Tree representation
An inverted tree is the most common representation
Nested sets: a collection of sets in which any two sets either do not intersect or one contains the other
Generalized table form
4. Binary tree
A binary tree is a finite set of n (n ≥ 0) nodes. It is called empty if n = 0; otherwise there is exactly one special node called the root
If n>1, the remaining nodes are divided into two disjoint subsets T1 and T2, which are called left and right subtrees respectively, and both of them are binary trees
It follows that the definition of binary trees is recursive
Binary trees play a very important role among tree structures: they have a simple structure, high storage efficiency and relatively simple algorithms, and any tree can easily be converted into a binary tree. The terminology introduced in the previous section also applies to binary trees
5. Basic form of binary tree
6. Properties of binary trees
Property 1: In a non-empty binary tree, level i contains at most 2^(i-1) nodes (i ≥ 1)
Property 2: A binary tree of depth k has at most 2^k - 1 nodes (k ≥ 1)
Property 3: For any binary tree, if the number of leaf nodes (nodes of degree 0) is n0 and the number of nodes of degree 2 is n2, then n0 = n2 + 1. (This follows because the total number of nodes is n0 + n1 + n2, and it also equals the number of branches plus one, i.e. n1 + 2n2 + 1)
7. Full binary tree
A binary tree of depth k with 2^k - 1 nodes is called a full binary tree.
Full binary tree features:
(1) The number of nodes on each layer is always the maximum number of nodes
(2) Every branch node has both a left and a right subtree
(3) The nodes of a full binary tree can be numbered consecutively, starting from the root and following the rule “top to bottom, left to right”
8. Complete binary trees
Complete binary tree: a binary tree with n nodes and depth k in which every node corresponds to the node with the same number (from 1 to n) in the full binary tree of depth k
Equivalently, the first n nodes (numbered 1 to n) of a full binary tree of depth k form a complete binary tree of depth k
The number of nodes n of a complete binary tree of depth k satisfies 2^(k-1) ≤ n ≤ 2^k - 1
A complete binary tree is a part of a full binary tree, and a full binary tree is a special case of a complete binary tree
The characteristic of a complete binary tree: if its depth is k, all leaf nodes appear on level k or level k-1, and for any node, if the maximum level of its right subtree is l, the maximum level of its left subtree is l or l + 1
9. Storage structures of binary trees
Sequential storage structure
The data elements of a complete binary tree are stored “top down, left to right” in a group of storage cells with contiguous addresses
The node numbered i in the complete binary tree is stored at subscript i - 1 of the one-dimensional array. For a general binary tree, each node is stored at the array position it would occupy in a complete binary tree of the same depth, so some positions may be left empty
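As a small illustration of this numbering scheme (function names are illustrative), the parent and children of a node stored at 0-based array index i can be found by index arithmetic alone:

def parent(i):
    return (i - 1) // 2 if i > 0 else None

def left_child(i, n):
    j = 2 * i + 1                 # n is the number of stored nodes
    return j if j < n else None

def right_child(i, n):
    j = 2 * i + 2
    return j if j < n else None

tree = ['A', 'B', 'C', 'D', 'E', 'F']    # a complete binary tree with 6 nodes
print(parent(4), left_child(1, len(tree)), left_child(2, len(tree)))   # 1 3 5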
10. Chain storage structure
Different node structures can be designed to form different chain storage structures
A binary linked list node has three fields: a data field and two pointer fields pointing to the left and right child nodes respectively
A ternary linked list node adds, besides the three fields of the binary linked list node, a pointer field that points to the parent node
Binary tree chain storage form
Figure: a general binary tree and structure diagrams of its storage as a binary linked list and as a ternary linked list
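Minimal sketches of the two node structures just described (class and field names are illustrative):

class BinNode:                  # binary linked list node
    def __init__(self, data):
        self.data = data        # data field
        self.lchild = None      # pointer to the left child
        self.rchild = None      # pointer to the right child

class TriNode(BinNode):         # ternary linked list node: adds a parent pointer
    def __init__(self, data):
        super().__init__(data)
        self.parent = None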
Traversing a binary tree and its applications
Traversing a binary tree means visiting every node of the tree along some search path so that each node is visited once and only once
“Visiting” a node means doing some kind of processing on it, for example printing its information or modifying its value
A binary tree is a nonlinear structure in which each node may have a left and a right subtree, so a rule is needed that arranges the nodes into a linear sequence in order to make traversal convenient
A binary tree consists of three basic parts: the root node, the left subtree and the right subtree. If these three parts can be traversed in turn, the whole binary tree has been traversed
If L, D and R denote traversing the left subtree, visiting the root node and traversing the right subtree respectively, there are six traversal schemes: DLR, LDR, LRD, DRL, RDL and RLD. If the left subtree is always traversed before the right, only the first three remain (sketched after this list):
DLR: preorder (root-first) traversal
LDR: inorder traversal
LRD: postorder traversal
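Recursive sketches of the three traversal orders, assuming a node with data, lchild and rchild fields such as the BinNode class sketched earlier:

def preorder(node):                 # DLR
    if node is not None:
        print(node.data, end=' ')   # visit the root first
        preorder(node.lchild)
        preorder(node.rchild)

def inorder(node):                  # LDR
    if node is not None:
        inorder(node.lchild)
        print(node.data, end=' ')   # visit the root between the two subtrees
        inorder(node.rchild)

def postorder(node):                # LRD
    if node is not None:
        postorder(node.lchild)
        postorder(node.rchild)
        print(node.data, end=' ')   # visit the root last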
Building a binary search tree
A binary search tree is also known as a binary sort tree
The tree is built recursively
Insert, delete, update and search operations are implemented on the tree (a sketch of insert and search follows below)
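A minimal sketch of a recursively built binary search tree with insert and search; delete and update are left out for brevity, and all names are illustrative:

class BSTNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def bst_insert(root, key):
    # Insert key and return the (possibly new) root of this subtree
    if root is None:
        return BSTNode(key)
    if key < root.key:
        root.left = bst_insert(root.left, key)
    elif key > root.key:
        root.right = bst_insert(root.right, key)
    return root                       # duplicate keys are ignored

def bst_search(root, key):
    if root is None or root.key == key:
        return root
    if key < root.key:
        return bst_search(root.left, key)
    return bst_search(root.right, key)

root = None
for k in [54, 26, 93, 17, 77, 31]:
    root = bst_insert(root, k)
print(bst_search(root, 77) is not None)   # True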
VII. Search algorithms
1. Sequential search
Starting from the first element of the table, the given value is compared with the key of each element in turn; if a match is found, the desired element has been located, otherwise the element is not in the table and the search fails. On average about half of the elements in the table are compared, and in the worst case n comparisons are needed
If the linear table is unordered, only sequential search can be used, whether the storage structure is sequential or chained
Even for an ordered linear table, if it uses chained storage, only sequential search can be used (a sketch follows)
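A minimal sketch of sequential search (the function name is illustrative); it works on ordered and unordered lists alike:

def sequential_search(alist, data):
    for i, item in enumerate(alist):   # compare element by element from the front
        if item == data:
            return i                   # found: return the position
    return -1                          # not found

print(sequential_search([54, 26, 93, 17, 77], 17))   # 3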
2. Binary search
Binary search, also known as half search, is a high efficiency search method
Algorithm idea: first compare the key of the record in the middle of the table with the search key; if they are equal, the search succeeds. Otherwise the middle record divides the table into a front sub-table and a back sub-table: if the middle record’s key is greater than the search key, the front sub-table is searched further, otherwise the back sub-table is searched further
Repeat the above process until a record meeting the criterion is found (the search succeeds) or the sub-table becomes empty (the search fails)
Advantages and disadvantages: binary search needs few comparisons, is fast and has good average performance; its disadvantage is that the table must be ordered and is hard to insert into and delete from, so binary search is suitable for ordered tables that are searched often but changed rarely
Code implementation:
def binary_chop(alist, data):
    """Recursive binary search.
    :param alist: sorted list to search
    :param data: value to look for
    :return: True if data is in alist, otherwise False
    """
    n = len(alist)
    if n < 1:
        return False
    mid = n // 2  # index of the middle element
    if alist[mid] > data:
        return binary_chop(alist[0:mid], data)
    elif alist[mid] < data:
        return binary_chop(alist[mid + 1:], data)
    else:
        return True
VIII. Sorting algorithms
Sorting is the process of rearranging a batch (group) of records, given in arbitrary order, into a sequence of records ordered by key. It is defined as follows:
Given a sequence of records {R1, R2, …, Rn} whose corresponding key sequence is {K1, K2, …, Kn}, determine a permutation p1, p2, …, pn of 1, 2, …, n such that the corresponding keys satisfy the non-decreasing (or non-increasing) relation Kp1 ≤ Kp2 ≤ … ≤ Kpn, giving the sorted record sequence {Rp1, Rp2, …, Rpn}. This operation is called sorting
The key Ki can be the primary key of record Ri, a secondary key, or a combination of several data items
Ki is the primary keyword: the sorted result is unique
Ki is a secondary keyword: sorting results are not unique
1. Bubble sort
Bubble Sort is a simple sorting algorithm. It repeatedly walks through the sequence to be sorted, comparing two adjacent elements at a time and swapping them if they are in the wrong order. The pass over the sequence is repeated until no more swaps are needed, which means the list is sorted. The method gets its name because smaller elements gradually “bubble” to the top of the list through the swaps
Compare adjacent elements. If the first is greater than the second (in ascending order), swap them both
Do the same thing for each pair of adjacent elements, starting with the first pair and ending with the last pair, and when you’re done, the last element will be the largest
Repeat this step for all elements except the last one
Keep repeating the above steps for fewer and fewer elements at a time until there are no more pairs to compare
Diagram of the swap process (first pass):
In total n-1 bubbling passes are needed; the number of comparisons decreases by one in each successive pass
Code implementation:
# Bubble sort
def bubble_sort(alist):
    for j in range(len(alist) - 1, 0, -1):  # j is the number of comparisons in this pass; it decreases each time
        for i in range(j):
            if alist[i] > alist[i + 1]:
                alist[i], alist[i + 1] = alist[i + 1], alist[i]

li = [54, 26, 93, 17, 77, 31, 44, 55, 20]
bubble_sort(li)
print(li)
2. Insert sort
Insertion Sort is a simple and intuitive sorting algorithm. It works by building an ordered sequence: for each unsorted element it scans the already sorted sequence from back to front, finds the appropriate position and inserts the element there. In an array implementation this means repeatedly shifting sorted elements one position backward to make room for the new element
Insertion sort analysis:
Code implementation:
# Insertion sort
def insert_sort(alist):
    # Insert elements starting from the second position (the element with subscript 1)
    for i in range(1, len(alist)):
        # Compare the i-th element with the elements before it; swap while it is smaller than its predecessor
        for j in range(i, 0, -1):
            if alist[j] < alist[j - 1]:
                alist[j], alist[j - 1] = alist[j - 1], alist[j]

alist = [54, 26, 93, 17, 77, 31, 44, 55, 20]
insert_sort(alist)
print(alist)
3. Selection sort
Selection sort is a simple and intuitive sorting algorithm. It works as follows: first, find the smallest (or largest) element in the unsorted sequence and store it at the beginning of the sorted sequence; then keep finding the smallest (largest) element among the remaining unsorted elements and append it to the end of the sorted sequence, and so on until all elements are sorted
The main advantage of selection sort concerns data movement: if an element is already in its correct final position, it is not moved. Each swap moves at least one element to its final position, so sorting a list of n elements performs at most n-1 swaps. Among sorting methods that rely entirely on swapping to move elements, selection sort is one of the best
Selection sort analysis
Sorting process:
Code implementation:
# Selection sort
def selection_sort(alist):
    n = len(alist)
    # n - 1 selection passes are needed
    for i in range(n - 1):
        # Record the position of the minimum
        min_index = i
        # Select the minimum from position i + 1 to the end
        for j in range(i + 1, n):
            if alist[j] < alist[min_index]:
                min_index = j
        # If the selected minimum is not already in place, swap it in
        if min_index != i:
            alist[i], alist[min_index] = alist[min_index], alist[i]

alist = [54, 226, 93, 17, 77, 31, 44, 55, 20]
selection_sort(alist)
print(alist)
4. Quicksort
Quicksort (also known as partition-exchange sort) divides the data to be sorted into two independent parts such that every element of one part is smaller than every element of the other, and then quicksorts the two parts separately. The whole sorting process can be carried out recursively, turning the entire data set into an ordered sequence
The steps are as follows:
(1) Pick an element from the sequence, called the pivot (base value)
(2) Reorder the sequence so that all elements smaller than the pivot come before it and all elements larger than the pivot come after it (elements equal to the pivot can go to either side). When this partition finishes, the pivot is in its final position; this is called the partition operation
(3) Recursively sort the subsequence of elements smaller than the pivot and the subsequence of elements larger than the pivot
At the bottom of the recursion the subsequence has zero or one element and is therefore already sorted. The algorithm always terminates because each iteration places at least one element in its final position
Analysis of quicksort
Code implementation:
# Quicksort
# @print_execute_time  # optional timing decorator from the original post (definition not included)
def quick_sort(alist, start, end):
    """Quicksort the slice alist[start:end + 1] in place."""
    # Recursion exit condition
    if start >= end:
        return
    # Take the starting element as the pivot whose final position we are looking for
    mid = alist[start]
    # low is the cursor moving from left to right on the left side of the sequence
    low = start
    # high is the cursor moving from right to left on the right side of the sequence
    high = end
    while low < high:
        # While low and high have not met and high points to an element not smaller than the pivot, move high left
        while low < high and alist[high] >= mid:
            high -= 1
        # Put the element high points to at low's position
        alist[low] = alist[high]
        # While low and high have not met and low points to an element smaller than the pivot, move low right
        while low < high and alist[low] < mid:
            low += 1
        # Put the element low points to at high's position
        alist[high] = alist[low]
    # After the loop exits, low and high coincide: this is the pivot's correct position
    alist[low] = mid
    # Quicksort the subsequence to the left of the pivot
    quick_sort(alist, start, low - 1)
    # Quicksort the subsequence to the right of the pivot
    quick_sort(alist, low + 1, end)

alist = [54, 26, 93, 17, 77, 31, 44, 55, 20]
quick_sort(alist, 0, len(alist) - 1)
print(alist)