Mother earth, kind and dark, may her spirit ever be in thy bosom! A Chang and The Book of Mountains and Seas by Lu Xun
Looking at an old problem from a new perspective requires creative thinking. — Albert Einstein
preface
Let’s forget about data structure, let’s think of data as a determiner, just structure, just structure, what is a structure? Structure appears in various industries, such as architectural structure, human body structure and material structure. What is that structure:
Structure is the arrangement and organization of interrelated elements within a system or material.
Intuitively, what people can feel is the structure of the house, the arrangement between each room in a house. And then into the bedroom, the arrangement between the furniture. If well designed, the room will usually be pleasant to the person who enters it. So if the arrangement is not good, then it may make people have a bad feeling about the room. For example, put your bedroom light switch on the door instead of next to your bed, so you don’t have to get out of bed and turn it off when you go to sleep. The general design is to put a switch at the door, the purpose of such design is to be able to turn on the light directly when entering the door. Because architecture is always about people, we always talk about livable architecture.
So we try to add data to the structure, then we can get the definition of analogous data structure, in some data, the arrangement and organization of data. In computer science, this data is a very broad term, and we use the very common CTRL + Z (undo operation) to illustrate it. One by one, our actions are recorded by the operating system as follows:On the axis of time, the first action is at the bottom, so that when we CTRL + Z we can revert to the action closest to the current time. In this case, operations are also data. Data in computer science can mean many things, but in other fields data is more numerical. This form of organizing data is also intuitive.
Let’s take a look at the definition of data structure in Wikipedia:
In computer science, a data structure is a way of storing and organizing data in a computer.
The analogy analysis, it looks like we’re in the direction of the analogy is correct, the storage for containers we can understand, I think of before and my colleague before a cook, he brought a bucket of kimchi from home, I remember was that oil drum, this barrel is thinner, the mouth when he let me take the acid beans, this once let me very pain, because use chopsticks out is really not easy, We can think of the dish in the pickle as data in a data structure, stored in that oil drum, organized in a random way. To make a joke quietly, the data structure design is quite unreasonable, except when it is transported, it is slightly convenient, but when it is eaten, it is really inconvenient.
An introduction to data structures
The data to be processed by the program
Those of you who have studied programming know that the way to use your brain to drive a computer is through programming. Niklaus Wirth, the renowned Swiss computer scientist and father of Pascal, came up with a famous formula that won the Turing Prize, the highest honor in computer science. The formula simply read:
Algorithm + Data Structures = Program
As software programs evolve today, we can say that this formula is not so big and appropriate, because software is starting to catch up with the term engineering, and there are more factors to consider. Programming begins with solving two problems: algorithm design and data structure design. (Many high-level languages have fairly mature data structures built into them, but this doesn’t mean that learning data structures has lost its significance. On the one hand, it is a point, and on the other hand, it can help us solve a lot of problems.) The algorithm is the strategy to deal with the problem, while the data structure is the data model to describe the problem information, and the program is a set of instructions for the computer to deal with the problem information according to the strategy to deal with the problem.
The purpose of our programming is to allow computers to help people automatically complete the complex tasks to be handled. The fundamental question of computer science and technology is: what can be automated and effectively automated? Specific from the actual problem to the final computer solution, how to go through the process?
What we often need to analyze is the data contained in the information in the problem model, what is the relationship between the data, and what form is stored in the computer, and then the data structure is formed. In the 1930s and 1940s, the original purpose of the invention of electronic computers is to carry out scientific and engineering calculations, the object of its processing is purely numerical information, usually people call this kind of problem numerical calculation (specifically, it is the effective use of computers to solve mathematical problems approximate solution method and process).
In recent decades, with the rapid development of the computer, which is not only in the performance of the computer itself continues to improve, more importantly, the data can be input into the computer, its category has been greatly expanded, for example, symbols, sounds, images and other information can be stored in the computer through coding, and this corresponding, The processing object of computer has also developed from simple pure numerical information to information with certain structure, and the application field of computer is also expanding.
So-called “non numerical problems, in order to distinguish from the perspective of the” value problem “mentioned above, the numerical problems involving data and the relationship between data (pay attention to the relationship between here, rough says we can believe that the relationship between the data and data form the data structure), generally cannot be described with mathematical formulas, equations, etc. Such as sorting problems, retrieval problems and so on. It is necessary to design another data description method and corresponding algorithm to deal with it.
For example, finding the value of π can be considered a typical numerical problem. For non-numerical calculations, let’s take a look at a real life scenario, such as making phone calls. Smartphone contacts are usually sorted by the first letter of the last name in pinyin, so that they can be quickly searched. This is actually a non-numerical problem. We are looking for data, and each data is composed of name + phone number. When we are looking for data, the search strategies are as follows:
- Search in sequence (search one by one, if the person’s first letter is at the top of the list, it’s lucky, but if the person’s last name is at the bottom of the list, for example, zhang, if you have to search to the end of the list to find the person you want to contact)
- Search according to pinyin initials, and then search in order. If you have a lot of friends with the surname Zhang, you may still have to search for them for a while. (Most smartphones support searching by name, which is actually a different search strategy.)
Another example is the traffic lights that we are used to every day. We will naturally ask such a question: how many colors of traffic lights should be set at the intersection to maintain the normal order. We can get the answer to this question very quickly: two kinds. But what if you put it into a computer? It’s not an easy problem for a computer to solve. The first problem to be solved is how to store the information in the problem in the computer, and then on this basis can design an algorithm to solve.
Let’s try to solve this problem, or model the information involved in the problem and input it into the computer. The intersection is shown as follows:The object involved in the question, four intersection ABCD, and the corresponding access. Let’s call it AC for A path from A to C. Traffic in one direction is blocked in other directions. Ac-bd indicates that AC and BD cannot be used at the same time. Assume that the rules for turning left are consistent with going straight, and that turning right is allowed at any time. Based on the above analysis, we can draw the following road map:This is what we call a graph data structure, and each path we call a node.
Let’s look at the file directory of the computer again, we know that a folder can contain a folder, can be wrapped layer by layer, in the operating system we also abstract each file directory into a node, then the structure of the file directory is as follows:This is often called a tree in data structures, and the top file is called the root.
Introduction of data structures
As we can see from the previous example, tree, graph structure model of non-numerical questions, can’t use equation and so on, so the key to solve this problem is no longer a mathematics and calculation method, but find out the problems to deal with the connection between the data and the data, the organization forms, storage, said method, etc., to design a model for the computer problem solving.
Generally speaking, data structure is organized by data elements according to their own internal logical organization. The description of the logical relationship between data elements is called logical structure of data (trees and graphs mentioned above can be considered logical structure); The data must be stored in the computer, and the data storage structure is the realization form of the data structure. Is the representation in the computer; In addition, it makes sense to talk about a data structure only if you also talk about the operations performed on that type. A logical storage structure (sometimes called abstract data type) can have multiple storage structures, on which the efficiency of data processing is different.
Basic concepts of data structures
From the above discussion, I believe that you have had some basic understanding of data structure. Here we introduce some common terms of data structure:
- Data (Data)
In computer science, data is the collective term for all symbols that can be entered into a computer and processed by a computer program.
- Data Elements
The basic units of data, also called “elements” and “nodes”, are usually considered and treated as a unit in a computer program. An element can contain multiple data items.
- Data Item
Data item is the smallest identification unit with independent meaning, and is the most basic and non-separable data unit of data
- The data structure
Consists of a data object and the relationships between all the data members of that object.
Data organization consists of three elements: logical structure of data, storage structure of data and operation of data.
Logical structure of data
The logical structure of the data, which reflects our interpretation of the meaning of the data, can be represented as a set of data and the relationships between those data. Logical relationships of data are divided into the following types:
- A collection of
Just put in a container, like the pickled cabbage I mentioned above, there is no connection between the dishes.
- Linear structure
The relationship between nodes is one to one, like a queue, like an undo operation.
- Tree structure.
The nodes have a one-to-many relationship, like the file directory above.
- Graph structure
The nodes have a many-to-many relationship, as shown in the diagram above.
The storage structure of data
Data storage structure, also known as physical structure, is the representation method of Data and its logical structure in the computer. How to store Data in the computer is essentially the allocation of memory units, and the Data Type in the computer language is used in the specific implementation.
The data type here can also be understood as a container, so we take one container for integers and a separate container for decimals. There are four common data storage methods:
- Sequential storage structure
A group of contiguous storage units is used to store data elements in turn, and the logical relationship between data is reflected by the adjacent storage positions of elements. Arrays are a common data storage structure
- Chain storage structure
Add a pointer to each data element (without a pointer by reference) to record the logical relationship between data elements.
- Index storage structure
In addition to storing node information, an additional index table is created, similar to the index items in a dictionary.
- Hash storage structure
Hash storage mode, take the keyword of the node as the independent variable, through the function relationship, directly calculate the storage address of the node.
Operation of data
The definition of operation depends on the logical structure of data. Knowing the data in the problem and the connection between data, we can design the corresponding data processing method. The common operation is as follows:
- Initialization: Sets the initial value of the storage structure or applies for storage space
- Null: Indicates whether the storage space has no valid values
- Find length: count the number of elements
- Find: Determines whether the specified element is included
- Traversal: Access all elements in some order, each element only once
- Value: Gets the value of the specified element
- Insert: Adds the specified element
- Delete: Deletes the specified element
Write in the last
This is the prelude to the study of data structures and algorithms. My idea is to lay a solid foundation before explaining the topic of algorithms. Some words may only have some experience to understand the same, like junior high school learning Mr. Lu Xun’s article “A long and the Classics of mountains and Seas”. Sometimes, it takes experience to understand.
The resources
- Zhou Xingni, Ren Zhiyuan, Ma Yanzhuo, Fan Kai. New Perspectives on Data Structure and Algorithm Analysis
- “Advanced Algebra concise course” blue to middle