Transaction management
Transaction is the basic working unit of database system operation, equivalent to the process in the operating system, transaction has ACID property. From the user’s perspective, all or none of the operations in a transaction are done.
- Atomic Atomicity
- Consistency Consistency
- Isolation Isolation means that it cannot be disturbed by other transactions
- Durability, they are permanent once they submit changes
Handle Concurrency control
When multiple users concurrently operate on the same data, without control, updates can be lost, stale data can be read (inconsistent analysis issues), and dirty data can be read (depending on issues with uncommitted updates). The main method of concurrency control is to use blocking technology, which has two types: exclusive blocking and shared blocking.
- Exclusive blockade (x-blockade for short). If transaction T implements X blockade on data A, only T is allowed to read and modify data A. Other transactions can perform subsequent actions only after T removes X blockade.
- Shared blockade (s-blockade for short). The X-blockade was too strict, so the S-blockade was created. If transaction T implements an S-block on data A, then transaction T is allowed to read data A, but not modify data A, and no transactions are allowed until all S-blocks are released
Parallelism handles parallel control
Parallel control is handled mainly by blocking protocols.
- Level one lockdown protocol. Transaction T must lock data R with X before modifying it until the end of the transaction.
- Secondary lockdown protocol. First level lockdown protocol plus transaction T locks data R before reading it, and releases the S lock after reading it.
- Level three lockdown protocol. The first level lockdown protocol plus transaction T locks R before it reads it until the end of the transaction.
- Two – segment lock protocol. All transactions must lock and unlock data items in two phases.
The deadlock problem
Using a blocking approach can prevent data inconsistencies, but it can cause deadlock problems. Deadlocks are usually avoided by prevention and division.
Fault and Recovery
Database faults are classified into four types:
- Transaction failure. A situation in which a transaction is cancelled without properly terminating.
- The system fails. A condition in which a transaction terminates abnormally and memory information is lost.
- The medium is faulty. The failure of physical storage media is less likely but more destructive.
- Computer virus.
For different faults, you can use different recovery policies:
- Transaction failover. Uncompleted transactions should be cancelled and the system automatically completes the recovery.
- System fault recovery. Based on the logs, the database will undo the transaction and redo the redo queue transaction upon restart.
- Media fault recovery depends on the backup copy. If the database is restored to the status of the latest backup, data will be lost.
- Recovery technology with checkpoints. Large database systems, such as Oracle, provide security mechanisms. However, not all small databases have corresponding functions. Sometimes, manual assistance is required to ensure database security and recovery.
Backup and Restoration
To prepare for the possibility of database corruption, specify an appropriate backup and recovery plan. Principles for making a backup plan include:
- Ensure that data is lost with minimal or no loss at all
- The backup and restoration time should be as short as possible to ensure maximum system availability
Based on the backup content, the backup can be physical backup or logical backup. Based on efficiency, there are three backup modes: full backup, incremental backup, and cumulative backup.
Distributed database
concept
Distributed database system is relative to the centralized system, the exact meaning is through the combination of database technology and network technology, composed of a set of data, this group of data distribution on different computers in the computer network, each node in the network has the ability of independent processing, become a site autonomy, it can perform local application, at the same time, Each node can also perform global applications through the network communication subsystem.
Responsible for the establishment, query, update, copy, Management and maintenance of the Distributed Database software, known as the Distributed Database Management System (DDBMS).
Distributed database system has the following characteristics:
- Distribution of data
- Unity. The unity of data logic and data management.
- Transparency. Users do not need to care about where data is stored.
System architecture
The schema of a distributed database system has six levels:
- The out-of-global mode is the user view of the global application
- A global conceptual schema that defines the overall logical structure of data in a distributed database
- Fragmentation patterns
- Distribution patterns of
- Local conceptual pattern
- Local schema, similar to the schema in a centralized database system
Parallel database system
Compared with the technology development path in the Field of Internet, the parallel database system can not be seen at present. The Sequent system mentioned in the textbook is also poorly available on the web, so I suggest you skip this section. However, the parallel processing technology mentioned in the following may combine with the current big data processing and artificial intelligence platform, so those who are interested can study it by themselves.
The data warehouse
Data Warehouse concept: Data Warehouse is a topic-oriented, integrated, relatively stable, time-varying collection of Data used to support management decisions.
The structure of the data warehouse includes:
- The data source
- Data storage and management
- OLAP server
- Front-end tools, including various report tools, query tools, data analysis tools, data mining tools and so on
The resources
- The difference between concurrency and parallelism
- Sequence database
- Sequent Computer Systems