As the tide of cloud computing rises, the traditional database market faces a reshuffle. A group of new forces, including cloud databases, is challenging the monopoly of traditional database vendors, and the cloud native databases led by cloud vendors are pushing this change to a climax.

What changes will databases face in the cloud era? What unique advantages do cloud native databases offer? At DTCC 2019, Dr. Feifei Li, Vice President of Alibaba, gave a talk on "Next Generation Cloud Native Database Technology and Trends".



Feifei Li (alias: Feidao), Vice President of Alibaba Group, senior researcher, chief database scientist of the DAMO Academy, head of the Database Products Division of the Alibaba Cloud Intelligence Business Group, and ACM Distinguished Scientist.

General trend: cloud database market share is growing rapidly

The figure below shows Gartner's report on the global database market. According to the report, the global database market is currently worth approximately USD 40 billion, of which the Chinese market accounts for 3.7%, or approximately USD 1.4 billion.



In terms of market distribution, the five traditional database vendors (Oracle, Microsoft, IBM, SAP and Teradata) account for about 80% of the market, while cloud databases account for nearly 10%, a share that is growing rapidly every year. As a result, vendors such as Oracle and MongoDB are also investing heavily to compete in the cloud database market.

According to DB-Engines' market analysis, database systems are evolving from traditional transactional (TP) relational databases to today's heterogeneous, multi-source systems. The mainstream is still the familiar systems: commercial databases such as Oracle and SQL Server, and open source ones such as MySQL and PostgreSQL. Newer systems such as MongoDB and Redis have opened up new tracks. Meanwhile, the traditional model of selling database licenses is in decline, while open source and cloud database licensing keeps gaining popularity.

Database: a critical link in cloud applications

As Amazon founder Jeff Bezos put it, "The real battle will be in databases." The cloud started with IaaS: virtual machines, storage and networking. Today's booming intelligent applications, from speech recognition to computer vision and robotics, are all built on that IaaS layer, and the database is the most critical link between IaaS and intelligent SaaS applications. From data generation to storage to consumption, databases are vital.

Databases comprise four main segments: OLTP, OLAP, NoSQL, and database services and management tools; these are also the four directions cloud database vendors pursue. OLTP technology has been around for 40 years, and one of the things it still does is the classic "add 10 here, subtract 10 there" transfer, that is, transaction processing. OLAP grew out of the need to analyze ever-larger volumes of data online in real time while avoiding read-write conflicts. NoSQL came into being because scale-out was required and strong data consistency could not be guaranteed. More recently the term NewSQL was coined for systems that combine the ACID guarantees of traditional OLTP with NoSQL's scale-out capability.
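The "add 10, subtract 10" transfer mentioned above is the canonical ACID transaction. A minimal sketch, using SQLite purely for illustration (table and account names are invented):

```python
import sqlite3

# A toy version of the classic OLTP transfer: debit one account, credit
# another, atomically.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)",
                 [("alice", 100), ("bob", 100)])

try:
    with conn:  # one atomic transaction: both updates commit or neither does
        conn.execute("UPDATE account SET balance = balance - 10 WHERE id = 'alice'")
        conn.execute("UPDATE account SET balance = balance + 10 WHERE id = 'bob'")
except sqlite3.Error:
    pass  # on any failure the whole transfer rolls back

balances = dict(conn.execute("SELECT id, balance FROM account"))
print(balances)  # the total across accounts is 200 either way
```

Whether the transfer succeeds or fails, the invariant (total balance) is preserved, which is exactly the guarantee NewSQL systems try to retain while scaling out.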

Database system architecture evolution: it all depends on what is shared

Looking across 40 years of database history: the earliest relational database period produced SQL, OLTP and related technologies. As data volumes grew rapidly and read-write conflicts had to be avoided, OLAP was realized through ETL, data warehouses, data cubes and similar technologies. Today, facing heterogeneous multi-source data structures, from graphs to time series, spatio-temporal data and vectors, NoSQL and NewSQL databases have been born, along with new techniques such as multi-model and HTAP.

The most mainstream database architecture is Shared Memory: shared processor cores, shared memory and a shared local disk. This single-machine architecture is what traditional database vendors basically all adopt.

However, as Internet companies such as Google, Amazon and Alibaba grew to massive scale, they found the single-machine architecture had many limitations: its scalability and throughput could not keep up with business needs. This gave rise to the Shared Disk/Shared Storage architecture: the underlying storage is distributed, and a fast network such as RDMA makes it look to the upper database kernel like a local disk. Multiple independent compute nodes sit on top; typically one writes and the others read, though multi-write multi-read is also possible. A typical representative of this shared storage architecture is Alibaba Cloud's POLARDB.

Another architecture is Shared Nothing. Shared storage solves many problems, but RDMA networks have their own limits, such as performance loss across switches, let alone across AZs and regions. Once distributed shared storage reaches a certain number of nodes, accessing remote data can no longer be guaranteed to perform like accessing local data, so the shared storage architecture hits its scale-out ceiling at roughly a dozen nodes. What if the application needs to keep scaling beyond that? Then a distributed, Shared Nothing architecture is needed. The notable example is Google Spanner, which uses atomic clock technology to achieve data and transaction consistency across data centers. At Alibaba Cloud, the distributed PolarDB-X, built on POLARDB, adopts the Shared Nothing architecture.

One thing worth noting is that Shared Nothing and Shared Storage can be combined: Shared Nothing at the top, with each shard sitting on shared storage at the bottom. The benefit of this hybrid architecture is that it avoids splitting the data into too many shards, which in turn reduces the probability of distributed transactions needing a distributed commit, which is very expensive.
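The hybrid idea above can be sketched in a few lines. This is a hypothetical illustration (shard names and helper functions are invented, not any real PolarDB-X API): keys are hash-partitioned to a small number of shards, each conceptually backed by a shared-storage cluster, and only transactions spanning multiple shards pay the distributed-commit cost.

```python
import hashlib

# Each shard is conceptually one shared-storage cluster; fewer, larger
# shards mean fewer cross-shard transactions.
SHARDS = ["storage-cluster-0", "storage-cluster-1", "storage-cluster-2"]

def route(key: str) -> str:
    """Hash-partition a key to a shard (Shared Nothing routing layer)."""
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return SHARDS[h % len(SHARDS)]

def needs_distributed_commit(keys) -> bool:
    """A transaction touching one shard commits locally; only multi-shard
    transactions require the expensive distributed (two-phase) commit."""
    return len({route(k) for k in keys}) > 1
```

Because each shard can itself grow via shared storage, the top layer needs fewer partitions than a pure Shared Nothing design, which is exactly the pain point the hybrid alleviates.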

To sum up the three designs: if the Shared Storage architecture can support multi-write multi-read rather than one-write multi-read, it effectively becomes Shared Everything. A hybrid architecture combining Shared Nothing and Shared Storage should be an important breakthrough direction for future database systems.

The four core elements of cloud native databases

Above, the mainstream database architectures of the cloud era were analyzed. Beyond architecture, the cloud native era also differs in several technical respects.

Multi-model

The first is multi-model, which comes in two flavors: southbound and northbound. With southbound multi-model, the underlying storage structures are diverse and the data can be structured or unstructured (graphs, vectors, documents and so on), but users see only a single SQL or SQL-like query interface; the various data lake services in the industry are typical examples. Northbound multi-model is the opposite: there is only one storage form, usually a KV store holding structured, semi-structured and unstructured data, but multiple query interfaces are exposed, such as SPARQL, SQL and GQL. The industry's poster child here is Microsoft Azure's CosmosDB.
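The northbound pattern can be illustrated with a toy sketch, assuming an in-memory dict stands in for the KV layer and the facade functions are invented for illustration: one store underneath, two different query interfaces (document-style and graph-style) on top.

```python
# One KV layer underneath; multiple query facades on top (the essence of
# "northbound" multi-model). Keys are namespaced tuples.
store = {}

def put_doc(doc_id, doc):
    store[("doc", doc_id)] = doc

def put_edge(src, dst):
    store.setdefault(("adj", src), []).append(dst)

def doc_query(doc_id):
    """Document-style interface over the shared KV layer."""
    return store.get(("doc", doc_id))

def graph_neighbors(node):
    """Graph-style interface over the same KV layer."""
    return store.get(("adj", node), [])
```

A real system would expose full query languages (SQL, GQL, SPARQL) rather than function calls, but the key design point is the same: one storage engine, many query surfaces.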

Database intelligence + automatic control platform

Database autonomy is also a very important direction, with many technical opportunities in both the database kernel and the control platform. Alibaba believes autonomy requires self-awareness, self-decision, self-recovery and self-optimization. Self-optimization is relatively straightforward: use machine learning inside the kernel to optimize. Self-awareness, self-decision and self-recovery concern the control platform more, for example how to run instance health checks and how to repair or fail over automatically and quickly when problems occur.

New hardware: software-hardware co-design

The third core point of cloud native databases is software-hardware co-design. A database is first and foremost a system, and a system must use limited hardware resources safely and efficiently. Database design and development must therefore track the performance and evolution of hardware; we cannot cling to old database designs in the face of hardware change. NVM, for example, may well affect traditional database design, and such changes brought by new hardware must be taken into account.

The emergence of new hardware and architectures such as RDMA, NVM and GPU/FPGA will provide new ideas for database design.

High availability

High availability is one of the most basic requirements of cloud native; users on the cloud certainly do not want their services interrupted. The simplest route to high availability is redundancy, which can be implemented at the table level or the partition level. Either way, at least three copies are kept, and often four or five are required; financial-grade high availability, for example, may call for deployments across two, three or even four data centers.

With multiple replicas for high availability, how is data kept consistent across them? The classic CAP theorem says a system cannot simultaneously guarantee all of Consistency, Availability and Partition tolerance. The common choice today is C+P; as for A, with three-replica storage and a distributed consensus protocol, availability can still reach six or seven nines, which in practice comes close to delivering all three.
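The reason three replicas plus a consensus protocol preserve consistency can be shown with a toy majority-quorum rule (a simplification; real protocols such as Raft also handle leader election and log ordering): a write commits only once a majority of replicas acknowledge it, so any two majorities overlap in at least one replica.

```python
# Toy majority-quorum commit rule for a three-replica group.
REPLICAS = 3
MAJORITY = REPLICAS // 2 + 1  # 2 of 3

def commit(acks: int) -> bool:
    """A write is durable once a majority of replicas have acked it."""
    return acks >= MAJORITY
```

With one replica down (2 acks) writes still commit; with two down they cannot, which is how the system stays consistent under partitions while remaining highly available for single failures.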



POLARDB: extreme elasticity and compatibility for massive data and massive concurrency

With the market background and the basic elements of cloud native databases covered, I will next use two Alibaba Cloud systems, POLARDB and AnalyticDB, to show how the technologies above are implemented. POLARDB is Alibaba Cloud's cloud native database, with very deep technical accumulation behind it. We have published papers at international conferences such as VLDB 2018 and SIGMOD 2019, mainly on innovations in the storage engine and related areas.

POLARDB uses a one-write, multi-read shared storage architecture, which has several advantages. First, compute and storage nodes are separated, enabling flexible capacity expansion. Second, POLARDB breaks through the single-node size and scalability limits of MySQL, PostgreSQL and similar databases, reaching 100 TB of storage capacity and 1 million QPS per node. In addition, POLARDB offers extreme elasticity, and backup and recovery are greatly improved. At the storage layer, each data block is kept in three replicas for high availability, and a modified, parallelized Raft protocol keeps the three replicas consistent, providing financial-grade availability. POLARDB is also 100% compatible with the MySQL and PostgreSQL ecosystems, so applications can be migrated without any changes being noticed.



Because the bottom layer is shared distributed storage, all of PolarDB's nodes operate on the same data: the primary node writes and the replica nodes read, so for transactions entering the database, both primary and replicas are in the Active state. The advantage is that the single physical copy of the data removes the need for primary-standby data synchronization.

Concretely, POLARDB has PolarProxy, a gateway proxy at the front, with the POLARDB kernel and PolarFS below it, and PolarStore at the bottom managing the underlying distributed shared storage over an RDMA network. PolarProxy dispatches client requests: write requests go to the primary node, while read requests are distributed across the read nodes based on load balancing and node status, maximizing resource utilization and performance.

POLARDB's shared storage is distributed with three replicas. The primary node writes while the other nodes read, and the lower layer is PolarStore, where every partition keeps three replica copies whose consistency is guaranteed by a distributed consensus protocol. The advantage of this design is that storage and compute are separated and backups are lock-free, so a backup can complete in seconds.

POLARDB can scale rapidly in the one-write, multi-read setup. For example, upgrading from a 2-core vCPU to a 32-core vCPU, or from two nodes to four, takes effect in under 5 minutes. Another benefit of separating storage and compute is cost reduction, since storage and compute nodes can be scaled independently and flexibly.

The figure below shows how POLARDB uses physical logs for continuous recovery. On the left is the traditional database architecture. Because POLARDB uses shared storage, it can largely retain the traditional physical-log-based recovery process, while the shared storage enables continuous recovery and transaction snapshot recovery.



By contrast, in a MySQL primary/standby architecture, the primary maintains both a logical log (binlog) and a physical log (redo log); the standby must replay the logical log and then produce its own physical log from it. In POLARDB, because storage is shared, recovery works directly from a single physical log: a replica can recover the data it needs without replaying the primary's logical log.
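The physical-log idea can be sketched as follows (a deliberately simplified model, with invented names; real redo records describe page deltas rather than whole pages): each record says which page holds which bytes as of which LSN, so a replica on shared storage just applies records past its checkpoint.

```python
# Toy model of physical-log recovery on shared storage.
pages = {}       # the shared-storage pages, keyed by page id
redo_log = []    # list of (lsn, page_id, new_bytes)

def append_redo(lsn, page_id, data):
    redo_log.append((lsn, page_id, data))

def recover(from_lsn):
    """Apply every physical record at or after the checkpoint LSN.
    No SQL-level (logical) replay is needed."""
    for lsn, page_id, data in redo_log:
        if lsn >= from_lsn:
            pages[page_id] = data
    return pages
```

Because the log is physical, recovery is a mechanical byte-level replay, which is why it can run continuously and cheaply on the storage layer.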

Another big advantage of POLARDB's one-write, multi-read cluster is dynamic DDL support. In the MySQL architecture, modifying a table schema means replaying the change to the standby via the binlog, so the standby goes through a blocking phase while the dynamic DDL is replayed. Under POLARDB's shared storage architecture, all schema information and metadata are stored directly in the storage engine as tables; as soon as the primary changes them, the replicas see the updated metadata in real time, so there is no blocking phase.

PolarProxy's main functions are read/write splitting, load balancing, high-availability switchover and security protection. POLARDB is a one-write, multi-read architecture: when a request arrives, the proxy determines whether it is a read or a write, sends writes to the write node, distributes reads across the read nodes with load balancing, and guarantees session consistency, completely eliminating the problem of reading stale data.
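The routing decision at the heart of such a proxy can be sketched in a few lines. This is an illustrative toy, not the real PolarProxy (node names, the round-robin policy and the statement check are all invented for the sketch):

```python
import itertools

# Minimal read/write-splitting router: writes to the primary, reads spread
# across replicas via trivial round-robin "load balancing".
PRIMARY = "primary"
READERS = ["reader-1", "reader-2"]
_rr = itertools.cycle(READERS)

def route_statement(sql: str) -> str:
    stmt = sql.lstrip().split(None, 1)[0].upper()
    if stmt in ("SELECT", "SHOW"):
        return next(_rr)   # reads go to a replica
    return PRIMARY         # writes (and everything else) go to the primary
```

A production proxy additionally tracks replica lag so that a session never reads a replica that has not yet applied that session's own writes, which is the session-consistency guarantee mentioned above.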

Lossless elasticity is one of the things POLARDB monitors. Distributed storage needs to know how many disks/chunks are allocated, and POLARDB watches how many chunks remain unused; for example, when free capacity falls below 30%, it automatically expands in the background, so the application is essentially unaffected and can keep writing data continuously.
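A hypothetical sketch of that background check (the threshold and growth factor are illustrative, not POLARDB's actual values):

```python
# Grow the storage pool before free chunks run out, so writes never stall.
FREE_THRESHOLD = 0.30   # expand when less than 30% of chunks remain free
GROWTH_FACTOR = 1.5     # illustrative growth step

def maybe_expand(total_chunks: int, used_chunks: int) -> int:
    """Return the (possibly enlarged) total chunk count."""
    free_ratio = (total_chunks - used_chunks) / total_chunks
    if free_ratio < FREE_THRESHOLD:
        return int(total_chunks * GROWTH_FACTOR)
    return total_chunks
```

Because expansion happens proactively in the background, the application never observes an out-of-space error, which is what makes the elasticity "lossless".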

For POLARDB, the biggest benefit of all the technology above is extreme elasticity, best illustrated with a concrete customer case. In the figure below, the red line represents the baseline resource consumption the customer pays for in any case, and the upper line the demand for computing resources.



For example, this customer launched new products in March and April and ran promotions in May, so computing demand spiked in those two periods. With a traditional architecture, capacity would have to be expanded to a larger size before the product launch and kept there, then raised to an even higher specification for the promotion, which is very expensive. With extreme elasticity, such as POLARDB's rapid elastic scaling via storage-compute separation, the user can simply scale capacity up just before the peak (the blue box) and scale it back down afterwards, greatly reducing cost.

Beyond POLARDB, the Alibaba Cloud database team has explored many other directions.

Distributed PolarDB-X: horizontal scaling for high concurrency and cross-region high availability

If an enterprise needs extreme scale-out capability (users like Alibaba itself, or banks and power companies in traditional industries with demanding requirements for high concurrency and massive data), a dozen or so nodes is definitely not enough. So the Alibaba Cloud database team combined Shared Nothing horizontal scaling with Shared Storage to form PolarDB-X. PolarDB-X supports strong, financial-grade data consistency across availability zones and performs excellently on high-concurrency transactions over massive data. PolarDB-X is already running in production inside Alibaba, using storage-compute separation, hardware acceleration, distributed transaction processing and distributed query optimization; it has successfully carried the database flood peaks of all of Alibaba's core business links during Double 11. A commercial version will be launched soon, so stay tuned.



The OLAP benchmark, AnalyticDB: real-time, concurrent online analysis of massive data

In the direction of OLAP analytical databases, the Alibaba Cloud database team has independently developed AnalyticDB, sold on both Alibaba Cloud's public and private clouds. AnalyticDB has several core architectural features:

  • A hybrid row-column storage engine that supports high-throughput writes and high-concurrency queries;
  • Massive data processing with second-level analysis, and full support for multi-table joins, Chinese text and complex analysis;
  • Vectorization technology supporting fused processing of structured and unstructured data.
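The third feature, fusing structured and unstructured data, can be illustrated with a toy sketch (rows, column names and the two-dimensional embeddings are invented; a real engine would vectorize this over columnar batches): each row carries structured columns plus a vector embedding of its unstructured content, and a query filters on the columns while ranking by vector similarity.

```python
import math

# Rows mix structured columns with a vector embedding of unstructured data.
rows = [
    {"id": 1, "category": "news",  "vec": [1.0, 0.0]},
    {"id": 2, "category": "news",  "vec": [0.6, 0.8]},
    {"id": 3, "category": "sport", "vec": [0.0, 1.0]},
]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def search(category, query_vec, k=1):
    """Structured predicate plus vector ranking in one query."""
    hits = [r for r in rows if r["category"] == category]
    return sorted(hits, key=lambda r: cosine(r["vec"], query_vec),
                  reverse=True)[:k]
```

The design point is that one query plan combines a relational filter with a similarity ranking, rather than shipping data between a database and a separate vector system.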

Recently, AnalyticDB topped the TPC-DS benchmark, ranking first in the world in price-performance and passing rigorous official TPC certification. Meanwhile, a paper introducing the AnalyticDB system will be presented at VLDB 2019. A common usage pattern is to feed data from OLTP systems into AnalyticDB in real time through our data transmission and synchronization tool, DTS, for real-time analysis.



Autonomous database platform: individualized buffer tuning

One characteristic of cloud native databases is autonomy. Alibaba Cloud runs an internal platform called SDDP (Self-Driving Database Platform), which collects real-time performance data from every database instance and uses machine learning to build models and deploy optimizations in real time.



The basic idea behind iBTune is that every database instance has a buffer size, which in traditional databases is pre-allocated and fixed. In a large enterprise the buffer is a resource pool that consumes memory, so you want each instance's buffer size to be allocated flexibly and automatically. For example, the instances behind Taobao's product database do not need such a large buffer at night, so their buffer size can be scaled down automatically and scaled back up in the morning, without affecting response time (RT). To perform this automatic buffer optimization, the Alibaba Cloud database team built the iBTune system, which currently monitors nearly 7,000 database instances and has saved an average of 20 TB of memory over long-term operation. The core paper on iBTune was also presented at VLDB 2019.
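A deliberately simplified, rule-based caricature of the idea (iBTune itself uses learned models to predict RT from buffer size; the thresholds and factors below are invented for illustration): shrink the buffer while predicted RT stays well inside the SLA, and grow it back when latency degrades.

```python
# Toy buffer-size controller in the spirit of iBTune.
def tune_buffer(current_mb: int, miss_ratio: float,
                rt_ms: float, rt_sla_ms: float) -> int:
    if rt_ms > rt_sla_ms:
        return int(current_mb * 1.5)       # SLA violated: grow quickly
    if miss_ratio < 0.01 and rt_ms < 0.5 * rt_sla_ms:
        return int(current_mb * 0.8)       # healthy with headroom: reclaim
    return current_mb                      # otherwise leave it alone
```

Run across thousands of instances, even a 20% reclaim on idle instances adds up to the terabytes of memory savings mentioned above.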

Security on the cloud: multiple layers of encryption safeguard data

Data security on the cloud is critically important, and the Alibaba Cloud database team has done a lot of work on it. First, data is encrypted at rest as it is written to disk. Alibaba Cloud databases also support BYOK, so users can bring their own keys to the cloud, and encryption in transit is supported as well. In the future, Alibaba Cloud databases will also fully encrypt data while it is processed in memory and provide trusted verification of logs.

Alibaba Cloud enterprise database services: full operations, full-link coverage

Alibaba Cloud's database services are organized into tool products, engine products and full-process operation and control products. The figure below shows a common cloud database pipeline on Alibaba Cloud: offline databases are migrated online with the DTS tool, then distributed to relational databases, graph databases and AnalyticDB according to data requirements and classification.


Alibaba Cloud databases: customer first, all value comes from serving users

Today the rapidly growing POLARDB serves leading enterprises across general industry, Internet finance, gaming, education, new retail, multimedia and other fields.



AnalyticDB also performs outstandingly in the analytical database market, supporting real-time analysis and visualization applications.



Built on Alibaba Cloud database technology, Alibaba supports a series of key projects such as City Brain and a large number of customers on and off the cloud. To date, Alibaba Cloud databases have successfully supported nearly 400,000 database instances on the cloud.

Cloud native is the new battlefield for databases, bringing many exciting new challenges and opportunities to an industry with more than 40 years of history. Alibaba hopes to push database technology to a higher level together with technical colleagues in the database industry at home and abroad.



This article is original content from the Yunqi Community and may not be reproduced without permission.