Experience of using cloud native Serverless Database

Author: Li Xin

With the rapid development of Internet technology in the past decade, more and more industries have joined the Matrix of The Internet, which has brought more rich and complex business scene requirements, which is undoubtedly a huge challenge to the performance of data application system.

Relational database MySQL is the most widely used database product in application systems, with powerful data query and strong transaction processing capabilities. In today’s cloud era, application systems are gradually evolving to be built based on cloud native Serverless architecture, because of its advantages of low cost and high flexibility. However, there are still some obvious deficiencies in mysql-based data storage under Serverless architecture:

Poor elastic expansion capability. An important feature of the Serverless scenario is the significant peaks and troughs of application load. When faced with the traffic peak, the DBA needs to manually expand the cluster to prevent the cluster from being overwhelmed. When traffic is at a low ebb, the cluster needs to be scaled down to save costs.
High o&M complexity. To build MySQL, you need to purchase clusters, install services, and manage connections. After services are online, they need to pay attention to data security, service availability, and response time. As a result, they spend more time on cluster O&M and cannot focus on service research and development.
The high cost. Generally, DBAs need to estimate the service scale to set the initial database capacity in advance. If the number of service requests does not reach the estimated value, resources in the cluster will remain idle, resulting in resource waste.

Serverless DataBase

MySQL supports the relational model and its strong transaction characteristics, so that it has a very important position in the application system, is one of the storage components that can not be completely replaced. However, if you blindly rely on MySQL, the application system cannot be completely Serverless and cannot enjoy the extreme flexibility brought by Serverless.

Within Ali, we have some new architectural practices. The data that requires strong transaction processing still uses relational table storage, while for the non-strong transaction table data storage, we have designed a Serverless table storage with extreme flexibility.

Our design requirements for the Serverless database product include the following features:

Completely elastic. The capacity can automatically expand and shrink flexibly according to the application load, which provides users with a more economical billing mode and a smoother experience.
Pay-as-you-go. The cost of using Serverless databases mainly comes from computing cost and storage cost. Users only need to pay for storage units and response units actually generated by services, saving costs.
Zero operations. Out-of-the-box, there is no need to manage capacity, water level, software upgrade, kernel optimization and other operations and maintenance matters, truly enabling R&D to focus on business development.

The Serverless architecture is widely used in many business scenarios. For example, in its core e-commerce business, Century Lianhua Group realizes full cloud operation and gradually transforms its business into the middle-stage mode with full Serverless architecture in view of the pain points such as difficult budget of resources and difficult system deployment encountered in self-built IDC room.

Century Lianhua Group adopted the scheme of function calculation + API gateway + Tablestore, which easily supported the big promotion activities such as 6.18 and Double 11. Among them, table storage Tablestore, as the core storage in Serverlesss architecture on cloud of Century Lianhua e-commerce system, has the advantages of extreme flexibility, free operation and maintenance and low cost.

Table storage Tablestore introduction

Table storage Tablestore in 2009 ali Cloud was established at the beginning of the project research and development, based on the bottom feitian platform built from scratch, is a multi-model, multi-engine Serverless table storage. It has exported more than 30 regions at home and abroad on the public cloud, with a scale of 15,000 servers and 200PB storage, which is the underlying core storage of many commercial products of Ali Cloud.

At the same time, it has been exported to finance, energy, power, logistics, medical, government and enterprise industries offline, serving public cloud 1000+ enterprise customers and 500+ offline projects.

Table Storage Tablestore integrates HBase and ElasticSearch, provides extreme flexibility, free o&M, and out-of-the-box features, supports flexible storage from GB to PB, and insensitive expansion of 100,000-level TPS service capabilities. As a one-stop structured data storage platform integrating storage, search and analysis, it not only supports massive table data but also provides rich data retrieval and analysis capabilities.

The overall structure of Tablestore Tablestore is shown in the figure below:

Tablestore architecture diagram

Table storage provides a variety of data models, including wide table model (Widecolumn), message model (Timeline) and Timeseries model (Timeseries).

The wide table model mainly carries table structured data storage, such as e-commerce order data.
The message model mainly hosts message data stores, such as IM/Feeds messages.
Timing model mainly carries timing data storage, such as timing data of Internet of Things devices.

Here we will take the e-commerce order scene as an example, take you to experience the tableStore-based wide table model to build a Serverless order storage system.

Tablestore experience

The preparatory work

Before you can experience the extreme flexibility of Tablestore, you need to prepare the following steps:

(1) Create an Ali Cloud account and obtain the AK of the Ali Cloud account. (Cloud account AK is the key to access all cloud services including Tablestore, and AK is required to access Tablestore service in the future).

(2) Download and start the command line tool Tablestore CLI provided by Tablestore. The command line tool provides some simple instructions to manage table storage service.

First, configure the connection key through the config command and open the Tablestore service through the enable_service command:

config --id accessKeyID --key accessKeySecret
enable_service
Copy the code

Create an instance with create_instance:

create_instance -d "order storage" -n serverless-db -r cn-hangzhou
Copy the code

Instance is equivalent to the concept of MySQL database. After creating an instance, you do not need to worry about the water level of the physical machine cluster where the instance resides. You only need to focus on developing service logic. At the same time, the read/write and storage on the instance are charged by volume. If there is no read/write and no storage, there is no actual cost.

At this point, a Serverless DataBase that supports GB to PB storage, has no concurrency restrictions, zero operation and maintenance, and is fully resilient has been created.

Create a table

Widecolumn is a Schema-free data table. Different from the relational database MySQL, the Widecolumn data table only needs to define the primary key structure, and does not need to define the attribute column structure.

For example, the structure of an order table order is as follows:

Create an order table with a wide table model, specify the primary key ID of the order table instead of the attribute column.

create_instance -d "order storage" -n serverless-db -r cn-hangzhou
Copy the code

After the create command is executed, an order width table is successfully created. The newly created order width table initializes one data partition.

As the order data volume increases or the number of visits increases, the wide table model will split and expand into multiple data partitions according to the distribution range of the first primary key (namely, the order ID in the above data model) and evenly distribute to multiple physical machines to support larger data scale (TB or PB) and read/write throughput (more than 100,000 TPS). The entire extension process is completely automated by the server without human intervention.

Data import

The simulation generated 1 million sample order data and imported them into the ORDER table in batches through the import command. The write speed of a single data partition can reach tens of thousands of rows /s, and the write throughput can be further improved as the partition expands.

import -i orderDataFile -l 1000000
Copy the code

Current speed is: 10000 rows/s. Total succeed count 10000, failed count 0.
Current speed is: 12600 rows/s. Total succeed count 22600, failed count 0.
......
Current speed is: 9200 rows/s. Total succeed count 1000000, failed count 0.
Import finished, total count is 1000000, failed 0 rows.
Copy the code

Order query

Use the get command to query the wide table model for a single row by order number (ID) and get a single row of order data. The GET command can only perform a single row query based on rowKey.

Example of querying an order:

id = “0000005be2b43dd134eae18ebe079774”

get --pk '["0000005be2b43dd134eae18ebe079774"]
Copy the code

+----------------------------------+-------+--------+---------+-------------+---------------+--------+--------+--------- -+--------+---------+-------+-------+--------+------------+ | order_id | cId | cName | hasPaid | oId | orderTime | pBrand | pCount | pId | pName | pPrice | pType | sId | sName | totalPrice | +----------------------------------+-------+--------+---------+-------------+---------------+--------+--------+--------- + -- -- -- -- -- -- -- - + + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- + + -- -- -- -- -- -- -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- -- -- -- + | 0000005 be2b43dd134eae18ebe079774 | c0015 | | | false on Friday O0035062633 | 3 | | 1507519847532 | millet p0005003 6 2299.21 | | | millet mobile phone | s0017 | sale zheng seven | | 6897.63 +----------------------------------+-------+--------+---------+-------------+---------------+--------+--------+--------- -+--------+---------+-------+-------+--------+------------+Copy the code

Order retrieval and statistics

Order scenarios often rely on multi-condition combination filtering, in which case it is necessary to rely on the multi-index feature of Tablestore. Multiple index is a table data index provided by Tablestore similar to Elasticsearch. It supports rich query methods and data aggregation capabilities, and can create indexes on multiple columns. Unlike MySQL’s federated indexes, multivariate indexes can be queried based on any combination of fields and will not be matched by the leftmost prefix of multiple columns.

For example, we build indexes on fields such as ID, pName and totalPrice respectively, and use data structures such as inverted index, word segmentation and BKDTree to provide query capabilities such as accurate query, full-text search and range query. In addition, multivariate indexes support field grouping, multi-field sorting, and statistical aggregation capabilities.

Create a multivariate index on a wide table with create_search_index to speed up queries.

create_search_index -t order -n order_index { "IndexSetting": null, "FieldSchemas": [{ "FieldName": "id", "FieldType": "KEYWORD", "Index": true, "EnableSortAndAgg": true, "Store": true },{ "FieldName": "pName", "FieldType": "TEXT", "Index": true, "EnableSortAndAgg": false, "Store": true },{ "FieldName": "totalPrice", "FieldType": "DOUBLE", "Index": true, "EnableSortAndAgg": true, "Store": true}...Copy the code

Tablestore supports SQL query ability, compatible with MySQL query syntax, and tries to retain the use of relational database habits. SQL can automatically select indexes and perform query acceleration. Through the query acceleration of multivariate indexes, it also has the ability of millisecond delay query under the scale of ten billion data.

Order retrieval according to sName, pBrand and pName three field conditions:

Select * FROM 'order' WHERE sName = "sbrand" and pBrand = "sbrand" AND pBrand = "sbrand" LIMIT 3;Copy the code

+----------------------------------+-------+--------+---------+-------------+---------------+--------+--------+--------- -+---------+--------+-------+---------------+-------+--------+------------+ | id | cId | cName | hasPaid | oId | orderTime | pBrand | pCount | pId | pName | pPrice | pType | payTime | sId | sName | totalPrice | +----------------------------------+-------+--------+---------+-------------+---------------+--------+--------+--------- -+---------+--------+-------+---------------+-------+--------+------------+ | 00001c760c04126da067e90409467c4e | c0022 | Zhao a elimination | true | o0009999792 | 3 | | 1494976931954 | millet p0005004 | red rice 5 s 499.01 cell phone | | 1494977189780 | | s0005 | | sale Friday 1497.03 | +----------------------------------+-------+--------+---------+-------------+---------------+--------+--------+--------- -+---------+--------+-------+---------------+-------+--------+------------+ | 0000d89f46952ac03da71a33c8e83eef | c0012 | Away money 2 | | false o0024862442 | 2 | | 1502415559707 | millet p0005004 | red rice 5 s | | 499.01 phone | null | s0015 | sales | | 998.02 on Friday +----------------------------------+-------+--------+---------+-------------+---------------+--------+--------+--------- -+---------+--------+-------+---------------+-------+--------+------------+ | 0000f560b62779285e86947f8e8d0e4c | c0008 | Von eight elimination | | false o0000826505 | 1 | | 1490386088808 | millet p0005004 | red rice 5 s | | phone | null | 499.01 s0015 | sales | | 499.01 on Friday +----------------------------------+-------+--------+---------+-------------+---------------+--------+--------+--------- -+---------+--------+-------+---------------+-------+--------+------------+Copy the code

Count the order quantity of each brand in all orders:

select pBrand,count(*) from `order` group bypBrand;
Copy the code

+ + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- + | pBrand | count (*) | + -- -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- -- + | vivo | 162539 | + -- -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- -- + | | 304252 lenovo | + -- -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- -- + | oppo | 242513 | + -- -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- -- + | | 96153 | apple + + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- + | 194543 | | millet +--------+----------+Copy the code

conclusion

As a widely used Serverless DataBase, Tablestore provides economic billing mode and can greatly reduce business costs. The above order scenario is an example. Under the order data level of 100 million and the average read and write volume of 2000TPS, the cost of using tables to store Tablestore is less than 400 yuan/month. At the same time, Tablestore has extreme elastic service ability and completely zero operation and maintenance characteristics, which can bring users a more silky experience.

If you have any questions about this article or want to know more about table storage, you can search the group number: “23307953”, the group provides free online expert services, welcome to join.

Click here to view the form storage details!