I. Project background

Order systems exist in all walks of life: e-commerce orders, bank statements, operators’ phone bills, and so on. They are broad, universal systems, and a decade of development has produced well-established practices for building them. However, as the Internet has grown and enterprises place more emphasis on data, more and more orders need to be stored and persisted. The growing importance and scale of the data bring new challenges.

Demand scenarios

An e-commerce platform A needs to persist the order data generated across the whole platform. Based on the full order data, the system must also provide query services for three groups of roles: consumers, shop owners (salespersons), and the platform itself. Consumers can query their historical orders, merchants can identify best-selling products, and the platform can analyze user behavior and transaction volume. The main query methods cover multidimensional order retrieval as well as order data analysis and statistics. For example, consumer-oriented: orders for [consumer A] * [last 1 year] * [computer purchases]; salesperson-oriented: sales orders for [salesperson B] * [last 1 month].

Technical points

In an order scenario, the technical points to consider mainly include the following:

Query capability: a variety of query types is needed, such as multidimensional, range, and fuzzy queries, together with sorting, statistics, and similar functions.
Data volume: the system must store massive amounts of data while meeting requirements for strong consistency, high availability, and low cost.
Service performance: the system must handle highly concurrent requests while keeping latency low.

Multi-dimensional, real-time query is the core function of an order management solution.

II. Solution evolution

For order scenarios, e-commerce platforms usually start with the traditional MySQL solution. With the powerful query capability of a relational database, users can run multi-dimensional queries and statistics over order data directly through SQL statements. As the business grows, however, the order data inflates. This inflation has a horizontal and a vertical dimension: the horizontal dimension refers to the new fields introduced by continuous iteration, and the vertical dimension refers to the total amount of stored data. Faced with both kinds of growth, a MySQL-only scheme becomes increasingly hard to sustain. Combined SQL + NoSQL schemes (hereinafter, combined schemes) emerged in response, using the respective strengths of the two kinds of database to meet the needs of different scenarios. But combined schemes bring new problems: they sacrifice storage cost, increase development workload and operational complexity, and incur extra overhead to keep data consistent. Let’s look at the common schemes:

Conventional scheme

1. MySQL sharding (split database and table) scheme

MySQL itself has powerful data query and analysis functions. An order system built on MySQL can handle multi-dimensional queries and statistics over order data. As the order volume grows, users adopt database and table sharding, a pseudo-distributed approach, to cope with data expansion. But once the data reaches the capacity limit, a larger set of sharded databases has to be created and all data migrated, and the trouble keeps recurring. Data iteration and expansion are hard to overcome in a MySQL scheme, and the shortcomings of a traditional MySQL-only order scheme become prominent:

1. Vertical data expansion (data scale): with sharding, the database scale has to be estimated when MySQL is deployed.
2. Horizontal data expansion (field dimensions): the schema must be predefined, so adding fields during iteration is complicated, and once the number of dimensions reaches a certain level, database performance suffers.
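To make the migration cost of sharding concrete, here is a minimal sketch, assuming a simple hash-modulo routing rule. The class `ShardRouter`, its methods, and the use of CRC32 are all illustrative assumptions, not something the article or MySQL prescribes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.zip.CRC32;

// Hypothetical router for a sharded MySQL deployment (all names are illustrative).
final class ShardRouter {
    private final int shardCount;

    ShardRouter(int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        this.shardCount = shardCount;
    }

    // Maps an order ID to a shard index in [0, shardCount) via CRC32 modulo.
    int shardFor(String orderId) {
        CRC32 crc = new CRC32();
        crc.update(orderId.getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % shardCount);
    }

    // Counts the order IDs whose shard changes when the shard count is resized.
    static int movedKeys(List<String> orderIds, int oldCount, int newCount) {
        ShardRouter before = new ShardRouter(oldCount);
        ShardRouter after = new ShardRouter(newCount);
        int moved = 0;
        for (String id : orderIds) {
            if (before.shardFor(id) != after.shardFor(id)) {
                moved++;
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("o0000000000", "o0000000001", "o0000000002");
        System.out.println("shard of o0000000000 with 4 shards: "
                + new ShardRouter(4).shardFor("o0000000000"));
        System.out.println("IDs that move when growing 4 -> 8 shards: "
                + movedKeys(ids, 4, 8));
    }
}
```

Because the shard index depends on the shard count, changing the count reassigns many existing rows, which is why the shard scale has to be estimated up front.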

2. MySQL+HBase solution

The dual-store scheme emerged in response. Storing real-time data and historical data separately solves the data volume problem to a certain extent. The scheme splits data into two parts, real-time and historical, and a data synchronization service moves expired data into the historical store.

1. Real-time order data (for example, orders from the last 3 months): stored in MySQL. The total volume of real-time orders grows only within limits, and multi-dimensional query and analysis over real-time data remains available.
2. Historical order data (for example, orders older than 3 months): stored in HBase. As a distributed NoSQL database, HBase handles the growth of order data and ensures that historical orders are persisted.

However, this scheme sacrifices the value of historical order data to users, merchants, and the platform, because it assumes that historical data is rarely needed. Once such a need arises, a full table scan is required, which is slow and I/O-intensive. In addition, maintaining data synchronization brings data consistency problems and sharply higher operation and maintenance costs.
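The real-time versus historical split can be sketched as a routing decision on order age. This is only an illustration under assumptions: the class `OrderStoreRouter`, the `Store` enum, and the choice to compute the cutoff in UTC are hypothetical, and the 3-month window is the article's example value.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

// Hypothetical router that decides which store holds an order, based on its age.
final class OrderStoreRouter {
    enum Store { REALTIME_MYSQL, HISTORY_HBASE }

    private final int retentionMonths;

    OrderStoreRouter(int retentionMonths) {
        if (retentionMonths <= 0) {
            throw new IllegalArgumentException("retentionMonths must be positive");
        }
        this.retentionMonths = retentionMonths;
    }

    // Orders placed before (now - retentionMonths) belong to the historical store.
    Store storeFor(Instant orderTime, Instant now) {
        Instant cutoff = ZonedDateTime.ofInstant(now, ZoneOffset.UTC)
                .minusMonths(retentionMonths)
                .toInstant();
        return orderTime.isBefore(cutoff) ? Store.HISTORY_HBASE : Store.REALTIME_MYSQL;
    }

    public static void main(String[] args) {
        OrderStoreRouter router = new OrderStoreRouter(3);
        Instant now = Instant.parse("2018-10-17T00:00:00Z");
        System.out.println(router.storeFor(Instant.parse("2018-09-01T00:00:00Z"), now));
        System.out.println(router.storeFor(Instant.parse("2018-01-01T00:00:00Z"), now));
    }
}
```

The synchronization service mentioned above would apply the same cutoff when it moves expired rows, so both sides need to agree on it.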

3. MySQL+Elasticsearch solution

MySQL+Elasticsearch also stores data in two parts, which solves the growth of index dimensions to a certain extent. Users maintain a data synchronization service to keep the two parts consistent.

1. Full data: the full order data is stored in MySQL, with everything other than the order ID stored as a single field. This copy serves as the persistent store and is also used to look up non-indexed fields.
2. Index data: the fields that need to be searched are stored in Elasticsearch (a distributed index database based on Lucene).

This scheme handles the expansion of data dimensions, but as the order volume keeps growing, MySQL’s poor scalability is exposed again. At the same time, synchronizing data to Elasticsearch carries high development, operation, and maintenance costs, so the scheme has its drawbacks.
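The query path of this scheme, search the index first and then look up the full record by order ID, can be sketched with two in-memory maps. Everything here is a stand-in: the class `IndexThenLookup`, its fields, and the choice of brand as the indexed field are assumptions for illustration, and no real MySQL or Elasticsearch API is used.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-in for the two-part storage (hypothetical, for illustration only).
final class IndexThenLookup {
    // Plays the role of the MySQL table: order ID -> full order record as one field.
    private final Map<String, String> fullStore = new HashMap<>();
    // Plays the role of the search index: indexed field value -> matching order IDs.
    private final Map<String, List<String>> brandIndex = new HashMap<>();

    // Writes both parts. In a real deployment these are two systems that a
    // synchronization service has to keep consistent.
    void save(String orderId, String brand, String fullRecord) {
        fullStore.put(orderId, fullRecord);
        brandIndex.computeIfAbsent(brand, k -> new ArrayList<>()).add(orderId);
    }

    // Step 1: find order IDs in the index. Step 2: look up each full record by ID.
    List<String> findByBrand(String brand) {
        List<String> records = new ArrayList<>();
        for (String orderId : brandIndex.getOrDefault(brand, Collections.<String>emptyList())) {
            records.add(fullStore.get(orderId));
        }
        return records;
    }

    public static void main(String[] args) {
        IndexThenLookup store = new IndexThenLookup();
        store.save("o0000000000", "vivo", "order o0000000000: vivo x21 x 2");
        System.out.println(store.findByBrand("vivo"));
    }
}
```

If the two writes in `save` ever diverge, the index can return IDs whose full record is missing or stale, which is the consistency cost the text describes.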

Capability	MySQL	HBase	Elasticsearch	TableStore
Storage	Row storage	Column storage	Index storage	Column storage + index storage
Scalability	Single machine, poor scalability	Horizontal scaling	Horizontal scaling	(Automatic) horizontal scaling
Consistency	Strong consistency	Strong consistency, timing consistency		Strong consistency, timing consistency
Retrieval	Weak support	Not supported	Supported	Supported
Data volume	~1 TB, ~billion rows	~10 PB, ~trillion rows	~1 PB, ~100 billion rows	~10 PB, ~trillion rows

TableStore scheme

With the SearchIndex (multi-index) feature developed for TableStore, the problems of an order system with hundreds of millions of orders can be solved well. TableStore is ready to use and billed by usage, and indexes can be created at any time, which makes it an excellent fit for managing the metadata of massive e-commerce orders. TableStore is a fully managed, distributed NoSQL data storage service provided by Alibaba Cloud. It offers [massive data storage], [automatic sharding of hot data], [multidimensional retrieval over massive data], and other functions, which naturally address the explosive growth of order data. SearchIndex adds multi-dimensional search and statistics on top of the high availability guaranteed for user data. Different indexes can be created for different scenarios to support different retrieval patterns, and users create and enable an index only when they need it. TableStore itself keeps the synchronized index data consistent, which greatly reduces the user’s workload in scheme design, operation and maintenance, and code development.

Order system page overview based on TableStore

The sample is embedded in the TableStore console, and users can log in to the console to try the system. (New TableStore users need to click to enable the service first.) Note: this sample provides order data at the [hundred-million] scale. Official console address: project example

III. Construction preparation

If the hundred-million-scale order system sample worked well for you and you want to start building your own system, follow the steps below:

1. Enable TableStore

Enable the TableStore service through the console. TableStore works out of the box (postpaid) and is billed by usage, with a free quota provided for function testing. See the official TableStore console and the free quota description.

2. Create an instance

Use the console to create a TableStore instance and select a region that supports SearchIndex. (SearchIndex has not been commercialized yet. It is currently available in Beijing, Shanghai, Shenzhen, and Hangzhou, and will gradually become available in other regions.)

After creating the instance, submit a work order to apply for the SearchIndex invitation test (after commercialization it will be enabled by default; no charge for use).

Invitation test application: submit a work order, choose “TableStore” > “Product function and feature consultation” > “Create work order”, and fill in the application as follows:
Problem description: enter “Application for SearchIndex test”
Confidential information: enter [region + instance name], for example: Shanghai+myInstanceName

3. SDK download

Use an SDK that supports SearchIndex (see the official website). For now, the new feature has been added to the Java, Go, and Node.js SDKs.

java-SDK

<dependency>
    <groupId>com.aliyun.openservices</groupId>
    <artifactId>tablestore</artifactId>
    <version>4.7.4</version>
</dependency>

go-SDK

$ go get github.com/aliyun/aliyun-tablestore-go-sdk

4. Table design

An order system involves more than one order table. It should include a consumer table, salesperson table, product table, supplier table, trade order table, payment order table, and so on. This example keeps things simple and uses four basic tables (consumer table, salesperson table, product table, and trade order table). The order table is designed as follows. Table name: order_contract

Column name	Data type	Index type	Description
_ID (primary key column)	String		MD5(oId), to avoid hotspots
oId	String	KEYWORD	Order number
pName	String	TEXT	Product name. A TEXT index supports fuzzy queries but cannot be sorted
totalPrice	double	DOUBLE	Order price
orderTime	long	LONG	Order time (timestamp)
.	.	.	.
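The primary key in the table design is MD5(oId), chosen to avoid hotspots. A minimal sketch of deriving such a key with the Java standard library follows; the class `OrderKey` and method `md5Hex` are hypothetical names, and UTF-8 encoding with lowercase hex output is an assumption about how the key is formatted.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper that derives the primary key from an order number.
final class OrderKey {
    private OrderKey() {
    }

    // Returns the MD5 digest of the order ID as 32 lowercase hex characters.
    static String md5Hex(String orderId) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(orderId.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(OrderKey.md5Hex("o0000000000"));
    }
}
```

Sequential order numbers such as o0000000000 and o0000000001 share a long prefix and would sort next to each other, so writes would concentrate in one key range. Hashing spreads them across the key space, at the cost of losing range scans by order number.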

IV. Start building (core code)

1. Create table

Four tables are needed: the order table, consumer table, salesperson table, and product table. Users only need to maintain one instance. Create and manage the data tables on the console (they can also be created directly with the SDK).

2. Create index

Create a SearchIndex on each data table for the fields that need to be queried.

3. Data import

Insert some test data (the console sample contains 100 million records; users can insert a small amount of test data themselves through the console):

Order number	Order ID MD5 (primary key)	Consumer ID	Consumer name	Salesperson ID	Salesperson name	Product ID	Product name	Product brand	Product type	Order time	Payment time	Payment status	Unit price	Quantity	Total price
o0000000000	c49f5fd5aba33159accae0d3ecd749a7	c0019	Chen elimination	s0020	The sale of chu ten	p0003004	vivo x21	vivo	Mobile phone	2018-07-17 21:00:00		no	2498.99	2	4997.98

Consumer ID (primary key)	Consumer name	Consumer points	Registration time
c0001	Zhao a elimination	818	2018-07-07 14:33:51

Salesperson ID (primary key)	Salesperson name	Salesperson score	Date of entry
s0001	Sell a zhao	613	2018-07-07 14:27:59

Product ID (primary key)	Product name	Product brand	Product type	Unit price	Listing time
p0001001	iphone 6	apple	Mobile phone	6969.00	2018-07-07 14:44:39

4. Data reading

Data reads fall into two categories:

Primary key read

Reads by primary key use the native TableStore interfaces: getRow, getRange, batchGetRow, and so on. Primary key reads are used for the (automatic) back-lookup from the index. Users can also offer a single-order query page keyed by the primary key (the order MD5); such queries remain very fast even at the hundred-million scale. Primary key queries do not support multi-dimensional retrieval.

Index read

Reads through the index use the Search interface of the new SearchIndex feature. Users are free to design multi-dimensional combined queries over the indexed fields; different query parameters produce different query conditions and sort orders. Currently supported: exact query, range query, prefix query, match query, wildcard query, phrase match query, and word string query, which can be combined through Boolean AND and OR. For example, the combined query for [consumer c0001, orders with total price above 99.99] is as follows:

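import java.util.ArrayList;
import java.util.List;

import com.alicloud.openservices.tablestore.model.ColumnValue;
import com.alicloud.openservices.tablestore.model.search.query.BoolQuery;
import com.alicloud.openservices.tablestore.model.search.query.Query;
import com.alicloud.openservices.tablestore.model.search.query.RangeQuery;
import com.alicloud.openservices.tablestore.model.search.query.TermQuery;

// Conditions that must all match
List<Query> mustQueries = new ArrayList<Query>();

// Exact match: consumer ID is c0001
TermQuery termQuery = new TermQuery();
termQuery.setFieldName("cId");
termQuery.setTerm(ColumnValue.fromString("c0001"));
mustQueries.add(termQuery);

// Range: total price from 99.99 upward
RangeQuery rangeQuery = new RangeQuery();
rangeQuery.setFieldName("totalPrice");
rangeQuery.setFrom(ColumnValue.fromDouble(99.99));
mustQueries.add(rangeQuery);

// Combine both conditions with Boolean AND
BoolQuery boolQuery = new BoolQuery();
boolQuery.setMustQueries(mustQueries);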
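To illustrate what a list of "must" conditions selects, here is a small in-memory sketch that applies the same two conditions to plain Java objects. It does not use the TableStore SDK. The class `MustFilter`, the nested `Order` type, and treating the 99.99 bound as inclusive are all assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// In-memory illustration of AND-combined ("must") conditions; not the SDK itself.
final class MustFilter {
    // Minimal order record with only the fields used by the example query.
    static final class Order {
        final String oId;
        final String cId;
        final double totalPrice;

        Order(String oId, String cId, double totalPrice) {
            this.oId = oId;
            this.cId = cId;
            this.totalPrice = totalPrice;
        }
    }

    // Keeps only the orders that satisfy every predicate in the list.
    static List<Order> filter(List<Order> orders, List<Predicate<Order>> must) {
        List<Order> hits = new ArrayList<>();
        for (Order order : orders) {
            boolean matchesAll = true;
            for (Predicate<Order> condition : must) {
                if (!condition.test(order)) {
                    matchesAll = false;
                    break;
                }
            }
            if (matchesAll) {
                hits.add(order);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Order> orders = Arrays.asList(
                new Order("o1", "c0001", 4997.98),
                new Order("o2", "c0001", 50.00),
                new Order("o3", "c0002", 200.00));
        List<Predicate<Order>> must = new ArrayList<>();
        must.add(o -> o.cId.equals("c0001"));   // counterpart of the term condition
        must.add(o -> o.totalPrice >= 99.99);   // counterpart of the range condition
        for (Order hit : filter(orders, must)) {
            System.out.println(hit.oId);
        }
    }
}
```

The difference in practice is that the index evaluates these conditions without scanning every row, which is what keeps multi-dimensional queries fast at large scale.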

Welcome aboard

With that, the core code of the system is complete. Building an order system on TableStore is quite simple, isn’t it?
