Background
G7 has been providing SaaS fleet-management services to the logistics and transportation industry since 2010. Through continuous innovation and the integration of hardware and software capabilities, G7 is committed to digitizing every truck and building a new ecosystem of intelligent logistics based on real-time perception. G7 offers customers a full range of data services, intelligent safety and operations management, mobile fleet management, digital capacity, and value-added services such as ETC, fuel, and finance.
Today, G7 connects 600,000 trucks that travel 65 million kilometers a day (1,625 laps around the earth’s equator), generating 1.35 billion track points and 22 million vehicle events, with rapid linear growth. G7 produces more than 2 TB of driving, status, consumption, and other vehicle data every day. The rapidly growing number of vehicles and data types, together with complex financial services, pose great challenges to the database in terms of transactions, analytics, scalability, and availability.
For the large volume of vehicle information and track-related data, we currently analyze the raw data with Spark, Hive, and similar tools, then store the results in Alibaba Cloud DRDS to serve basic data interfaces externally. Because the data volume remains large even after cleaning, storage on DRDS is expensive, and its performance on the many OLAP-style queries is unsatisfactory.
Meanwhile, in complex business scenarios such as finance and payments, we face the C and P trade-offs of CAP. Previously, to handle peak write loads requiring strongly consistent transactions, the payment system implemented distributed transactions with 2PC + MySQL XA (each MySQL instance as a participant, with a proxy layer on top as the coordinator). However, this scheme behaves very poorly under network partitions. At the same time, the operations and risk-control systems often need fairly complex queries, forcing a choice between MySQL + ETL + an OLAP database (costly) and tolerating slow queries.
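To make the 2PC + XA scheme above concrete, here is a minimal sketch of the coordinator logic in Python. This is an illustration only, not G7's proxy code; the class names and in-memory "branches" are invented, and the comments map each step to the corresponding XA statement:

```python
# Minimal sketch of a 2PC coordinator over XA-style participants.
# All names are illustrative; a real deployment issues XA statements to MySQL.

class Participant:
    """Simulates one MySQL instance acting as an XA transaction branch."""
    def __init__(self, name):
        self.name = name
        self.prepared = False
        self.data = {}      # durable state
        self.pending = {}   # uncommitted writes in this branch

    def execute(self, key, value):
        self.pending[key] = value          # DML inside XA START ... XA END

    def prepare(self):
        self.prepared = True               # XA PREPARE: branch votes yes
        return True

    def commit(self):
        if not self.prepared:
            raise RuntimeError("commit before prepare")
        self.data.update(self.pending)     # XA COMMIT: make writes durable
        self.pending = {}

    def rollback(self):
        self.pending = {}                  # XA ROLLBACK


class Coordinator:
    """Phase 1: prepare all branches; phase 2: commit only if all voted yes."""
    def __init__(self, participants):
        self.participants = participants

    def commit_transaction(self):
        if all(p.prepare() for p in self.participants):
            for p in self.participants:
                p.commit()
            return True
        for p in self.participants:
            p.rollback()
        return False
```

The weak spot is visible in the structure: between prepare and commit, the coordinator is a single point of failure, and a network partition leaves branches blocked in the prepared state holding locks, which is exactly why the scheme is troublesome under partitions.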
Exploration
The G7 technical team had long been looking for a database that could solve these problems. Besides addressing the issues above, such a database must also be maintainable and easy to migrate to, which translates into the following requirements:
- Compatibility with the MySQL protocol, so the database change is transparent to upper-layer businesses. This is critical, as anyone who has worked on infrastructure upgrades will appreciate.
- Support for MySQL’s master/slave replication mechanism, so the database can be switched over smoothly and gradually, reducing migration risk.
- It must be open source. Making a database stable takes a great deal of effort and time, and problems will inevitably arise along the way. Problems themselves are not frightening; what is frightening is being unable to locate and fix them, and having to rely on “someone else”. A database bug that is a minor issue for “someone else” can be a major disaster for G7’s business. When interests are not fully aligned, we need strong control over the database ourselves.
- Besides being open source, it must be backed by a strong, fully committed technical team or commercial company. Having seen plenty of open source projects die off or stagnate, we believe only a stable, full-time team or company can keep making a database better and better.
With so many constraints and requirements, TiDB + TiSpark quickly caught our eye, and we began an investigation. Through discussions with the TiDB team, we confirmed that, in addition to meeting the requirements above, the following technical details made this solution a viable choice for us:
- The Server and storage-engine concepts of the MySQL architecture are further decoupled into TiDB and TiKV, which improves horizontal scalability.
- As an open source implementation targeting Spanner, it chose Raft instead of Multi-Paxos. Raft is much easier to understand, implement, and test, which makes distributed consensus far less of a concern.
- RocksDB serves as the underlying persistent KV storage, whose single-machine performance and stability are well proven.
- The distributed transaction model based on Google Percolator places high demands on network latency and throughput in cross-region, multi-datacenter deployments, but we currently have no such requirement.
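To give a feel for the Percolator-style model mentioned above, here is a minimal, single-threaded sketch of its two-phase commit over a KV store. Everything here is illustrative and heavily simplified: a plain counter stands in for the timestamp oracle, and lock cleanup and crash recovery are omitted entirely.

```python
# Toy sketch of Percolator-style two-phase commit over a KV store.
# Simplified: single-threaded, counter timestamps, no stale-lock cleanup.

class KVStore:
    def __init__(self):
        self.data = {}    # key -> {start_ts: value}
        self.locks = {}   # key -> primary key holding the lock
        self.writes = {}  # key -> {commit_ts: start_ts}

_ts = 0
def next_ts():
    """Stand-in for the timestamp oracle: monotonically increasing ints."""
    global _ts
    _ts += 1
    return _ts

def prewrite(store, mutations, primary, start_ts):
    """Phase 1: lock every key; abort on a lock or a newer committed write."""
    for key, _ in mutations:
        if key in store.locks:
            return False  # lock conflict
        if any(cts > start_ts for cts in store.writes.get(key, {})):
            return False  # write-write conflict with a later commit
    for key, value in mutations:
        store.locks[key] = primary
        store.data.setdefault(key, {})[start_ts] = value
    return True

def commit(store, mutations, start_ts):
    """Phase 2: record commit timestamps and release the locks."""
    commit_ts = next_ts()
    for key, _ in mutations:
        store.writes.setdefault(key, {})[commit_ts] = start_ts
        store.locks.pop(key, None)
    return commit_ts
```

The point of the sketch is the shape of the protocol: all coordination state (locks, write records) lives in the same KV store as the data, so any node can drive a transaction without a central coordinator, at the cost of extra KV round trips, which is why latency matters in cross-region deployments.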
First experience: the risk-control data platform
The risk-control data platform cleans a large amount of business data and performs moderately complex computations to build each customer’s financial data profile on the G7 platform, which risk-control staff then query to assess a customer’s risk; it must also support fairly complex ad hoc queries. Given the data volume, traditional relational databases could not meet the service’s scalability and OLAP requirements. The business is also internal, so early unfamiliarity with a new database would not affect customers. We therefore decided to build it on TiDB. The risk-control data platform project started in August 2017 and launched its first version in October 2017, serving online users. We started on TiDB RC4, have since upgraded to pre-GA, and plan to move to the GA release soon.
The system architecture is shown below, and the whole process is very simple and efficient.
In practice we still ran into quite a few compatibility problems. To deepen our understanding of TiDB, we got in touch with the TiDB team and actively participated in the project, familiarizing ourselves with the code and fixing some compatibility issues and bugs. Here are some of the problems we fixed along the way:
- Fixed a compatibility issue where COLUMN_TYPE in information_schema.COLUMNS did not include UNSIGNED.
Github.com/pingcap/tid…
- Fixed compatibility of the IGNORE keyword with INSERT, UPDATE, and DELETE.
Github.com/pingcap/tid…
Github.com/pingcap/tid…
Github.com/pingcap/tid…
- Fixed panic bugs in Set and Join.
Github.com/pingcap/tid…
Github.com/pingcap/tid…
- Added SQL_MODE support for ONLY_FULL_GROUP_BY.
Github.com/pingcap/tid…
One incompatibility with MySQL remains. After a transaction has started, if an INSERT hits a primary-key or unique-index conflict, TiDB does not immediately query TiKV (to save a network round trip), so it does not return a conflict error at that point; the conflict is only reported at COMMIT time. Anyone planning to use or follow TiDB should take note. We later checked with the TiDB team, and the official explanation is that TiDB uses an optimistic transaction model in which conflict detection is performed only at commit time.
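The behavior described above can be illustrated with a toy buffered-write transaction. This is a sketch of the optimistic model, not TiDB's code; all names are invented:

```python
# Toy illustration of deferred conflict detection in an optimistic model:
# INSERTs are buffered locally and a duplicate key only surfaces at commit,
# mirroring the TiDB behavior described above (this is not TiDB's code).

class DuplicateKeyError(Exception):
    pass

class Table:
    def __init__(self):
        self.rows = {}  # primary key -> row

class Txn:
    def __init__(self, table):
        self.table = table
        self.buffer = {}  # buffered INSERTs, not yet checked against storage

    def insert(self, pk, row):
        # No storage lookup here: the round trip is skipped, so a
        # duplicate key does NOT raise an error at INSERT time.
        self.buffer[pk] = row

    def commit(self):
        # Conflict detection happens only now, at commit time.
        for pk in self.buffer:
            if pk in self.table.rows:
                raise DuplicateKeyError(pk)
        self.table.rows.update(self.buffer)
```

The practical consequence is that application code written for MySQL, which expects the duplicate-key error on the INSERT statement itself, must be prepared to handle it on COMMIT instead.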
Throughout this first experience, the TiDB team was serious, responsible, and quick in helping us troubleshoot and resolve problems, and provided excellent remote support and operational advice.
Rollout plans for other businesses
At the beginning of 2018, the operations team discussed requirements with each business side, and the businesses’ demand for TiDB grew ever stronger. We are following this path to bring TiDB into more scenarios:
- Use TiDB as a slave of RDS and migrate read traffic to TiDB;
- Start with internal services and gradually migrate write traffic to TiDB;
- Move more OLAP workloads to TiSpark;
- Collaborate on developing TiDB and its peripheral tools.
Tips for getting involved in the TiDB community
- Use GDB and similar tools to get familiar with TiDB’s code structure and logic.
- Start by picking some issues to analyze and try to fix.
- Use flame graphs to find and optimize performance hot spots.
- If you have not read the related papers, read them to deepen your understanding of the system’s principles.
- Actively participate in TiDB community activities and communicate with the TiDB core R&D team.
- If you have suitable business scenarios, try TiDB in them and broaden its application across different scenarios.
G7 welcomes anyone interested in database optimization and development to join us and build better NewSQL products together. Please send your resume to [email protected] and we will contact you as soon as possible.
About the author: Liao Qiang previously worked at Baidu, where he was responsible for Baidu Wallet’s distributed transaction database, infrastructure, and checkout counter. He is currently a technical partner at G7 Huitong Tianxia, responsible for financial product development, operations, and security.