Introduction to Ping++

Ping++ is a leading payment-solution SaaS provider in China. Since the official launch of its aggregate payment product in 2014, Ping++ has been widely recognized by enterprise customers for the streamlined product experience of “integrating payment with 7 lines of code”.

Today, Ping++ continues to expand its services across the broader payment field. It offers three core products: aggregate payment, an account system, and a merchant system, and has solved payment problems for nearly 25,000 enterprise customers in more than 70 segments, including retail, e-commerce, enterprise services, O2O, gaming, live streaming, education, tourism, transportation, finance, and real estate.

Ping++ has been named one of KPMG China’s top 50 Fintech companies for two consecutive years and was included in CB Insights’ global Fintech 250 in 2017. From payment access and transaction processing to business analysis and operations, Ping++ provides customized end-to-end solutions that help enterprises handle the many problems they may face in monetizing their business.

Application scenario of TiDB in Ping++ – data warehouse integration and optimization

The Ping++ data support system consists mainly of four parts: stream computing, report statistics, logging, and data mining. Among them, the data warehouse behind report statistics carries important services such as real-time aggregation, analysis, and statistics over hundreds of millions of transaction records, as well as streaming downloads:

As the business and its requirements expanded, the data warehouse went through several iterations:

  1. Because most of the dimensions joined in business requirements are flexible and frequently changing, we initially used the relational database RDS directly as the data store, with data subscribed from the OLTP systems through our self-developed data subscription platform.

  2. As the business grew, a single large table could no longer support complex query scenarios, so two solutions were introduced to serve data in parallel: ADS, Alibaba Cloud’s OLAP solution, for complex relational multidimensional analysis; and ES (Elasticsearch), a distributed solution for searching massive data.

  3. The two solutions above basically met the business requirements, but some problems remained:

    • ADS: First, the stability of the data service. Alibaba Cloud upgrades the version at irregular intervals, which can delay data by several hours, so real-time service cannot be guaranteed. Second, the cost of scaling. ADS is billed by the number of compute cores, and scaling out requires purchasing the corresponding number of cores, so the cost is neither flexible nor controllable.

    • ES: Its search capability is strong for a single service, but it is not well suited to complex and changing scenarios. Its development and operations cost is relatively high, and it lacks the relational database’s advantage of adapting easily to all kinds of new business requirements.

Therefore, we needed a further round of iteration and consolidation. Because ours is a financial data business, importance and security cannot be compromised, and performance must be guaranteed. After a long evaluation, TiDB, the database developed by PingCAP, became our final choice.

The following core characteristics of TiDB are the main reasons why we chose it as our real-time data warehouse:

  • Highly compatible with MySQL syntax (see the connection sketch after this list);

  • Strong horizontal, elastic scalability;

  • High performance on massive data;

  • High availability with automatic failure recovery;

  • An architecture that meets financial-grade security requirements.
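
Because TiDB speaks the MySQL wire protocol, existing MySQL drivers and tooling can be pointed at a TiDB cluster without application changes. Below is a minimal connectivity sketch using Python’s pymysql; the host, credentials, and schema are hypothetical placeholders rather than our production settings.

```python
# A minimal sketch: TiDB is accessed exactly like MySQL.
# Host/credentials and the schema below are hypothetical examples.
import pymysql

conn = pymysql.connect(
    host="tidb.example.internal",  # any TiDB server node (hypothetical)
    port=4000,                     # TiDB's default MySQL-protocol port
    user="report_reader",
    password="********",
    database="dw",
    charset="utf8mb4",
)

try:
    with conn.cursor() as cur:
        # The same aggregate SQL that ran on RDS/MySQL runs unchanged on TiDB.
        cur.execute(
            """
            SELECT channel, COUNT(*) AS txn_count, SUM(amount) AS total_amount
            FROM charges
            WHERE created_at >= %s
            GROUP BY channel
            """,
            ("2018-01-01",),
        )
        for channel, txn_count, total_amount in cur.fetchall():
            print(channel, txn_count, total_amount)
finally:
    conn.close()
```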

This resulted in the following data support system architecture:

The new solution brings the following improvements and changes to our business and management:

  • Compatibility: integrates multiple existing data sources and responds quickly to new business going online;

  • Performance: provides reliable performance for transaction analysis scenarios;

  • Stability: higher stability and more convenient cluster operation and maintenance;

  • Cost: reduced resource and operations costs.

TiDB architecture analysis and production status

TiDB is PingCAP’s open source distributed NewSQL database, inspired by Google’s Spanner/F1 papers. As the conceptual model of Google Spanner below shows, it envisions a database system that shards data and distributes it across multiple physical zones, with a placement driver scheduling the data shards and a TrueTime service enabling atomic schema changes and transactions, so that external clients can be served with consistent transactions. This makes a truly global OLTP & OLAP database system possible.

Let’s analyze the overall architecture of TiDB through the following figure:

As the figure shows, TiDB is a faithful realization of Spanner’s concepts. A TiDB cluster consists of three components: TiDB, PD, and TiKV.

  • TiKV Server: responsible for data storage; a distributed key-value storage engine that supports transactions.

  • PD Server: responsible for cluster management and scheduling, such as maintaining the routing information that maps data to TiKV locations and balancing data across TiKV nodes (see the PD query sketch after this list).

  • TiDB Server: responsible for the SQL layer; it uses PD to locate the TiKV nodes that hold the actual data and executes SQL operations against them.
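
To make PD’s role more concrete: PD exposes an HTTP API that reports cluster membership and the status of the TiKV stores it schedules data across. The sketch below issues read-only queries against a hypothetical PD address using PD’s documented /pd/api/v1 endpoints; field names are read defensively since the response format can vary by version.

```python
# Illustrative only: query PD's HTTP API to see the cluster topology it manages.
# The PD address is a hypothetical placeholder.
import json
import urllib.request

PD_ADDR = "http://pd.example.internal:2379"  # hypothetical PD endpoint

def pd_get(path):
    """Fetch a JSON document from PD's /pd/api/v1 HTTP API."""
    with urllib.request.urlopen(PD_ADDR + path) as resp:
        return json.loads(resp.read().decode("utf-8"))

# TiKV stores (data nodes) that PD balances data across.
stores = pd_get("/pd/api/v1/stores")
for item in stores.get("stores", []):
    store = item.get("store", {})
    print(store.get("id"), store.get("address"), store.get("state_name"))

# PD cluster members responsible for scheduling decisions.
members = pd_get("/pd/api/v1/members")
for m in members.get("members", []):
    print(m.get("name"), m.get("client_urls"))
```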

Production cluster deployment:

The cluster has been running stably for several months, and analysis performance for complex reports has improved significantly. After replacing ADS and ES, operation and maintenance costs have been greatly reduced.

Future plans for TiDB in Ping++

  1. Trying out TiSpark

TiSpark is an OLAP solution that runs Spark SQL directly on top of the distributed storage engine TiKV. Our next step is to evaluate more complex scenarios with higher performance requirements in combination with TiSpark.
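
As a rough illustration of what that evaluation could look like, the sketch below runs an analytical Spark SQL query over data stored in TiKV through TiSpark. The PD addresses, database, and table are hypothetical, and the configuration keys and API details vary between TiSpark versions, so treat this as an outline rather than a tested setup.

```python
# A rough TiSpark sketch (PySpark). PD addresses and table names are hypothetical,
# and the configuration keys below follow the TiSpark docs for the version we read;
# they may differ in other releases.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("ping-tispark-eval")
    # TiSpark reads region data from TiKV directly and locates regions via PD.
    .config("spark.tispark.pd.addresses", "pd0:2379,pd1:2379,pd2:2379")
    .config("spark.sql.extensions", "org.apache.spark.sql.TiExtensions")
    .getOrCreate()
)

# A heavier multidimensional aggregation than we would want to push through
# the TiDB SQL layer during peak hours.
result = spark.sql(
    """
    SELECT channel, DATE(created_at) AS day,
           COUNT(*) AS txn_count, SUM(amount) AS total_amount
    FROM dw.charges
    GROUP BY channel, DATE(created_at)
    ORDER BY day, channel
    """
)
result.show(20, truncate=False)
```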

  2. OLTP scenarios

At present, the data in TiDB comes from our subscription platform, which subscribes to data from RDS and DRDS, so the system is fairly complex. TiDB itself has excellent distributed transaction capabilities and is fully capable of HTAP workloads.

TiKV replicates data based on the Raft protocol to ensure consistency across multiple replicas, which gives it a clear edge over today’s mainstream distributed architectures such as MyCat and DRDS. Database availability is also higher. For example, we upgraded the disks (case records) of every host in the production TiDB cluster, which involved data migration and a restart of each node, yet the relevant businesses noticed nothing: the operation was simple and the process controllable, something that is hard to achieve with a traditional database architecture.

We plan to have TiDB gradually host some OLTP services.
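
To illustrate what such an OLTP migration would rely on, the sketch below performs an ordinary multi-statement transaction through the MySQL protocol; TiDB commits it atomically across TiKV replicas behind the scenes. The connection details and schema are hypothetical placeholders.

```python
# A sketch of a multi-statement OLTP transaction on TiDB via the MySQL protocol.
# Connection details and the schema are hypothetical placeholders.
import pymysql

conn = pymysql.connect(
    host="tidb.example.internal", port=4000,
    user="app", password="********", database="pay",
    autocommit=False,  # we manage the transaction explicitly
)

try:
    with conn.cursor() as cur:
        # Both writes commit atomically, even though the underlying rows may
        # live in different TiKV regions on different hosts.
        cur.execute(
            "INSERT INTO charges (order_no, channel, amount, status) "
            "VALUES (%s, %s, %s, %s)",
            ("ORD-20180601-0001", "alipay", 9900, "pending"),
        )
        cur.execute(
            "UPDATE orders SET status = %s WHERE order_no = %s",
            ("charging", "ORD-20180601-0001"),
        )
    conn.commit()
except Exception:
    conn.rollback()  # the distributed transaction either fully commits or not at all
    raise
finally:
    conn.close()
```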

Suggestions for TiDB and the official replies

  1. DDL optimization: TiDB currently implements online, non-blocking DDL, but in actual use we found that a large number of index key-value pairs are generated during DDL, which raises the load on the affected hosts and increases the performance risk to the cluster. In practice, DDL on large tables is infrequent and rarely time-critical, so safety should take priority. Recommended optimization points:

    • Could the hard-coded values of the defaultTaskHandleCnt and defaultWorkers variables in the source code be exposed as configuration items?

    • Could a pause capability be added to the DDL process, similar to the pt-osc tool?

  2. DML optimization: Improper SQL is bound to appear on the service side from time to time; a full table scan, for example, can affect the performance of the entire cluster. Could a self-protection mechanism, such as resource isolation or circuit breaking, be added for such cases? (A client-side stopgap is sketched at the end of this section.)

We consulted TiDB’s official technical staff about the above issues, and their replies are as follows:

  • The Add Index operation is being optimized to run at a lower priority, which ensures the stability of online services.

  • The ability to dynamically adjust the concurrency of Add Index operations is planned for version 1.2.

  • A DDL pause feature is planned for a future release.

  • Full table scans run at low priority by default to minimize their impact on point queries. In the future, a user-level priority will be introduced to separate the query priorities of different users, reducing the impact of offline workloads on online services.
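
Until such user-level isolation is available, one stopgap we can apply on the service side is to run EXPLAIN on risky ad-hoc SQL and reject plans that contain a full table scan. The sketch below is a simplified client-side guard, not a TiDB feature; the operator names it checks are assumptions that differ across TiDB versions.

```python
# A simplified client-side guard (not a TiDB feature): reject ad-hoc SQL whose
# plan contains a full table scan. The operator names below are assumptions;
# they differ across TiDB versions, so adjust them for your deployment.
import pymysql

FULL_SCAN_MARKERS = ("TableFullScan", "TableScan")  # assumed EXPLAIN operator names

def plan_has_full_scan(conn, sql, params=None):
    """Return True if TiDB's EXPLAIN output for the query contains a full scan operator."""
    with conn.cursor() as cur:
        cur.execute("EXPLAIN " + sql, params)
        plan_rows = cur.fetchall()
    plan_text = "\n".join(str(col) for row in plan_rows for col in row)
    return any(marker in plan_text for marker in FULL_SCAN_MARKERS)

def run_guarded(conn, sql, params=None):
    """Execute the query only if its plan does not require scanning a whole table."""
    if plan_has_full_scan(conn, sql, params):
        raise RuntimeError("rejected: query plan contains a full table scan")
    with conn.cursor() as cur:
        cur.execute(sql, params)
        return cur.fetchall()

# Hypothetical usage against a hypothetical host and schema.
conn = pymysql.connect(host="tidb.example.internal", port=4000,
                       user="report_reader", password="********", database="dw")
try:
    rows = run_guarded(conn,
                       "SELECT COUNT(*) FROM charges WHERE order_no = %s",
                       ("ORD-20180601-0001",))
    print(rows)
finally:
    conn.close()
```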

Finally, we would like to thank the PingCAP team for their support in every aspect of bringing TiDB online at Ping++!

✎ Author: Song Tao, Ping++ DBA