With the arrival of spring 2021 and PingCAP’s sixth birthday, TiDB 5.0 is officially GA. After nearly a year of intensive development and polish, TiDB 5.0 is a milestone release for enterprise core scenarios. It significantly improves performance and stability, giving it stronger service capabilities for core financial OLTP workloads. It also introduces an MPP architecture on top of TiFlash, the original HTAP engine. This makes real-time, interactive BI a reality for many enterprises, provides a one-stack data service foundation for high-growth enterprises and digital innovation scenarios, and brings HTAP into the digital scenarios of more large enterprises.

In addition, TiDB 5.0 adds a number of enterprise-level features, integrates with a richer big data ecosystem, and provides a simpler operation and maintenance experience, helping enterprises build and scale applications on TiDB more efficiently. Adhering to a development path of open source, openness, and innovation, TiDB will continue to build an “integrated, simplified and reliable” distributed database platform for enterprises.

High performance: breakthroughs in a number of performance indicators

Compared to TiDB 4.0, TiDB 5.0 delivers 20% to 80% better performance in OLTP benchmarks such as Sysbench and TPC-C, thanks to clustered indexes, async commit transactions, and other features. Here is the data for some common performance test scenarios:

Configuration information

Component   Instance type (AWS)   Count
PD          m5.xlarge             3
TiDB        c5.4xlarge            3
TiKV        i3.4xlarge            3

Load information

16 tables, each with 10 million rows of data

Performance data

TiDB 5.0 also improves the architecture for AP scenarios by adding the TiFlash MPP computation model. TPC-H results show that, with the same resources, the TiDB 5.0 MPP engine is overall two to three times as fast as Greenplum 6.15.0 and Apache Spark 3.1.1, and up to eight times as fast on some queries. Hardware specifications and test details of the TiDB MPP test under TPC-H 100 can be found here.

Stability: the standard deviation of TPC-C QPS jitter is less than or equal to 2%

Compared to TiDB 4.0, TiDB 5.0 is considerably more stable. By optimizing how I/O, network, CPU, and memory resources are used during TiDB scheduling, it greatly reduces the performance jitter caused by resource preemption and delay. In the TPC-C OLTP benchmark, the standard deviation of TPC-C QPS jitter is less than or equal to 2%.

Configuration information

Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz, 40 cores, 189 GB memory, 3 TB SSD
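
One plausible way to read the jitter figure above is as the standard deviation of sampled QPS relative to its mean. The sketch below computes that ratio. The sample values are made up for illustration, and the use of the population standard deviation is an assumption, not the benchmark's documented method.

```python
import statistics


def qps_jitter(samples):
    """Return the standard deviation of QPS samples as a fraction of their mean."""
    mean = statistics.fmean(samples)
    return statistics.pstdev(samples) / mean


# Hypothetical per-interval QPS readings, not measured TiDB data.
samples = [10000, 10100, 9900, 10050, 9950]

jitter = qps_jitter(samples)
within_target = jitter <= 0.02  # the 2% threshold quoted in the text

print(f"jitter = {jitter:.2%}, within target: {within_target}")
```

A stricter check might use the sample standard deviation (statistics.stdev) or a sliding window, depending on how the benchmark collects its readings.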

Ease of use: more accurate, more efficient, more comprehensive

More accurate performance troubleshooting

When you troubleshoot SQL performance problems, you need detailed information to determine the cause. TiDB 5.0 lets you view more detailed diagnostic information directly through EXPLAIN, reducing the need to piece it together from logs and monitoring data, which makes troubleshooting more efficient.

More efficient cluster O&M

In TiDB 5.0, TiUP supports one-click environment checks with recommended fixes, can automatically repair environmental problems found during the checks, and optimizes the operational logic for multi-cluster deployments, so DBAs can deploy standard TiDB production clusters more quickly. The new version of TiUP also provides an upgrade experience that is nearly transparent to services: performance jitter during an upgrade is limited to 10 to 30 seconds.

TiDB 5.0 adds automatic capture and binding of query plans for use during upgrades. The system automatically captures and binds the latest query plan and stores it in a system table, so that SQL statements executed during the upgrade still follow the bound execution plan and performance stays stable. After the upgrade is complete, the DBA can export the bound query plans, analyze them, and decide whether to delete the bindings.
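
The capture-then-review workflow around an upgrade can be sketched as follows. This is a minimal illustration that assumes the tidb_capture_plan_baselines system variable and the SHOW GLOBAL BINDINGS statement. The helper names and the RecordingCursor stand-in are hypothetical, and a real script would pass a cursor from a MySQL-compatible driver instead.

```python
# SQL text is assumed from TiDB's SQL plan management features.
ENABLE_CAPTURE = "SET GLOBAL tidb_capture_plan_baselines = ON"
SHOW_BINDINGS = "SHOW GLOBAL BINDINGS"


def enable_baseline_capture(cursor):
    """Run before the upgrade so the latest plans are captured and bound."""
    cursor.execute(ENABLE_CAPTURE)


def export_bindings(cursor):
    """Run after the upgrade to list the bindings for review."""
    cursor.execute(SHOW_BINDINGS)
    return cursor.fetchall()


class RecordingCursor:
    """Hypothetical stand-in for a DB-API cursor; it only records statements."""

    def __init__(self):
        self.statements = []

    def execute(self, sql):
        self.statements.append(sql)

    def fetchall(self):
        return []  # a real cursor would return the binding rows


cursor = RecordingCursor()
enable_baseline_capture(cursor)     # before the upgrade
bindings = export_bindings(cursor)  # after the upgrade
print(cursor.statements)
```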

More comprehensive SQL tuning

TiDB 5.0 supports invisible indexes. When debugging and choosing the optimal index, DBAs can use SQL statements to set an index to visible or invisible, which avoids resource-consuming operations such as DROP INDEX or ADD INDEX.
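
As a rough illustration, the statement that toggles index visibility can be built with a small helper. The sketch assumes the MySQL 8.0-style ALTER TABLE ... ALTER INDEX syntax. The helper name and the table and index names are hypothetical, and identifiers are not escaped beyond backtick quoting.

```python
def set_index_visibility(table, index, visible):
    """Build the ALTER TABLE statement that makes an index visible or invisible."""
    state = "VISIBLE" if visible else "INVISIBLE"
    return f"ALTER TABLE `{table}` ALTER INDEX `{index}` {state}"


# Hypothetical table and index names.
hide_stmt = set_index_visibility("orders", "idx_customer", visible=False)
show_stmt = set_index_visibility("orders", "idx_customer", visible=True)

print(hide_stmt)
print(show_stmt)
```

Hiding an index first and watching query performance gives a cheap way to test whether dropping it would be safe, since making it visible again does not rebuild anything.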

During performance tuning or routine operation and maintenance in TiDB 5.0, users can choose an optimized SQL statement based on actual needs or on EXPLAIN ANALYZE tests, and then use SQL binding to bind it to the SQL statement executed by business code. This keeps execution plans, and therefore performance, stable.
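
The binding step can be sketched as two statement builders. This assumes the CREATE ... BINDING FOR ... USING ... and DROP ... BINDING FOR ... statement forms. The helper names and the example queries, table, and index are hypothetical.

```python
def create_binding(original_sql, optimized_sql, scope="GLOBAL"):
    """Build a statement that binds the optimized form to the original statement."""
    if scope not in ("GLOBAL", "SESSION"):
        raise ValueError("scope must be GLOBAL or SESSION")
    return f"CREATE {scope} BINDING FOR {original_sql} USING {optimized_sql}"


def drop_binding(original_sql, scope="GLOBAL"):
    """Build a statement that removes the binding for the original statement."""
    if scope not in ("GLOBAL", "SESSION"):
        raise ValueError("scope must be GLOBAL or SESSION")
    return f"DROP {scope} BINDING FOR {original_sql}"


# Hypothetical statement as issued by business code, and a hinted variant.
original = "SELECT * FROM orders WHERE customer_id = 42"
optimized = "SELECT * FROM orders USE INDEX (idx_customer) WHERE customer_id = 42"

test_stmt = "EXPLAIN ANALYZE " + optimized  # measure the candidate first
bind_stmt = create_binding(original, optimized)
unbind_stmt = drop_binding(original)

print(test_stmt)
print(bind_stmt)
print(unbind_stmt)
```

Because the binding lives in the database, the business code keeps sending the original statement unchanged.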

Data ecosystem: new data migration, data import, and data sharing components make it easier to use TiDB in heterogeneous environments

Data migration

Data migration tools support AWS S3 (and other S3-compatible storage services) as the intermediate medium for data migration and can initialize Aurora snapshot data directly into TiDB, which gives users more options for migrating data from AWS S3 or Aurora to TiDB.

The data import tool TiDB Lightning optimizes import performance for TiDB clusters with the DBaaS AWS T1.standard configuration (or equivalent). Test results show that when TiDB Lightning imports 1 TB of TPC-C data into TiDB, performance improves by 40%, from 254 GiB/h to 366 GiB/h.

Data sharing

TiCDC integrates with Kafka Connect (Confluent Platform), so TiDB data changes can be synchronized through Kafka connectors to other relational or non-relational systems such as Kafka, Hadoop, and Oracle. This helps enterprises move business data into heterogeneous databases and form a closed data loop.

TiCDC supports data replication between multiple TiDB clusters and can be used for data backup, disaster recovery, and data aggregation among multiple TiDB clusters.

Enterprise features: enhanced across the board

Transactions

In pessimistic transaction mode, if a table involved in a transaction undergoes a concurrent DDL operation or schema version change, the system automatically updates the transaction's schema version to the latest one so that the transaction can commit successfully.

High availability and disaster recovery

TiDB 5.0 introduces the Raft Joint Consensus algorithm, which combines the “add” and “delete” steps of a Region member change into a single operation that is sent to all Region members, improving availability during the change. While the change is in progress, the Region is in an intermediate state; if any of the members being changed fails, the system remains available.
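
The intermediate state can be illustrated with the quorum rule that joint consensus is generally described as using: a decision needs a majority of the old membership and a majority of the new membership at the same time. The sketch below is a simplified model of that rule only, not TiKV's implementation. The peer names and function names are hypothetical.

```python
def has_majority(acks, members):
    """True if more than half of `members` appear in `acks`."""
    return len(acks & members) > len(members) // 2


def joint_quorum(acks, old_members, new_members):
    """In the joint state, both the old and the new membership must agree."""
    return has_majority(acks, old_members) and has_majority(acks, new_members)


old_members = {"peer1", "peer2", "peer3"}
new_members = {"peer1", "peer2", "peer4"}  # replace peer3 with peer4 in one change

# The newly added peer4 is down, yet both memberships still reach a majority.
survives_new_peer_failure = joint_quorum(
    {"peer1", "peer2", "peer3"}, old_members, new_members
)

# Only one peer responds, so neither membership has a majority.
survives_with_one_peer = joint_quorum({"peer1"}, old_members, new_members)

print(survives_new_peer_failure, survives_with_one_peer)
```

In this model the failure of a single member being added or removed does not block progress, which matches the availability claim for the intermediate state.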

Security compliance

To meet enterprise security compliance requirements such as the General Data Protection Regulation (GDPR), TiDB supports desensitization (redaction) of sensitive information, such as ID card and credit card numbers, in the error messages and logs it outputs.
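
To make the idea of desensitization concrete, the sketch below replaces literal values in a statement with a placeholder before the text is written to a log. It is a generic illustration with hypothetical names and simple regular expressions, and it does not reproduce TiDB's own redaction logic.

```python
import re

# Single-quoted string literals, allowing backslash escapes inside them.
_STRING_LITERAL = re.compile(r"'(?:[^'\\]|\\.)*'")
# Standalone integer or decimal literals (digits inside identifiers are skipped).
_NUMBER_LITERAL = re.compile(r"\b\d+(?:\.\d+)?\b")


def redact_literals(message):
    """Replace string and numeric literals with '?' so user data stays out of logs."""
    message = _STRING_LITERAL.sub("?", message)
    return _NUMBER_LITERAL.sub("?", message)


# Hypothetical statement containing a card number.
statement = "INSERT INTO cards (id, card_no) VALUES (7, '4111-1111-1111-1111')"
redacted = redact_literals(statement)

print(redacted)
```

String literals are replaced first so that digits inside a quoted value are never matched separately by the numeric pattern.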

In addition, the new version further optimizes the memory management module to track the memory usage of aggregation functions, reducing the risk of OOM. In terms of SQL functionality, TiDB 5.0 supports the INTERSECT and EXCEPT operators and List and List COLUMNS partitioned tables. For character sets and collations, it adds support for the utf8mb4_unicode_ci and utf8_unicode_ci collations.
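
The semantics of the INTERSECT and EXCEPT operators can be tried out without a cluster. The sketch below uses Python's built-in sqlite3 module purely as a stand-in, since SQLite implements the same standard set operators. The table names and data are hypothetical, and nothing here exercises TiDB itself.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for a SQL engine.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE t1 (a INTEGER);
    CREATE TABLE t2 (a INTEGER);
    INSERT INTO t1 VALUES (1), (2), (3);
    INSERT INTO t2 VALUES (2), (3), (4);
    """
)

# Rows present in both tables.
in_both = [row[0] for row in conn.execute(
    "SELECT a FROM t1 INTERSECT SELECT a FROM t2 ORDER BY a"
)]

# Rows present in t1 but not in t2.
only_in_t1 = [row[0] for row in conn.execute(
    "SELECT a FROM t1 EXCEPT SELECT a FROM t2 ORDER BY a"
)]

print(in_both, only_in_t1)
conn.close()
```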

Conclusion

As an enterprise-level open source distributed database, TiDB 5.0 makes great progress in performance, stability, ease of use, high availability, and security compliance, and adds several enterprise-level features that meet the need for one-stack real-time data analytics on top of OLTP at scale. With its enhanced HTAP capabilities, it will help enterprise users accelerate their digital transformation and upgrading.

Download it today!

Download TiDB 5.0

TiDB 5.0 GA Release Notes

Special thanks to the entire TiDB developer and user communities for their contributions to TiDB. Since the release of TiDB 4.0, a total of 538 contributors have submitted 12,513 PRs to help us complete this milestone release for enterprise core scenarios. How can TiDB keep iterating and evolving in the best scenarios of this era? We believe this is possible only with the ultimate openness: open source, open communities, open ecosystems and open minds!