On November 30, 2018, TiDB 2.1 GA was released. Compared with version 2.0, this release brings many improvements in system stability, performance, compatibility, and ease of use.

TiDB

SQL optimizer

  • Optimize the range selection of Index Join to improve performance
  • Optimize the outer table selection of Index Join, using the table with fewer estimated rows as the outer table
  • Extend the Join Hint TIDB_SMJ so that Merge Join can be used even when no suitable index is available
  • Enhance the Join Hint TIDB_INLJ to allow specifying the inner table of the join
  • Optimize correlated subqueries, including pushing down Filter and extending the index selection range, which improves the efficiency of some queries by orders of magnitude
  • Support using Index Hint and Join Hint in UPDATE and DELETE statements
  • Support pushing down more functions: ABS/CEIL/FLOOR/IS TRUE/IS FALSE
  • Optimize the constant folding algorithm for the built-in functions IF and IFNULL
  • Optimize the output format of the EXPLAIN statement, using a hierarchical structure to show the upstream and downstream relationships between operators
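
To illustrate what constant folding for IF and IFNULL means, the following is a minimal sketch in Python. It is not TiDB's implementation: the Const, Column, and Func classes and the fold function are hypothetical names introduced here, and only numeric constants and NULL are considered.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Const:
    value: Any  # None stands for SQL NULL (hypothetical representation)


@dataclass(frozen=True)
class Column:
    name: str


@dataclass(frozen=True)
class Func:
    name: str
    args: tuple


def is_true(value):
    # For numeric constants, NULL and 0 count as false in an IF condition.
    return value is not None and value != 0


def fold(expr):
    """Fold IF/IFNULL calls whose first argument is a constant."""
    if not isinstance(expr, Func):
        return expr
    # Fold the arguments first so nested calls collapse bottom-up.
    args = tuple(fold(a) for a in expr.args)
    if expr.name == "IF" and isinstance(args[0], Const):
        return args[1] if is_true(args[0].value) else args[2]
    if expr.name == "IFNULL" and isinstance(args[0], Const):
        return args[1] if args[0].value is None else args[0]
    return Func(expr.name, args)
```

In this sketch, IF(1, a, b) folds to the column a before execution, while IFNULL(c, 1) is left unchanged because its first argument is only known at run time.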

SQL execution engine

  • Refactor all aggregate functions and improve the execution efficiency of the Stream and Hash aggregation operators
  • Implement the parallel Hash Aggregate operator, improving performance by 350% in some scenarios
  • Implement the parallel Project operator, improving performance by 74% in some scenarios
  • Read the data of the Inner table and Outer table of Hash Join concurrently to improve execution performance
  • Optimize the execution speed of the REPLACE INTO statement, improving performance by 10 times
  • Optimize the memory usage of time types, reducing the memory usage of time type data by half
  • Optimize point select performance, improving Sysbench point select efficiency by 60%
  • Improve the performance of inserting into and updating wide tables by nearly 20 times
  • Support configuring the upper limit of memory usage for a single query in the configuration file
  • Optimize the execution of Hash Join: when the join type is Inner Join or Semi Join and the inner table is empty, return the result immediately without reading data from the outer table
  • Support the EXPLAIN ANALYZE statement to view runtime statistics during query execution, such as the running time of each operator and the number of rows it returns
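
The Hash Join shortcut for an empty inner table can be sketched as follows. This is an illustrative Python model, not TiDB's executor: the hash_join function, its parameters, and the dictionary row format are assumptions made for the example.

```python
from collections import defaultdict


def hash_join(outer_rows, inner_rows, outer_key, inner_key, join_type="inner"):
    """Join two row streams (dicts) on a key; the inner side builds the hash table."""
    if join_type not in ("inner", "semi"):
        raise ValueError("this sketch only covers inner and semi joins")

    # Build phase: hash every inner row by its join key.
    table = defaultdict(list)
    for row in inner_rows:
        table[row[inner_key]].append(row)

    # Shortcut: with an empty inner table no outer row can match,
    # so return immediately without consuming the outer stream.
    if not table:
        return []

    # Probe phase: look up each outer row in the hash table.
    result = []
    for row in outer_rows:
        matches = table.get(row[outer_key], [])
        if join_type == "semi":
            if matches:
                result.append(row)
        else:
            result.extend({**row, **m} for m in matches)
    return result
```

Passing the outer side as a generator makes the effect visible: when inner_rows is empty, the generator is never advanced.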

Statistics

  • Support enabling automatic ANALYZE of statistics only during a specified period of the day
  • Support automatically updating table statistics based on query feedback
  • Support configuring the number of buckets in the histogram with the ANALYZE TABLE WITH BUCKETS statement
  • Optimize the algorithm that estimates Row Count from histograms for queries that mix equality conditions and range conditions
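
Restricting automatic ANALYZE to a daily time window comes down to a check like the one sketched below. The function name and parameters are hypothetical and say nothing about how TiDB configures or evaluates the window; the sketch only shows the logic, including a window that crosses midnight.

```python
from datetime import time


def in_analyze_window(now, start, end):
    """Return True if `now` falls inside the daily window [start, end]."""
    if start <= end:
        # Window within a single day, for example 01:00 to 05:00.
        return start <= now <= end
    # Window that wraps around midnight, for example 22:00 to 06:00.
    return now >= start or now <= end
```

For example, with a window from 22:00 to 06:00, a check at 23:30 or 02:00 passes, and a check at 12:00 does not.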

Expression

  • Support for built-in functions:
    • json_contains
    • json_contains_path
    • encode/decode

Server

  • Support queuing conflicting transactions locally within a single tidb-server instance to optimize performance in scenarios with frequent transaction conflicts
  • Support Server Side Cursor
  • Add HTTP management interfaces to:
    • Scatter the Regions of a table across the TiKV cluster
    • Control whether to enable the general log
    • Change the log level online
    • Query TiDB cluster information
  • Add the auto_analyze_ratio system variable to control the threshold for automatic ANALYZE
  • Add the tidb_retry_limit system variable to control the number of automatic transaction retries
  • Add the tidb_disable_txn_auto_retry system variable to control whether transactions are automatically retried
  • Support using the admin show slow statement to get slow query statements
  • Add the tidb_slow_log_threshold environment variable to dynamically set the slow log threshold
  • Add the tidb_query_log_max_len environment variable to dynamically set the length at which the original SQL statement is truncated in the log

DDL

  • Support executing Add Index statements in parallel with other DDL statements, so that time-consuming Add Index operations do not block other operations
  • Optimize the execution speed of Add Index, greatly improving it in some scenarios
  • Support the select tidb_is_ddl_owner() statement to determine whether a TiDB instance is the DDL Owner
  • Support the ALTER TABLE FORCE syntax
  • Support the ALTER TABLE RENAME KEY TO syntax
  • Add the table name and database name to the output of Admin Show DDL Jobs
  • Support using the ddl/owner/resign HTTP interface to release the DDL Owner and start a new DDL Owner election

Compatibility

  • Support more MySQL syntax
  • Support the ALL parameter for the BIT aggregate functions
  • Support the SHOW PRIVILEGES statement
  • Support the CHARACTER SET syntax in the LOAD DATA statement
  • Support the IDENTIFIED WITH syntax in the CREATE USER statement
  • Support the LOAD DATA IGNORE LINES statement
  • Return more accurate information from the Show ProcessList statement

PD

Usability optimization

  • Introduce a TiKV version control mechanism to support compatible rolling upgrades of the cluster
  • Enable Raft PreVote among PD nodes to avoid re-election when a node rejoins after network isolation
  • Enable raft learner to reduce the risk of data unavailability caused by downtime during scheduling
  • TSO allocation is no longer affected by system clock rollback
  • Support the Region merge feature to reduce metadata overhead

Scheduler optimization

  • Optimize the handling of Down Stores to speed up replication after downtime
  • Optimize the hotspot scheduler to adapt better when traffic statistics jitter
  • Optimize the startup of the Coordinator to reduce unnecessary scheduling caused by PD restarts
  • Fix the issue that the Balance Scheduler frequently schedules small Regions
  • Optimize Region merge to take the number of rows in a Region into account
  • Add switches to control scheduling policies
  • Improve the scheduling simulator and add more scheduling scenario simulations

API and operations tools

  • Add the GetPrevRegion interface to support the TiDB Reverse Scan feature
  • Add the BatchSplitRegion interface to speed up TiKV Region splitting
  • Add the GCSafePoint interface to support concurrent distributed GC in TiDB
  • Add the GetAllStores interface to support concurrent distributed GC in TiDB
  • pd-ctl adds support for:
    • Using statistics for Region split
    • Calling jq to format the JSON output
    • Querying the Region information of a specified store
    • Querying the topN Region list sorted by version
    • Querying the topN Region list sorted by size
    • More accurate TSO decoding
  • pd-recover no longer requires the max-replica parameter
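
As background for the TSO decoding item above, here is a minimal Python sketch of how a TSO value can be split into its parts. It assumes the commonly described layout in which the low 18 bits hold a logical counter and the remaining high bits hold a physical timestamp in milliseconds since the Unix epoch. That layout is not stated in these notes, and decode_tso is a hypothetical helper, not the pd-ctl code.

```python
from datetime import datetime, timedelta, timezone

LOGICAL_BITS = 18  # assumed width of the logical counter
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def decode_tso(ts):
    """Split a TSO integer into (physical_ms, logical, wall_clock_utc)."""
    physical_ms = ts >> LOGICAL_BITS
    logical = ts & ((1 << LOGICAL_BITS) - 1)
    # Integer millisecond arithmetic avoids floating-point rounding.
    wall_clock = EPOCH + timedelta(milliseconds=physical_ms)
    return physical_ms, logical, wall_clock
```

Under that assumption, a TSO built as (1543536000000 << 18) | 5 decodes to the physical time 2018-11-30 00:00:00 UTC with logical counter 5.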

Monitoring

  • Add Filter-related monitoring
  • Add etcd Raft state machine monitoring

Performance optimization

  • Optimize the performance of Region heartbeat processing to reduce memory overhead caused by heartbeat
  • Optimize Region Tree performance
  • Optimize the performance of computing hotspot statistics

TiKV

Coprocessor

  • Add support for a large number of built-in functions
  • Add the Coprocessor ReadPool to improve the concurrency of request processing
  • Fix time function parsing issues and time zone related issues
  • Optimize memory usage for pushed-down aggregation

Transaction

  • Optimize the read logic and memory efficiency of MVCC, improving scan performance and doubling the performance of counting a whole table compared with version 2.0
  • Fold consecutive Rollback records in MVCC to ensure read performance
  • Add the UnsafeDestroyRange API to quickly reclaim space when a table or index is dropped
  • Isolate the GC module to reduce its impact on normal writes
  • Support the upper bound in the kv_scan command

Raftstore

  • Optimize the Snapshot file writing process to avoid RocksDB stalls
  • Add LocalReader threads to handle read requests, reducing read latency
  • Support BatchSplit to avoid oversized Regions caused by heavy writes
  • Support Region Split based on statistics to reduce I/O overhead
  • Support Region Split based on the number of keys to improve the concurrency of index scans
  • Optimize part of the Raft message processing to avoid unnecessary delays caused by Region Split
  • Enable the PreVote feature to reduce the impact of network isolation on services

Storage engine

  • Fix a bug in RocksDB CompactFiles that may affect data imported by Lightning
  • Upgrade RocksDB to v5.15 to fix the issue that snapshot files may be corrupted
  • Optimize IngestExternalFile to avoid flushes blocking writes

tikv-ctl

  • Add the ldb command to troubleshoot RocksDB issues
  • The compact command supports specifying whether to compact data in the bottommost level

Tools

  • TiDB-Lightning, a tool for fast full data import
  • Support the new version of TiDB-Binlog

Upgrade Compatibility

  • Because the new version updates the storage engine, you cannot roll back to 2.0.x or an earlier version after the upgrade
  • The new version enables raft learner by default. If the cluster is upgraded from 1.x to 2.1, either stop the cluster for the upgrade, or upgrade TiKV first and upgrade PD only after TiKV is finished
  • Before upgrading from a version earlier than 2.0.6 to 2.1.0, check whether there are ongoing DDL operations in the cluster, especially time-consuming Add Index operations
  • Because 2.1 enables parallel DDL, clusters running a version earlier than 2.0.1 cannot be upgraded to 2.1 with a rolling upgrade. Choose one of the following two options:
    • Stop the cluster and upgrade directly from the version earlier than 2.0.1 to 2.1
    • Do a rolling upgrade to 2.0.1 or a later 2.0.x version, and then a rolling upgrade to 2.1
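
The upgrade rules above can be condensed into a small pre-upgrade checklist. The Python sketch below only restates those rules. The function names and message texts are hypothetical, and it does not inspect a real cluster.

```python
def parse_version(text):
    """Turn a version string such as '2.0.3' or 'v2.0.3' into a 3-part tuple."""
    parts = [int(p) for p in text.lstrip("v").split(".")]
    parts += [0] * (3 - len(parts))  # pad '2.0' to (2, 0, 0)
    return tuple(parts[:3])


def upgrade_notes(current):
    """List the caveats from these notes that apply when upgrading to 2.1."""
    version = parse_version(current)
    notes = []
    if version < (2, 0, 1):
        notes.append(
            "parallel DDL: no direct rolling upgrade; stop the cluster, "
            "or first do a rolling upgrade to 2.0.1 or a later 2.0.x"
        )
    if version[0] == 1:
        notes.append(
            "raft learner: stop the cluster, or upgrade TiKV before PD"
        )
    if version < (2, 0, 6):
        notes.append(
            "check for ongoing DDL operations, especially Add Index"
        )
    return notes
```

For a cluster on 2.0.3, only the DDL check applies; for a 1.x cluster, all three caveats are returned. The rollback restriction applies to every upgrade and is therefore not modeled as a condition.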