About the author: Jasonys, affiliated to the Data Platform Department of Tencent Technology Engineering Business Group, is responsible for the technical research and development and architecture design of TBase data. With more than 10 years of experience in database kernel development and design, jasonys has completed the architecture design and development of various databases.

In 2017, PGXZ changed its name to TBase and officially released TBase V1 in the form of press conference. Through the efforts of the team members, TBase V1 has so far been used by more than a dozen customers in the external market of the company, including government affairs, public security, fire fighting, telecommunications, finance and other industries. TBase is widely recognized by customers for its powerful function, stable operation, and powerful Internet gene. In the contact with external customers, I have heard a lot of praise, and there are also many expectations that TBase can solve more pain points of customers.

In 2017, the TBase team planned TBase V2 on the basis of PostgreSQL’s latest version V10.0 after a long time of planning and discussion. They hoped to represent the latest and strongest database technology of Tencent to meet the needs of customers. Solve the pain points in customer business operation.

In the following 8 months, Shuping partners worked day and night to finally complete the development of TBase V2 core. This version of which has accumulated hundreds of days and nights of hard work of Shuping partners will finally meet you. I am very glad to introduce the core concepts and architecture of TBase V2 to students here.

TBase V2 Core concepts

Important technical features and concepts of TBase V2 mainly include the following aspects:


Enterprise: Enterprise features include:

  • User friendly transaction characteristics: business does not need to pay attention to the transaction characteristics of the database, the database kernel supports a complete distributed transaction, to ensure the ACID of transactions.
  • User friendly database features: primary keys, foreign keys, sequences, constraints, partitioned tables, stored procedures, triggers, subqueries and other enterprise-class features complete support.
  • User friendly SQL interface: TBase V2 is compatible with SQL2003 standard and common ORACLE syntax, which facilitates the migration of ORACLE deep users. Currently, there are cases of ORACLE migration outside.
  • User friendly distributed query ability: good distributed query support ability, database kernel can efficiently process distributed JOIN.

NewSQL database: Compared with traditional database products, TBase can achieve efficient online linear expansion and does not affect service running when the cluster scale changes.

HTAP capabilities: Hybrid Transactional/Analytical Processing, a Hybrid Transactional and Analytical Processing technology that requires two business types with conflicting resource demands to be processed in the same database. TBase V2 is specially designed to achieve the perfect HTAP, with efficient OLAP capability and massive OLTP processing capacity.

The following is the benchmark test result of TPCC, the transaction test model we tested. The system completed more than 3.1 million transactions per minute, and the test of larger cluster is still in progress. According to the current architecture design, the transaction throughput of the system will increase quasi-linearly with the cluster size if the hardware allows:


The following chart shows TBase in row storage mode versus the industry MPP data warehouse benchmark in OLAP test collection TPCH 1Tbenchmark:


The OLAP analysis ability of TBase can be intuitively seen from this figure. The time consumption of each of TBase’s 22 use cases is better than that of our rival products, and the time consumption of some use cases is significantly higher than that of our rival products.

With HTAP technology, businesses can process BOTH OLTP-class transactions and OLAP class analysis in a single TBase cluster. HTAP can greatly reduce the complexity of service systems and operation and maintenance costs.

High data security:

In the process of communication with customers, customers in various industries have raised the demand of data security, TBase team combined with customer needs and the industry’s advanced database security solutions to design TBase V2 data security system. This system mainly includes the following aspects:

  • Separation of three powers: the role of database system DBA is decomposed into three mutually independent roles, security administrator, audit administrator and data administrator, which restrict each other, eliminate the god authority in the system, and solve the data security problem from the system role design.
  • Mandatory safety rules: Combined with the advanced database security solutions, TBase V2 solutions proposed mandatory safety rules, through the force of the development of the security administrator safety rules, but also do row-level visible and column level, thus limit users see the data, for different user privilege do hybrid control, effectively prevent unauthorized to view data, Ensure the security of critical data.
  • Transparent data desensitization management: for financial, security and other industries with special requirements on data security, there are often demands for data desensitization, but many existing solutions require business participation, requiring in-depth business participation, with a certain threshold. TBase has specially designed for this pain point to achieve transparent desensitization of services. Businesses only need to design business logic according to their own business rules combined with TBase’s desensitization syntax. TBase internal data desensitization can be done, combined with the above mentioned mandatory security rules, security administrators can specify data desensitization for users, ultimately achieve high security level users see non-desensitized data, low security level users see desensitized data.
  • Audit ability: in the process of communication with customers, many customers would talk about the database audit, TBase V2 in the process of design, based on the industry benchmark audit standard design audit system, in the kernel to realize the core functions of the audit, do the audit granularity of both preciseness and can guarantee the performance of the system. At the same time, special solutions are designed for some problems encountered in the business to achieve real-time notification of audit results.

Multi-tenant capability:

TBase provides multi-tenant capabilities at both the cluster level and the cluster user level. The multi-tenant capability at the cluster level helps businesses quickly establish a database private cloud and customers quickly provide TBase-based DRDS services. The multi-tenant capability architecture at the cluster level is as follows:


In addition, TBase database cluster also provides a multi-tenant solution based on node group within the cluster to isolate services and resources within the database cluster. Multiple services run in isolation within TBase. In the following figure, APP1, APP2, and APP3 run in a database cluster and are isolated from each other by group.


TBase product architecture

The technical features of TBase V2 are generally introduced above. For those who are unfamiliar with TBase for the first time, the overall architecture of TBase is introduced here to facilitate their understanding of the following content. V1 and V2 are similar in overall architecture:


There are three node types in a cluster, each of which performs different functions and becomes a system through network connections. The three node types are as follows:

  • Coordinator: Coordinates nodes, provides external interfaces, and is responsible for data distribution and query planning. Multiple nodes are located equally and each node provides the same database attempt. CN stores only global metadata of the system, not actual service data.
  • Datanode: Processes and stores metadata related to this node. Each node also stores a data fragment. Functionally, THE DN node is responsible for executing the execution request distributed by the coordination node.
  • GTM: the Global transactionmanager (GTM), which manages the transaction information of the cluster and manages the Global objects of the cluster, such as sequences.

With the above architecture, TBase provides a database cluster with friendly interfaces. This database cluster architecture has the following advantages:

  • Write-scalable: It is possible to deploy multiple CNNS and issue writes to these nodes simultaneously.
  • Multi-master: Each CN node in the system can initiate write operations and provide unified, complete and consistent database views.
  • Synchronous data: For the service, a write on one CN node is immediately displayed on the other CN nodes;
  • Transparent data: Although index data exists in different DN nodes, when services query databases through CN, they can write SQL statements just like ordinary databases. There is no need to care about the specific node where the data is located. The TBase database kernel automatically completes the scheduling execution of SQL and ensures transaction characteristics.

TBase V2 features

V2 OLAP capability improvement:

When it comes to the improvement of OLAP capability, the differences between TBaseV1 and V2 in processing OLAP request are firstly discussed. The differences between V1 and V2 in processing OLAP request mainly include execution mode and whether DN nodes communicate with each other:

Execution mode: When executing OLAP requests in V1, CN sends SQL statements to DN. DN plans and executes SQL statements and reports the results to CN. CN summarizes the results. In V2, CN collects cluster statistics, plans a cluster-level distributed query plan for OLAP queries, and sends the query plan to each DN for execution. That is, CN delivers the execution plan, and THE DN is only responsible for execution.

Whether to exchange data between DNS: In V1, there is no communication channel between DNS and data cannot be exchanged. V2 version establishes an efficient data exchange channel between DN nodes, which can efficiently exchange data between DN nodes.

The differences are shown in the figure below:


On the basis of V2 OLAP framework, we developed a complete and efficient multi-threaded data transmission mechanism. When running OLAP query, this framework ensures that data can be efficiently synchronized between nodes, greatly improving THE EFFICIENCY of OLAP processing.

At the algorithm level, based on the multi-core parallel execution capability of PG10, we systematically redesigned the commonly used JOIN and AGGRATE algorithms in the cluster environment, which gave full play to the performance of the existing hardware. At the same cluster scale, OLAP Benchmark TPCH 1T was tested, and the row storage model was used. The average performance is 2 to 5 times higher than the industry benchmark.

OLTP capability optimization and improvement:

GTM is a module that deals with transaction information in TBase cluster. Its processing capacity directly determines the transaction throughput of the system. Moreover, GTM is the only single point in the system, and its processing ceiling directly affects the ceiling of the system’s processing capacity.

We specifically optimized and designed GTM for this purpose. It mainly focuses on the following four aspects:

  • Network bandwidth optimization: cancel the cluster snapshot of the system and replace it with a logical clock to judge the cluster visibility of transactions, which greatly reduces the occupation of NETWORK bandwidth for GTM and CPU usage for GTM.
  • The OPTIMIZATION of CPU usage greatly reduces the thread data of GTM, reduces the CPU usage of system scheduling and greatly improves the processing efficiency of GTM through the way of thread resource reuse.
  • Optimization of system lock: when the system throughput reaches millions, the system mutex used by GTM originally occupies most OF the CPU. We write the mutex in user mode, which makes the CPU usage only one tenth of the original and improves the upper limit of the system’s processing capacity.
  • Lock-free queues are used to replace the original queues with locks, reducing the use of locks and greatly improving the processing efficiency of the system.

In addition, we also propose a patented distributed transaction consistency technology to ensure transaction consistency in a fully distributed environment. Through the above optimization, the TPCC transaction processing capacity of TBase cluster is greatly improved, and the processing capacity will increase quasi-linearly with the cluster size. The following is the TPCC test result under the scale of 60 clusters. The maximum throughput reaches 310W per minute. At this time, the system DN and CN resources are tight, but there are still quite a lot of GTM resources.


TBase V2 HTAP processing capacity:

Before we talk about TBase’s HTAP capabilities, let’s analyze the HTAP capabilities of the current mainstream distributed database architectures on the market. We won’t talk about all-in-one solutions such as Exadata and HANA.

First of all, sharding distributed architecture is common in Internet companies:


This architecture uses middleware to provide unified database access interface through physical sub-database sub-table to achieve the effect of distributed database. This architecture has certain competitiveness in dealing with simple SQL requests, but it is often unable to deal with complex SQL such as complex distributed join and subquery, so it is not good at HTAP business processing.

The second architecture, the classic MPP architecture, is PivotalGreenplum, which has only one Master node and uses the Master data node to provide query services. This architecture is derived from OLAP. Due to the single Master problem, the system’s processing capacity is limited by the Master’s. Moreover, the minimum granularity of system lock is table-level, which directly affects the transaction processing ability. This architecture is only suitable for processing OLAP services, not OLTP services.


The architecture of ShareNothing is described above, and the architecture of Share Everything is analyzed as follows:


Typical products are Sybase IQ and Oracle RAC. Sybase IQ, as a classic data warehouse product, was once popular. Many concepts and solutions of data warehouse can be found in Sybase IQ. Since Sybase IQ was designed as a data warehouse product at the beginning of the design, transaction requests are not considered in the architecture, and only OLAP requests can be processed.

For ORACLE RAC, as a popular database product, it has a good performance in handling OLAP and OLTP requests, but it is complained by many customers because of the expensive price of the hardware and the complex and lengthy process of expansion.

After analyzing the solutions in the industry, we basically came to a conclusion that we need a HTAP distributed solution that can efficiently process OLTP and OLAP services at the same time, and give consideration to ease of use and low cost. The existing solution is difficult to meet our needs. In addition to the basic ability and there is a need to pay attention to the problem is, OLTP class request on time delay and throughput, delay and OLAP attention, because of concerns in different resource using the model of the two completely different, therefore how to within the same cluster effectively deal with these two kinds of business and at the same time very good resource isolation become a thorny problem.

The TBase team carefully designed the TBase HTAP solution after comprehensively considering the above factors. The overall architecture is as follows:


TBase divides HTAP into two scenarios:

In CASE 1, OLAP and OLTP access different service data. In this scenario, we can use TBase’s group isolation technology to enable TBase to use efficient OLAP and massive OLTP capabilities respectively on the basis of natural physical isolation.

In CASE 2, WHEN OLAP and OLTP access the same data, both OLTP and OLAP operations must be performed on the same data and both operations must be efficient. To achieve resource isolation, TBase uses THE DN host to run OLTP services and uses special OLAP standby DN nodes to run OLAP services to achieve natural resource isolation.

TBase security architecture

TBase team in the process of communication with customers, customers in many industries have raised demands for data security, TBase for business pain points and combined with the database industry leading database concept design TBase security system.


TBase data security system is based on the separation of the three rights of database, which divides the authority of traditional DBA into security administrator, audit administrator and data administrator. In the field of security rules, security rules and data transparency desensitization rules are added for security administrators. In the field of audit, object audit, user audit, and fine-grained audit are added according to industry audit standards and business scenarios. In addition, the data administrator performs the data management and database operation and maintenance functions of the former DBA.

The following is an introduction to these parts.

Security Architecture –TBase Separation of Powers:

The security system of TBase is divided into the following levels. Firstly, the system roles are decomposed, including the much-criticized DBA super authority, which is divided into security administrator, audit administrator and data administrator. This practice is called “separation of three powers” in the industry, as shown in the figure below:


These three roles correspond to the separation of powers in the U.S. political system:

Security Administrator (Legislative) :

• Enforce secure access policies.

• Enforce secure access policies to be done independently by security officers.

• All users of the system must comply with mandatory security access policies.

Audit Administrator (Jurisdiction) :

• All system operations are recorded, including those of security officers and administrators.

• Audit policies are independently formulated by the audit administrator.

• The auditor’s own operations are also forcibly recorded and cannot be modified.

Data Manager (Executive Authority) :

• Have independent access control.

• Do not interfere with the actions of auditors and safety officers

The division of the three roles fundamentally eliminates security dead spots in the system. The security administrator is responsible for formulating overall security rules to restrict all roles in the system. The audit administrator is responsible for specifying audit rules and all roles in the audit system, including the audit administrator, so that all operations in the system can be traced. The data manager is responsible for the daily operation and maintenance of the database. The three mutual restraint, mutual supervision.

TBase security architecture – Mandatory access rules:

The mandatory access rule sounds a bit abstract. Here is a concrete example of a company’s employee information:


For the chairman of the company, the safety rule we gave him was to see all the data in the system, so he saw this table that looked like this, complete and unprocessed:


Security rules stipulate that the general manager of the Engineering department can only see the data related to the engineering department’s own employees, because the salary and employee private information are restricted to him due to the high security level data, so the data he sees in this table is like this:


For Zhang SAN, an employee of the engineering department, the system restricts him to see only his own data through access rules, so he can see the data in the system through select like this:


Here Joe can see his own salary data.

TBase provides complete security rules SQL commands to allow security administrators to easily define security rules through which data column and column mixed access control can be achieved.

TBase Security Architecture – Transparent data encryption and desensitization

There are often customers will give us problems related to safety is a problem: data services hosted on a service provider, there some column in the table to see don’t want to let the content of the service provider, but the service provider to their business operations and need to know the table structure, ask we have very good technical means to solve this problem. Another common problem is how to ensure the security of database files if they are dragged away. In view of this kind of demand, we designed TBase transparent encryption and desensitization system to ensure the security of user data.

According to the requirements of these data security classes, TBase designs transparent encryption and transparent desensitization of data. Transparent encryption and transparent desensitization can be used separately or in combination. The differences and principles are as follows:


Transparent encryption: It is used to prevent data leakage after data file leakage. Therefore, encrypted data is stored in files at the storage layer. For upper-layer services to access data, services can read plaintext data.

Transparent desensitization: Transparent desensitization is part of the mandatory access rules, so desensitization rules are established for specific database roles. Authorized users can read and write data in database tables normally, while unauthorized users can only see the result after data desensitization, which effectively prevents unauthorized access.

Transparent encryption + Transparent desensitization: In this mode, ciphertext is stored on disks to prevent data leakage after data files are leaked. In addition, security rules are enforced to prevent unauthorized data access, ensuring data security at both the storage layer and the service layer.

As an example, if the transparent desensitization security rules restrict the salary, personal information, home address and age of four columns to transparently desensitize the data of DBA, then when the DBA queries this table, the data will look like this:


As you can see from the figure, the desensitized data is displayed as the default value of the system. The DBA cannot see the value of this column, and there is no other way to test the true value of this column.

TBase Security Architecture – Audit:

Based on the industry’s advanced solutions and practices, TBase audits are divided into the following categories:

  • SQL audit: Audits specific SQL statements, regardless of specific objects. Only for specific SQL types. The syntax is as follows:

Sql_statement_shortcut is the content of SQL audit. For details, ALL contains the audit entries in the following table, which are basically DDL statements. Below is the list of audit statements.

  • Object audit: Audits the specified SQL type of a specified object, that is, details the audit on the specified object. The action is shown below.

TBase audit uses a unique design architecture, audit operation process, the system performance is very small. Fine-grained audit is flexible. You can set filtering conditions for data. If the data that meets the filtering conditions is accessed, the audit actions are triggered.

TBase audit uses a unique design architecture, audit operation process, the system performance is very small.

TBase V2d multi-tenant capability

As an enterprise-level distributed database, TBase also provides multi-tenant capabilities that enterprises often use, allowing customers to run multiple services in a database environment. TBase is responsible for the isolation of multiple tenants, so that services do not affect each other.

The overall multi-tenant architecture of TBase is as follows:


TBase multi-tenant management is divided into three layers. The resource management layer at the bottom manages basic physical machines, slices, pools and isolates resources, and allocates and releases resources from the upper layer.

The second layer is the tenant management layer, which is responsible for the permission management of tenants. Each tenant has a complete system of separation of rights. Each tenant can correspond to multiple projects, and each project corresponds to a cluster. Each tenant can only see its own associated projects and clusters.

The top layer is system management. This layer is responsible for creating tenants and clusters and managing resource allocation and release of the entire platform. This layer can view the status information of tenants and clusters of the entire platform and physical machines.

In addition to the cluster-level multi-tenant provided by the system, intra-cluster multi-tenant solution capabilities based on node groups are provided within a single TBase cluster. For example:


Within a database cluster, three APP services use different node groups and separate CN to achieve resource isolation, which perfectly realizes multi-tenant within the cluster.

With the preceding two solutions, services can quickly deploy services in a multi-tenant environment based on their requirements.

TBase V2 Elastic capacity expansion online:

For a distributed system, elastic capacity expansion is a rigid demand, and TBase is not vague in this area. In TBase V1, SHARdMap and Shard table are introduced, and TBase can provide linear capacity expansion online through shard table.


In the V1 kernel, Shard records are stored in the order of allocation, and shardid records may not be stored continuously. In order to complete the expansion process, a lot of adaptation is made to the bottom layer, resulting in a lengthy process.

V2 kernel introduces the concept of Shard clustering in the storage layer, that is, the same records of Shardid are continuously stored in the bottom layer, which greatly simplifies the design of upper-layer business processes and greatly improves the efficiency of capacity expansion.


TBase V2 partition table:

In TBaseV1, we introduced clustered partitioned tables, which perform 1-3 orders of magnitude better than community partitioned tables in OLTP scenarios, especially when there are a large number of child tables. The overall structure is as follows:


Community performance comparison:


In this architecture:

Coordinator: A Coordinator is responsible for vertical table partitioning but is not aware of the logic of table partitioning (horizontal table partitioning).

DataNode: Divides a logical table into multiple physical tables based on the sub-table fields.

TBaseV2 also has this excellent built-in partitioned table feature. However, V2 also inherits the community’s RANGE and LIST partitions from PG 10.0, and with the addition of these two partition types, businesses have a richer choice when creating partition tables.

Other new features of TBase V2:

  • Hash indexes support Crash Safe:

PG10 officially introduces the Hashindex XLOG mechanism IN this release, which means that we can safely use Hash indexes and have a better alternative to Btree for equivalent queries and IN operations.

  • Efficient expression manipulation:

In PG10, in order to support JIT, the expression calculation method is changed from the traditional executor to expansion execution, and the expression execution efficiency is greatly improved compared to before. The team did a special comparison. We made a JIT DEMO of the typical expression operation and compared the performance of PG10 kernel expression expansion, and found that PG10 test results were not worse than the performance of JIT-generated programs.

  • Multi-core scalability enhancement:

Transaction scalability: PG has introduced several OLTP transaction pre-enhancement features since 9.6, including XLOG parallel drop disk optimization and system snapshot mechanism optimization. According to our team’s tests, transaction throughput can achieve quasi-linear improvement from 24-core servers to 96-core servers. That is to say, under TBase V2, whether it is a common server with 24 cores or a high-end server with more cores, TBase can give full play to the potential of the equipment and make full use of resources.

Multi-core parallel execution: THE parallel execution framework is introduced in PG9.6, and the parallel optimization of aggregate, Hash Join, Merge, Seqscan, bitmap Scan and other operators can be achieved in PG10. Supported by these parallelism capabilities, TBase V2 is well suited for OLAP analysis of large amounts of data.