Review, starting with best practices
As an HTAP database, TiDB can serve both OLTP online workloads and OLAP analytical workloads from clients. For analytical requests, the optimizer automatically routes them to TiFlash nodes; for online requests, it routes them to TiKV nodes. For an HTAP database, the biggest concern is whether resource-hungry analytical queries will affect the online OLTP service. To solve this problem, TiDB isolates TiKV and TiFlash at the physical layer.
Reflection, pain points of the best-practice structure
Resource isolation addresses the interference between workloads. However, to build more flexible and efficient applications, the architecture above still has some problems:
- HAProxy does not yet provide high availability
- On the TiDB Server side, TP services are not isolated from AP services
For these two problems, we can adopt the following two solutions to avoid the risks.
High availability solution of HAProxy
In a production environment, we generally do not use HAProxy alone for DNS round-robin. Any non-highly-available single point will eventually make the whole system unavailable. HAProxy itself is a stateless service, and for stateless services we can run multiple instances to avoid the availability risk of a single node. On top of HAProxy, a VIP can be floated to an available node by Keepalived probe scripts, which gives us a single-entry, highly available structure.
Isolation scheme between TP and AP
In the HTAP scenario, we have already achieved true isolation at the storage layer by physically storing data in TiKV and TiFlash, which separates OLTP and OLAP query requests. At the computing layer, engine isolation can be achieved by setting the isolation-read parameter at the TiDB instance level. Configuring the isolation-read variable specifies that all queries on that instance use replicas of the specified engines, which can be "TiKV", "TiDB", or "TiFlash" ("TiDB" stands for the memory table area inside TiDB, which mainly stores some TiDB system tables and is not intended for direct use by users).
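As a minimal sketch (assuming TiDB v4.0 or later): the instance-level restriction is the isolation-read.engines item in the TiDB configuration file, and the session-scope system variable tidb_isolation_read_engines is its counterpart. The variable name and scope are taken from the TiDB versions I am familiar with, so treat this as an illustration rather than the article's exact setup:

```sql
-- On a TP-dedicated TiDB Server, allow only TiKV (plus the internal "tidb" engine):
SET @@session.tidb_isolation_read_engines = 'tikv,tidb';

-- On an AP-dedicated TiDB Server, allow only TiFlash:
SET @@session.tidb_isolation_read_engines = 'tiflash,tidb';

-- Verify the current setting:
SELECT @@session.tidb_isolation_read_engines;
```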
Regardless of whether HAProxy itself is highly available at the front end, HAProxy cannot see the isolation-read engine setting of each TiDB Server when it round-robins across endpoints. This can lead to an awkward situation: HAProxy may route an OLTP query to a node whose isolation-read is set to TiFlash, so the request is not handled optimally; conversely, an analytical query that we force to TiFlash with a hint may be routed to a node whose isolation-read is set to TiKV, and the SQL request throws an error. From a functional point of view, we need to restate our requirements for an HTAP database:
- Data at the storage layer is separated, so that OLTP and OLAP services do not affect each other
- Requests at the computing layer are separated, so that OLTP and OLAP requests do not affect each other
Change, requirements drive architectural transformation
Transformation based on HAProxy
To solve the routing problem of TiDB Server at the computing layer, we can use two sets of HAProxy to physically separate the TiDB Server cluster: one HAProxy cluster manages the TiDB Servers whose isolation-read is TiKV, and the other manages the TiDB Servers whose isolation-read is TiFlash. For availability, we still need to make each HAProxy cluster highly available, which leads to the following architecture:
From the perspective of the overall design, this architecture basically meets our needs: the TiDB Servers at the computing layer are physically isolated, and the front-end proxies are highly available. But it has drawbacks:
- The structure is complex; to guarantee high availability of the system, the cost in physical nodes is relatively high
- The proxy entry point is not unified: two sets of Keepalived are needed to maintain two VIPs, and the choice between them has to be coded into the business logic
If we adopt such an architecture and want to cut costs, we can co-locate nodes. We could deploy two Keepalived clusters on three machines and isolate them logically by virtual_router_id, or deploy a single Keepalived cluster that does not use Keepalived's own VIP definitions but instead runs two VRRP scripts, each maintaining its own VIP in its probe script. HAProxy can also be deployed on the Keepalived machines, giving a 2 * (3 * Keepalived + 3 * HAProxy) structure. While this refinement can compress the machine cost to roughly that of a normal cluster, it neither reduces the architectural complexity nor removes the fact that there are still two entry points.
Use ProxySQL to implement SQL routing
What we need now is a proxy that separates TP from AP. In terms of requirements, this matches MySQL read-write separation; more precisely, we need a tool for SQL routing. Those of you familiar with MySQL will know ProxySQL, an open source middleware product and a flexible MySQL proxy. As a powerful rule-engine middleware, ProxySQL provides a number of features:
- Flexible and powerful SQL routing rules that can intelligently load-balance SQL requests.
- A stateless service with convenient high availability management solutions.
- Automatic awareness of node health, quickly removing abnormal nodes.
- Convenient SQL monitoring and analysis statistics (see the sketch after this list).
- A configuration library stored in SQLite that can be modified online and loaded dynamically.
- More flexible statement caching than MySQL Query Cache, controllable in multiple dimensions in the configuration table.
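As a hedged illustration of the monitoring statistics, ProxySQL exposes a stats schema on its admin interface; the table and column names below are the ones I know from ProxySQL 2.x and are not shown in the original article:

```sql
-- Top SQL digests seen by ProxySQL, grouped by backend hostgroup
SELECT hostgroup, digest_text, count_star, sum_time
FROM stats_mysql_query_digest
ORDER BY sum_time DESC
LIMIT 10;
```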
In my opinion, ProxySQL is exactly as powerful as it needs to be: every feature hits a real pain point and meets a real need. If I had to name a drawback, it would be the performance loss introduced by routing, but that loss exists, and is often worse, in other proxy tools. Is TiDB, as a "large size" MySQL, a good fit for ProxySQL? The answer is yes. We can simply reuse ProxySQL's MySQL read-write separation scheme to route TP/AP SQL requests. All of the powerful features described above still apply to TiDB, which to some extent makes up for a gap in the TiDB ecosystem.
Full link high availability
For a database system, any link in the chain can become a point of failure, so no service should exist as a single point. Every component of a TiDB cluster is highly available and scalable, and ProxySQL can also be configured for high availability. There are two main popular high availability solutions for ProxySQL:
- Multiple independent ProxySQL instances
- ProxySQL's native high availability cluster
ProxySQL itself is a stateless service, so placing multiple independent ProxySQL instances at the front already provides a degree of availability. However, because the instances are independent, their configurations are not interconnected: a change to any configuration is not synchronized automatically, which is a management risk. If we use ProxySQL's native clustering for high availability, the watchdog processes that maintain cluster state may themselves add load to the cluster.
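For reference, a minimal sketch of ProxySQL's native clustering, run on each node's admin interface; the proxysql_servers table and the admin-cluster_* variables are the mechanism I know from ProxySQL 1.4/2.x, so treat the exact names as assumptions rather than the article's configuration:

```sql
-- Tell every node about all cluster members (admin port 6032)
INSERT INTO proxysql_servers (hostname, port, weight, comment)
VALUES ('192.168.232.31', 6032, 1, 'proxysql-1'),
       ('192.168.232.32', 6032, 1, 'proxysql-2'),
       ('192.168.232.33', 6032, 1, 'proxysql-3');

-- Credentials the nodes use to pull configuration from each other
-- (this user must also be listed in admin-admin_credentials on every node)
UPDATE global_variables SET variable_value = 'cluster_user'
 WHERE variable_name = 'admin-cluster_username';
UPDATE global_variables SET variable_value = 'cluster_pass'
 WHERE variable_name = 'admin-cluster_password';

LOAD PROXYSQL SERVERS TO RUNTIME; SAVE PROXYSQL SERVERS TO DISK;
LOAD ADMIN VARIABLES TO RUNTIME;  SAVE ADMIN VARIABLES TO DISK;
```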
As mentioned earlier, we want a single entry point for the cluster, but multiple front-end ProxySQL instances mean multiple entry points. We can put Keepalived + HAProxy in front of them for single-entry load balancing, or, if we worry about the performance loss of stacking proxy layers (HAProxy in front of ProxySQL), we can maintain Keepalived health-check scripts ourselves to control the VIP. For companies with strict network monitoring where VRRP is disabled, ZooKeeper service registration and discovery can be used to maintain ProxySQL status and manage the VIPs.
Each company has its own solution for providing a unified, highly available entry point in front of multiple endpoints. Personally, I prefer using ZooKeeper to manage cluster state rather than Keepalived + HAProxy or load balancing inside Keepalived scripts. With Keepalived we have to plan the priority and weight scheme carefully, and to reduce the performance loss of HAProxy we may end up managing another set of VIPs in the scripts or shutting down the Keepalived service on failed nodes. Of course, the chosen solution has to fit our own technology stack; only what works for us is the best practice.
In the architecture above, TP and AP requests from the application both connect to the backend TiDB cluster through a single entry point: Keepalived's probe selects an available ProxySQL instance and creates a VIP on top of it, and this VIP serves as the application's only entry to the TiDB cluster. In the ProxySQL cluster, according to the TP and AP patterns configured in the routing table (mysql_query_rules), TP and AP query requests are automatically routed to TP_GROUP and AP_GROUP respectively. To sum up, such an architecture solves our previous pain points:
- The application uses a single entry point to the database cluster.
- The high availability architecture is simple: one set of Keepalived and one proxy cluster.
- TP/AP requests are automatically routed to the TiDB Servers on the corresponding compute nodes.
Practice, from the case to get results
A demo system was deployed to simply show the running process and results of the entire architecture.
The following is a list of components on the node:
Install TiDB
For details, please refer to the official documentation (TiDB Quick Start Guide).
Install ProxySQL
You can install ProxySQL directly from RPM. However, to keep the installation location consistent, I generally compile from source and then use rpmbuild to package it into an RPM for deployment to the other nodes. For compilation and installation, refer to the install.md documentation. Edit the proxysql.cfg file and change datadir to "/opt/tidb-c1/proxysql-6033/data". Start ProxySQL with the following command, or configure a systemd unit for it:
/opt/tidb-c1/proxysql-6033/proxysql -c /opt/tidb-c1/proxysql-6033/proxysql.cfg
Configure ProxySQL
Since I use three independent ProxySQL instances for high availability in this example, the same configuration has to be applied on all three machines. If you choose ProxySQL's native clustering instead, you only need to configure one machine.
[root@r31 proxysql-6033]# mysql -uadmin -padmin -h127.0.0.1 -P6032 --prompt 'admin>'

## set server info
insert into mysql_servers(hostgroup_id, hostname, port) values (10, '192.168.232.31', 14000);
insert into mysql_servers(hostgroup_id, hostname, port) values (10, '192.168.232.32', 14000);
insert into mysql_servers(hostgroup_id, hostname, port) values (10, '192.168.232.33', 14000);
insert into mysql_servers(hostgroup_id, hostname, port) values (20, '192.168.232.34', 14000);
insert into mysql_servers(hostgroup_id, hostname, port) values (20, '192.168.232.35', 14000);
load mysql servers to runtime;
save mysql servers to disk;

## set user
insert into mysql_users(username, password, default_hostgroup) values ('root', 'mysql', 10);
load mysql users to runtime;
save mysql users to disk;

## set monitoring user
set mysql-monitor_username='monitor';
set mysql-monitor_password='monitor';
load mysql variables to runtime;
save mysql variables to disk;

## set sql router rule
## this is just a demo
insert into mysql_query_rules(rule_id, active, match_digest, destination_hostgroup, apply)
values (1, 1, '^select.*tikv.*', 10, 1), (2, 1, '^select.*tiflash.*', 20, 1);
load mysql query rules to runtime;
save mysql query rules to disk;
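After loading the configuration, it may be worth sanity-checking the runtime tables and the monitor results from the admin interface. This is only a hedged sketch based on the standard ProxySQL admin schema, not output shown in the original article:

```sql
-- confirm the backends were actually loaded to runtime
SELECT hostgroup_id, hostname, port, status FROM runtime_mysql_servers;

-- confirm the routing rules are active at runtime
SELECT rule_id, active, match_digest, destination_hostgroup FROM runtime_mysql_query_rules;

-- confirm the monitor user can reach the TiDB Servers
SELECT hostname, port, connect_error
FROM monitor.mysql_server_connect_log
ORDER BY time_start_us DESC LIMIT 10;
```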
Configure Keepalived
Installing Keepalived is similar to installing ProxySQL: it is recommended to build Keepalived into an RPM package, or simply copy the Keepalived binary to the target nodes. The Keepalived configuration file is as follows:
global_defs {
    notification_email {
        [email protected]
        [email protected]
        [email protected]
    }
    vrrp_skip_check_adv_addr
    vrrp_strict
    vrrp_garp_interval 0
    vrrp_gna_interval 0
}

vrrp_script check_proxysql {
    script 'killall -0 proxysql || systemctl stop keepalived'
    interval 2
    weight 10
}

vrrp_script test_script {
    script 'echo `date` >> /tmp/aaa'
    interval 1
    weight 1
}

vrrp_instance proxysql_kp {
    state MASTER
    interface ens33
    virtual_router_id 51
    priority 150
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1888
    }
    virtual_ipaddress {
        192.168.232.88
    }
    track_script {
        check_proxysql
        ##test_script
    }
}
Verify ProxySQL
I enabled the general log on the five TiDB Servers to record SQL statements, and created two tables in the TiDB cluster (a setup sketch follows the list):
- test.t_tikv (idi int), with values from 1 to 1000
- test.t_tiflash (idi int), with values from 1 to 1000
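For reference, a hypothetical setup for these two tables; the article does not show the DDL, so the population step and the TiFlash replica statement below are my assumptions:

```sql
CREATE TABLE test.t_tikv (idi INT);
CREATE TABLE test.t_tiflash (idi INT);

-- populate idi with the values 1..1000 in both tables
-- (omitted here; a simple client-side loop is enough)

-- the AP nodes read through TiFlash, so t_tiflash needs a TiFlash replica
ALTER TABLE test.t_tiflash SET TIFLASH REPLICA 1;
```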
Run a simple loop at the front end as a stress test:
for i in `seq 1000`; do
    mysql -uroot -pmysql -h192.168.232.88 -P6033 -e "select * from test.t_tikv where idi = $i";
done

for i in `seq 1000`; do
    mysql -uroot -pmysql -h192.168.232.88 -P6033 -e "select * from test.t_tiflash where idi = $i";
done
Filter the TiDB Server logs for the keyword select * from test.t_tikv where idi = and count the matches. According to the TiKV rule configured in the routing table, the 1000 requests were spread across the tidb-1, tidb-2, and tidb-3 nodes.
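The same distribution can also be checked from the ProxySQL side; the stats table below is part of the standard ProxySQL admin schema and is an assumption of mine, not output shown in the article:

```sql
-- per-backend query counters: hostgroup 10 (TiKV nodes) should show ~1000 queries
-- spread across three servers, hostgroup 20 (TiFlash nodes) across two servers
SELECT hostgroup, srv_host, srv_port, Queries
FROM stats_mysql_connection_pool
ORDER BY hostgroup, srv_host;
```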
Easter egg, the audit feature you want
Database auditing monitors access behavior to the database and provides a basis for determining responsibility after a database security incident. Extracting audit logs into a real-time data warehouse for risk-control processing makes it possible to discover risks in time and recover losses to the greatest extent. Audit logs are essential in important financial and order-trading systems.
How do I capture the Audit log
Like many users, I have also run into the requirement for auditing. Many open source databases, MongoDB for example, do not provide auditing for free, yet the audit function is particularly important in many financial scenarios. There are usually two ways to implement auditing:
- Parsing semantics in the source code
- Traffic collection of the data
Parsing semantics in the source code really means adding the audit function by hand: we modify the source code to flush the variable information we want to capture to a local file. However, this introduces a lot of synchronous waiting and hurts database performance; performing the write asynchronously only mitigates the degradation slightly. The other approach, traffic collection, is somewhat better than dumping variables to disk. The idea is to set up a bypass system independent of the database: traffic is intercepted with packet-capture tools or probes, and the requests sent to the database are written to local files. This method is not coupled to the database itself and obtains the audit logs asynchronously.
Capture the Audit log in TiDB
At present, there are two main ways to audit TiDB: purchase the audit plugin provided by PingCAP, or enable the general log so that SQL statements can be seen in the TiDB log. Note that since we use HAProxy in front, we need to configure the forwardfor option to capture the client IP. The general log records all requests, including selects, in the TiDB log, which in past tests cost roughly 10%-20% of performance. The recorded SQL statements, together with information such as time and IP, may still not meet our needs, and extracting an audit trail from the TiDB log is itself a big project.
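As a hedged sketch: in the TiDB versions I am familiar with, the general log can be toggled at runtime with an instance-scope system variable; the exact variable name and scope below are assumptions, not something stated in the article:

```sql
-- enable the general log on this TiDB Server instance
-- (depending on the TiDB version, GLOBAL scope may be required)
SET tidb_general_log = 1;

-- ... run the workload to be audited ...

-- disable it again to avoid the 10%-20% overhead
SET tidb_general_log = 0;
```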
Obtain the audit log in ProxySQL
The requirement for auditing is very common, and like MongoDB, community versions of open source databases often do not provide it. Looking at the whole link, there are two places where a complete audit can be obtained: the database side and the proxy side. ProxySQL provides this capability: specifying the mysql-eventslog_filename parameter enables the logging function, and in the log file we can capture all SQL that passes through the ProxySQL entry point. In my environment, I can capture an audit log in the following format, which basically meets most user requirements:
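A minimal sketch of turning this on from the ProxySQL admin interface; mysql-eventslog_filename comes from the article, while the other two variables are ProxySQL 2.x settings I am assuming for completeness:

```sql
SET mysql-eventslog_filename = 'queries.log';   -- written under ProxySQL's datadir
SET mysql-eventslog_default_log = 1;            -- assumption: log queries even when no rule sets log=1
SET mysql-eventslog_format = 2;                 -- assumption: 2 = JSON, easier to ship to a data warehouse
LOAD MYSQL VARIABLES TO RUNTIME;
SAVE MYSQL VARIABLES TO DISK;
```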
Intercepting the audit with a probe
We can use SystemTap to build a probe and attach it to ProxySQL, keyed on functions whose names contain keywords such as run, execute, query, init, parse, mysql, or connection, and try to trace the call stacks and arguments of these functions. By printing these arguments we can obtain the client IP and the statement of the request being processed. If none of these functions yields the audit information, we can consider brute-force tracing all of ProxySQL's functions (matched by function("*")) and then narrowing down to the right function from the results, although this approach requires a more powerful server for development.
The audit log captured through the probe looks like this:
>>>>>>>>>>>>>>>>>>>[ function >> ZN10Query_Info25query_parser_command_typeEv ] [ time >> 1622953221 ] this={.QueryParserArgs={.buf="select ?", .digest=2164311325566300770, .digest_total=17115818073721422293, .digest_text="select ?", .first_comment=0x0, .query_prefix=0x0}, .sess=0x7f1961a3a300, .QueryPointer="select 1113 192.168.232.36", .start_time=2329486915, .end_time=2329486535, .mysql_stmt=0x0, .stmt_meta=0x0, .stmt_global_id=0, .stmt_info=0x0, .QueryLength=11, .MyComQueryCmd=54, .bool_is_select_NOT_for_update=0, .bool_is_select_NOT_for_update_computed=0, .have_affected_rows=0, .affected_rows=0, .rows_s ######
The QueryPointer field can be captured, and from it we can obtain the query text and the client IP of the user; the function name and timestamp are written directly by the SystemTap script.
Attaching a probe in this way reduces the log-write waiting on the proxy or the database, minimizing the impact on database performance; the performance loss caused by auditing can basically be ignored.