About the author
Cai Yueyi is a senior big data R&D manager at Ctrip Hotel, responsible for R&D of the hotel data intelligence platform and for big data technology innovation. He enjoys exploring open source big data frameworks.
1. Background
1) Ctrip's hotel business has thousands of tables, with more than one billion data updates in total every day. How do we keep production applications highly available while the data is being updated?
2) We receive nearly one million data query requests every day. Users can drill down from coarse-grained country/province/city summaries all the way to hotel- and room-type-level data, so we often cannot pre-aggregate the massive detail data one more level up; large volumes of key business data, hundreds of millions of rows, must be joined with permission and basic-information data and then summarized along different dimensions according to the user's scenario;
3) To give users instant results when they query data in the app or on PC, we needed to keep exploring and researching to find the most suitable technical framework.
For this we first tried relational databases, but with joins across tables at this scale it is almost impossible to get second-level responses. We considered sharding, but the data volume is large and the costs of every kind are high. We stored hot data in ElasticSearch, but it cannot join across indexes, which forces you to build wide tables; because permissions and hotel information change constantly, updating such large tables is impractical and maintenance is expensive. Redis key-value storage cannot do real-time aggregation. We also tested Presto, GreenPlum, and Kylin, but it was ClickHouse that finally made us stop searching and start exploring its ever-expanding range of usage scenarios.
2. Introduction to ClickHouse
ClickHouse is a columnar database management system for real-time analysis of big data, not a traditional row-oriented database. Through vectorized execution and use of the CPU's SIMD instruction sets, it processes massive data in parallel and thereby greatly speeds up data processing.
The main advantages are:
1) To use the CPU efficiently, data is not only stored by column but also processed by vector;
2) High data compression saves space and reduces IO; the throughput of a single query can reach billions of rows per second per server;
3) The index is not a B-tree structure and does not need to satisfy the leftmost-prefix principle, as long as the filter conditions are included in the index columns; even when the queried data is not covered by the index, ClickHouse's full table scans are fast thanks to its various parallel processing mechanisms (see the sketch after this list);
4) Write speed is very fast, 50-200 MB/s, which suits large volumes of data updates well;
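As a minimal sketch of what a table behind these properties might look like, here is a hypothetical MergeTree table created through ClickHouse's HTTP interface (the server address, table, and columns are placeholders, not the author's actual schema). The sorting key defined by ORDER BY is what the sparse primary index from point 3 is built on:

```python
import requests

CH_URL = "http://ch-server:8123/"  # placeholder ClickHouse HTTP endpoint

ddl = """
CREATE TABLE IF NOT EXISTS hotel_orders (
    order_date Date,
    hotel_id   UInt32,
    city       String,
    amount     Float64
) ENGINE = MergeTree()
PARTITION BY toYYYYMM(order_date)
ORDER BY (order_date, hotel_id)  -- the sparse primary index lives on this sorting key
"""

# The HTTP interface takes the statement as the POST body
requests.post(CH_URL, data=ddl).raise_for_status()
```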
ClickHouse is not a panacea; its speed comes at a price. Choosing ClickHouse requires weighing the following:
1) No transaction support and no true delete/update support;
2) It does not support high concurrency; the officially recommended QPS is 100. You can increase the number of connections in the configuration file, provided the server is good enough;
3) Its SQL covers more than 80% of everyday syntax needs, but the join syntax is unusual; the latest versions do support SQL-style joins, although their performance is poor;
4) Write in batches of more than 1,000 rows whenever possible and avoid row-by-row inserts or small-batch insert, update, and delete operations, because ClickHouse continuously merges data asynchronously in the background and small writes hurt query performance; keep this in mind for real-time writes (see the sketch after the next paragraph);
ClickHouse is fast partly because a single query uses half of the server's CPU cores, which is also why it does not support high concurrency. By default the number of CPU cores used by one query is half the server's core count; this parameter can be changed in the configuration file.
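A minimal sketch of the batch-write advice in point 4, using the third-party clickhouse-driver package (the host, table, and data are placeholders); each call ships a batch well above the 1,000-row minimum instead of inserting row by row:

```python
from datetime import date

from clickhouse_driver import Client  # third-party package: clickhouse-driver

client = Client(host="ch-server")  # placeholder host

# Stand-in data; in practice this would come from the upstream pipeline
rows = [(date(2021, 7, 1), 42, "Shanghai", 99.9)] * 5000

BATCH = 5000  # well above the 1,000-row minimum suggested above
for i in range(0, len(rows), BATCH):
    client.execute(
        "INSERT INTO hotel_orders (order_date, hotel_id, city, amount) VALUES",
        rows[i:i + BATCH],
    )
```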
3. ClickHouse in Practice on the Hotel Data Intelligence Platform
3.1 Data Update
Our primary data flow is from Hive to ClickHouse, and it currently takes one of two routes:
1) Hive to MySQL, then into ClickHouse
DataX did not initially support importing data from Hive to ClickHouse, so we used DataX to import the data into MySQL and then ClickHouse's native APIs to import it from MySQL into ClickHouse.
We designed a complete data import process to ensure that data flows from Hive to MySQL to ClickHouse automatically and stably, and that the data stays highly available to online applications during synchronization.
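The article does not spell out which native ClickHouse capability carries the MySQL-to-ClickHouse leg; one built-in option is ClickHouse's mysql() table function, sketched here with placeholder connection details and table names:

```python
import requests

CH_URL = "http://ch-server:8123/"  # placeholder ClickHouse HTTP endpoint

# Pull a snapshot straight out of MySQL into a ClickHouse staging table
sql = """
INSERT INTO hotel_orders_tmp
SELECT * FROM mysql('mysql-host:3306', 'hotel_db', 'orders', 'user', 'password')
"""
requests.post(CH_URL, data=sql).raise_for_status()
```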
2) Hive to ClickHouse
DataX now supports Hive to ClickHouse, and some of our data is imported directly through DataX. However, DataX only handles the import itself; because high availability must be guaranteed online, importing alone is not enough, and we still rely on the process above for the ReName step, incremental data updates, and so on.
For high availability of the data, we designed the update mechanism as follows:
3.1.1 Full Data Import Process
The full data import process is relatively simple: import the data into a temporary table first, and once the import is complete, ReName the official and temporary tables so that reads switch from the old data to the new data.
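A sketch of the swap step, under the assumption of a hotel_orders official table and a hotel_orders_tmp staging table (both names are placeholders). A single RENAME statement switches readers from the old data to the new data in one step:

```python
import requests

CH_URL = "http://ch-server:8123/"  # placeholder ClickHouse HTTP endpoint

def run(sql: str) -> None:
    requests.post(CH_URL, data=sql).raise_for_status()

# 1. Load the full snapshot into the staging table (the import itself is omitted)
run("TRUNCATE TABLE hotel_orders_tmp")
# ... full import into hotel_orders_tmp goes here ...

# 2. Swap the official and staging tables; renames inside one statement
#    are applied in order, so readers move to the new data in one step
run("""
RENAME TABLE hotel_orders     TO hotel_orders_old,
             hotel_orders_tmp TO hotel_orders,
             hotel_orders_old TO hotel_orders_tmp
""")
```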
3.1.2 Incremental Data Import Process
We have used two versions of the incremental data import process.
Because ClickHouse's DELETE operation is too heavy, the first version worked by deleting a specified partition and then importing the incremental data into the official table.
This approach had two problems. First, data accuracy cannot be guaranteed while the incremental data is being imported, and the more incremental data there is, the longer the data stays unavailable. Second, ClickHouse drops a partition asynchronously after receiving the delete command, and the completion time is unknown; if the drop actually executes after the incremental data has been imported, the incremental data is deleted along with it. The latest release notes say this problem has been fixed.
In response, we changed the incremental synchronization scheme: after the incremental data is synchronized from Hive into a ClickHouse temporary table, the rows of the official table that are not covered by the increment are written back into the temporary table, and the official and temporary tables are then swapped with ReName (see the sketch below).
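A sketch of this second-version flow, again with placeholder table names and hotel_id standing in for whatever the real business key is: the increment lands in the staging table, rows absent from the increment are written back from the official table, and the tables are then swapped:

```python
import requests

CH_URL = "http://ch-server:8123/"  # placeholder ClickHouse HTTP endpoint

def run(sql: str) -> None:
    requests.post(CH_URL, data=sql).raise_for_status()

# 1. Incremental data is synchronized from Hive into hotel_orders_tmp (omitted)

# 2. Write back all official rows whose key is NOT covered by the increment
run("""
INSERT INTO hotel_orders_tmp
SELECT * FROM hotel_orders
WHERE hotel_id NOT IN (SELECT hotel_id FROM hotel_orders_tmp)
""")

# 3. Swap the official and staging tables, exactly as in the full import
run("""
RENAME TABLE hotel_orders     TO hotel_orders_old,
             hotel_orders_tmp TO hotel_orders,
             hotel_orders_old TO hotel_orders_tmp
""")
```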
With this process, users are essentially unaware of the data import happening underneath.
3.2 Monitoring and Alerting During Data Import
Because the data volume is large, data synchronization statements often time out. To ensure that every step of the synchronization process is monitored, we do not execute synchronization statements through the JDBC driver provided by ClickHouse; instead, all of them are executed through ClickHouse's RESTful API.
When calling the RESTful API, you can specify a QueryID for the query. If a data synchronization statement times out, we poll for the execution progress of its QueryID, which keeps the whole query process running in order (see the sketch below). During polling, the job records any exceptions, and if their frequency exceeds a threshold it sends alarm SMS messages to the people responsible.
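A minimal sketch of that pattern (the endpoint and timings are placeholders): the synchronization statement is tagged with our own QueryID through the HTTP interface, and if the HTTP call times out we poll system.processes for that QueryID until it is gone:

```python
import time
import uuid

import requests

CH_URL = "http://ch-server:8123/"  # placeholder ClickHouse HTTP endpoint

def run_tracked(sql: str) -> None:
    query_id = str(uuid.uuid4())
    try:
        # Tag the statement so it can be found again on the server
        requests.post(CH_URL, params={"query_id": query_id}, data=sql, timeout=60)
    except requests.exceptions.Timeout:
        # The HTTP call gave up, but the statement keeps running server-side:
        # poll system.processes until the QueryID disappears
        probe = f"SELECT count() FROM system.processes WHERE query_id = '{query_id}'"
        while requests.post(CH_URL, data=probe).text.strip() != "0":
            time.sleep(10)
```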
3.3 Server Distribution, Operation and Maintenance
At present we run four clusters, divided by scenario into domestic, overseas/supplier, real-time data, and risk-control data. Each cluster has two to three servers that act as primary and standby for each other, and our program distributes query requests across the servers for load balancing.
If a server fails, we can modify that cluster's server nodes on a configuration page so that the cluster's requests are no longer sent to the faulty machine. If the query volume for certain data spikes at some point, we can form a virtual cluster and spread all the requests across other physical clusters that have spare capacity.
In the second half of this year we plan to place the two machines of each cluster in different machine rooms, which preserves the existing primary/standby, load-balancing and disaster-recovery roles. At the same time, to guarantee high availability of online applications, we will implement an automatic health-check mechanism that pulls servers with sudden anomalies out of our virtual clusters automatically (sketched after the next paragraph).
We also monitor each server's daily query volume, per-statement execution time, and CPU and memory metrics, so that requests with heavy query volume can be promptly moved from one server to another.
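A minimal sketch of the kind of automatic health check described above, using ClickHouse's built-in /ping endpoint (the node addresses are placeholders); nodes that stop answering are simply left out of the rotation:

```python
import requests

# The nodes of one logical cluster (placeholder addresses)
NODES = ["http://ch-a:8123", "http://ch-b:8123"]

def healthy_nodes() -> list:
    """Return the nodes that still answer ClickHouse's /ping endpoint."""
    alive = []
    for node in NODES:
        try:
            r = requests.get(node + "/ping", timeout=2)
            if r.ok and r.text.strip() == "Ok.":
                alive.append(node)
        except requests.RequestException:
            pass  # unreachable: leave it out of the rotation for now
    return alive

# Query requests would then be balanced across healthy_nodes()
```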
4. Exploring ClickHouse Usage
We have run into a variety of problems while using ClickHouse; they are summarized below for your reference.
1) Disable Linux virtual memory. Once, when the ClickHouse server ran out of memory and we killed the query that occupied the most memory, the server did not recover as expected and all queries kept running very slowly. The server metrics showed abnormal virtual memory usage: heavy data exchange between physical and virtual memory was making queries extremely slow. After we turned off virtual memory and restarted the service, the application recovered.
2) Add the join_use_nulls setting for every account. ClickHouse's SQL syntax is non-standard: by default, if a row in the left table has no match in the right table, the right table's columns return the default value of each column's data type rather than the Null of standard SQL. For those of us used to standard SQL, this return value is often confusing (see the sketch after this list).
3) Make sure the table with less data sits on the right side of a JOIN. Whether it is a Left JOIN, Right JOIN, or Inner JOIN, ClickHouse takes each record of the right table and checks whether it exists in the left table, so the right table must be the small one (also shown in the sketch after this list).
4) When writing data in batches through ClickHouse's official JDBC driver, you must control the number of partitions each batch touches. It is best to sort the data with ORDER BY before writing; unordered data, or data spanning too many partitions, can prevent ClickHouse from merging the newly imported data in time, which hurts query performance.
5) Minimize the amount of data on both sides of a JOIN; if necessary, pre-aggregate one of the tables to reduce its row count. In some cases, GROUP BY before the JOIN takes less time than JOIN followed by GROUP BY.
6) ClickHouse versions iterate quickly, so we recommend using a stable version from about a year back. We have hit bugs in newer versions, including memory leaks, syntax incompatibilities that do not raise errors, and concurrency-related configuration changes that do not take effect.
7) Avoid distributed tables. ClickHouse's distributed tables are not as cost-effective as physical tables. Also, do not create tables whose partitioning field has too many distinct values.
8) Server CPU usually hovers around 50%; at around 70%, large numbers of queries time out, so CPU is ClickHouse's most critical metric. We monitor all ClickHouse queries internally and send email alerts when they fluctuate.
9) Query test cases: joining 60 million rows with 10 million rows and then with another 20 million rows while summing one month of room-night volume; and joining 240 million rows of one month's data with 20 million rows and doing a GROUP BY, which takes about 390 ms. ClickHouse is not all-powerful, though: queries need continual tuning depending on the query conditions and on whether the table in the condition sits on the left or the right side of the join.
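A sketch combining points 2 and 3 above (the server, tables, and columns are placeholders): join_use_nulls is enabled for the query so unmatched right-side columns come back as NULL, and the small dimension table sits on the right of the join:

```python
import requests

CH_URL = "http://ch-server:8123/"  # placeholder ClickHouse HTTP endpoint

sql = """
SELECT o.hotel_id, h.hotel_name, sum(o.amount) AS total
FROM hotel_orders AS o        -- big fact table stays on the left
LEFT JOIN hotel_info AS h     -- small dimension table on the right
    ON o.hotel_id = h.hotel_id
GROUP BY o.hotel_id, h.hotel_name
SETTINGS join_use_nulls = 1   -- NULL instead of type defaults on misses
"""
print(requests.post(CH_URL, data=sql).text)
```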
5. Summary
Since going live last July, ClickHouse has taken on more than 80 percent of the hotel business. It handles more than one billion data updates and nearly one million data queries every day, with 98.3% of app queries returning results within 1 second and 98.5% of PC queries returning within 3 seconds.
In terms of usage, its query performance is something relational databases cannot match, at a cost far below that of a relational database. A single machine supports queries over more than 4 billion rows without strain. Compared with ElasticSearch and Redis, ClickHouse fits most of our usage scenarios.
We will continue to dig deeper into ClickHouse and keep up with its latest releases, while also keeping an eye out for better open source frameworks and trying those that suit us more.