This article is shared from the Huawei Cloud community post "Programmer Core Evaluation: A Comprehensive Evaluation of GaussDB(for Redis) and Open Source Redis", written by the official Gauss Redis blog.
As the digital transformation of enterprises accelerates, business demand is growing rapidly, and data volume and concurrent access are growing exponentially along with it. Traditional relational databases struggle to cope with data at this scale. In a relational database, the table structure must be defined when the table is created, and adding or removing fields dynamically has a large impact on performance. Constrained by the principle of locality, such a database fares badly with big data streams: many of the fields in a newly created table are empty, and storing these sparse, mostly empty tables drives storage cost up sharply. This leaves plenty of room for NoSQL databases to grow, and Redis, as a leader among NoSQL databases, is highly sought after in the industry.
Redis 5.0 provides nine data types (String, List, Set, ZSet, Hash, Bit Array, HyperLogLog, Geo, and Streams) together with operations on them, giving developers more choice in how to express data and the relationships within it. However, open source Redis is not really a database in the strict sense, and it has several weaknesses:
- Limited application scenarios. It provides only weak consistency, so it can only be used at the cache layer.
- High cost of use. Because open source Redis uses memory as its main storage, the high price of memory makes a Redis cluster very expensive to build.
- Blocking and jitter. Regular users of open source Redis will know that blocking and jitter often occur during operations such as large key replacement and capacity expansion or reduction.
Therefore, in this article we choose GaussDB(for Redis), which has a good reputation in the industry, for a comparative evaluation, in order to recommend a genuinely high-quality and inexpensive Redis to readers. Let's go through it in detail:
1. Stable and reliable
When a cache is in use, the scenario users fear most is a drop in Redis performance, for whatever reason, that shifts the pressure of business traffic directly onto the core database and ultimately causes an avalanche across the whole system. Operational stability is therefore the primary consideration when selecting a cache database. We chose three scenarios commonly encountered in daily work to compare the overall experience of using open source Redis and GaussDB(for Redis).
1.1. Disaster Recovery Switchover Scenario
First, we forcibly shut down a node on the console to simulate a master node failure in open source Redis and a single-node failure in GaussDB(for Redis). GaussDB(for Redis) uses a load-balancing scheme in which multiple nodes work at the same time, so it has no concept of a master node as open source Redis does. Taking any one node down is therefore enough to simulate the disaster recovery (DR) switchover scenario. Details are as follows:
During the switchover, the services of GaussDB(for Redis) are migrated to the standby node almost instantly. Performance does not deteriorate significantly during the switchover, and recovery takes only 1s. In contrast, open source Redis interrupts service completely during the switchover and takes 5s on average to recover. With GaussDB(for Redis), an exception on a node need not bring down the entire transaction link, whereas with open source Redis the switchover is a real concern.
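Recovery times like these can be measured by probing the service at a fixed interval and recording whether each probe succeeds. The following is a minimal sketch of that bookkeeping only. The helper name `longest_outage` and the probe samples are hypothetical and are not data from this evaluation.

```python
from typing import Iterable, Tuple


def longest_outage(samples: Iterable[Tuple[float, bool]]) -> float:
    """Return the longest span (in seconds) between the first failed probe
    of an outage and the first successful probe after it.
    An outage that is still open at the end of the log is not counted."""
    longest = 0.0
    outage_start = None  # timestamp of the first failed probe in the current outage
    for timestamp, succeeded in samples:
        if not succeeded and outage_start is None:
            outage_start = timestamp
        elif succeeded and outage_start is not None:
            longest = max(longest, timestamp - outage_start)
            outage_start = None
    return longest


# Hypothetical probe log: one probe every 0.5 s, failures at t=2.0 and t=2.5.
probes = [(0.0, True), (0.5, True), (1.0, True), (1.5, True),
          (2.0, False), (2.5, False), (3.0, True), (3.5, True)]
print(longest_outage(probes))  # 1.0
```

In practice the samples would come from a client issuing a cheap command such as PING with a short timeout. The probe interval bounds the precision of the measurement.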
1.2. Node Fault Recovery Scenario
Abnormal node restarts are a common problem when using Redis. When the open source Redis service restarts, it has to load its persisted data, namely the RDB file, back into memory. The larger the data set, the larger the RDB file and the slower the restart. In contrast, GaussDB(for Redis) does not need to reload this data on restart, so it restarts quickly and the restart time is not affected by the amount of data.
1.3. Backup Scenario
Fork is a standard POSIX operating system interface. Calling it triggers a series of actions such as memory allocation and copying, which greatly increases latency and causes obvious jitter during routine O&M operations on open source Redis, such as backup. In extreme cases, memory usage soars, triggers the OOM protection of the operating system, and causes the Redis process to stop abnormally. As a result, open source Redis can only be backed up during off-peak periods, which does not suit systems with transaction peaks around the clock. GaussDB(for Redis), by contrast, solves the backup fork problem at the lowest level of the system: no fork is performed during backup, so data can be backed up at any time without disturbing peak-hour traffic.
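The memory spike comes from copy-on-write: after a fork, parent and child share memory pages, and every page the parent modifies while the child is still writing the snapshot has to be copied. A rough back-of-the-envelope sketch follows. The data set size and write ratio are assumed values for illustration, not measurements from this evaluation.

```python
def peak_memory_during_fork_gb(dataset_gb: float, dirtied_fraction: float) -> float:
    """Estimate peak memory while a forked child writes a snapshot.

    dataset_gb:        memory used by the data set before the fork
    dirtied_fraction:  share of pages the parent modifies before the child
                       exits (0.0 = read-only workload, 1.0 = every page rewritten)
    """
    if not 0.0 <= dirtied_fraction <= 1.0:
        raise ValueError("dirtied_fraction must be between 0 and 1")
    # Unmodified pages stay shared; each modified page exists twice.
    return dataset_gb * (1.0 + dirtied_fraction)


# Hypothetical: 10 GB data set, 30% of pages rewritten during the backup.
print(peak_memory_during_fork_gb(10.0, 0.3))
```

In the worst case (`dirtied_fraction = 1.0`) the estimate doubles, which is one way to read the "only 50% memory utilization" limitation of open source Redis mentioned in the summary of this article.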
In terms of overall user experience, GaussDB(for Redis) is clearly ahead of open source Redis in stability and can serve as the stabilizing anchor of a user's system. In the following sections, we evaluate capacity expansion, performance, and price in detail.
2. Capacity expansion in seconds
Readers who operate and maintain open source Redis have generally run into the pitfalls of capacity expansion. Because open source Redis stores data in memory and synchronizes data between nodes through asynchronous primary/secondary replication, expanding capacity is complicated, especially when the memory of the servers has to be expanded. Generally, the ECS instance on which Redis runs must be restarted. These operations are inevitably accompanied by a switchover and restart of the primary node, which temporarily interrupts services.
2.1. Comparison of Service Performance during Capacity Expansion
Open source Redis is a weakly consistent cache database: a write call may return success even though the data has not yet been persisted to disk. Once a switchover occurs, dirty reads inevitably appear. Capacity expansion is therefore arguably the biggest headache for users in the daily operation and maintenance of open source Redis.
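This failure mode can be illustrated with a toy model: the primary acknowledges a write before the replica has received it, so a switchover can promote a replica that is missing acknowledged writes. The class below is hypothetical and uses no Redis API. It only sketches the ordering problem.

```python
class ToyPrimary:
    """Acknowledges writes immediately and replicates them later."""

    def __init__(self):
        self.data = {}
        self.pending = []        # writes not yet shipped to the replica
        self.replica_data = {}

    def set(self, key, value):
        self.data[key] = value
        self.pending.append((key, value))
        return "OK"              # acknowledged before replication

    def replicate(self):
        """Ship all pending writes to the replica."""
        for key, value in self.pending:
            self.replica_data[key] = value
        self.pending.clear()

    def failover(self):
        """Primary crashes; the replica's (possibly stale) data takes over."""
        return dict(self.replica_data)


primary = ToyPrimary()
primary.set("balance", 100)
primary.replicate()
primary.set("balance", 80)       # acknowledged, but not yet replicated
promoted = primary.failover()
print(promoted["balance"])       # 100: the acknowledged write of 80 is lost
```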
We also evaluated GaussDB(for Redis) and open source Redis in capacity expansion scenarios:
2.2. Comparison of Ease of Capacity Expansion
In terms of operation, expanding GaussDB(for Redis) is much easier than expanding open source Redis. Only a few simple operations on Huawei Cloud are needed.
Across nearly 100 consecutive capacity expansion operations performed by the author, GaussDB(for Redis) never experienced service interruption or data loss. Each expansion completed in seconds and was not noticeable to clients. In comparison, open source Redis requires complex operations with repeated restarts and switchovers, and the average service interruption during capacity expansion was about 25 seconds.
The monitoring system of GaussDB(for Redis) is also better than that of open source Redis. It monitors key performance indicators such as request latency, automatically removes faulty nodes, migrates their load smoothly, raises alarms automatically, and automatically recovers faulty nodes, providing users with an excellent O&M experience.
3. Strong performance
Compared with memory, disks have clear advantages in stability and maintainability, and GaussDB(for Redis) benefits from using disks as its data storage. Memory, on the other hand, has the advantage of speed. Before the evaluation, I expected GaussDB(for Redis) to lag behind the open source version in performance, but the actual test results showed that the difference between the two is very small at the same specifications.
3.1. Performance Comparison in Common Scenarios
At the 16 GB specification we tested, GaussDB(for Redis) even slightly outperformed open source Redis, as follows:
3.2. Performance Comparison in Large Key Replacement Scenarios
GaussDB(for Redis) also fully solves the lag caused by large key replacement in open source Redis. The results of a simulated large key replacement scenario are as follows:
The performance of GaussDB(for Redis) is barely affected when large keys are replaced while read/write operations are in progress, which is impressive compared with the 20% drop seen with the open source version.
3.3. The Performance Secret of GaussDB(for Redis)
The reason behind these results is that GaussDB(for Redis) uses a shared-everything storage layer, which solves the problems caused by bulk key replacement in open source Redis. The separation of storage and compute in GaussDB(for Redis) not only hides the low-level details of the different underlying databases but also packages their capabilities as modules. Users can select suitable components and customize database services to their own requirements at very low cost, while a consistent user experience is maintained throughout.
GaussDB(for Redis) uses a hash strategy to balance data and avoids the stop-the-world (STW) pauses caused by Full GC. Together, these techniques allow GaussDB(for Redis) to solve the STW problems seen with the open source version. It also uses hot/cold data separation to discover hotspot data dynamically and load it into memory in an orderly way, so customers hardly notice any performance difference between GaussDB(for Redis) and open source Redis.
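The general idea of keeping hot data in memory in front of a larger, slower store can be sketched as a small read-through LRU cache. This only illustrates the technique in general. How GaussDB(for Redis) actually detects and loads hotspot data is not described in this article, and the class and parameter names below are hypothetical.

```python
from collections import OrderedDict


class HotColdStore:
    """Read-through cache: a bounded in-memory hot tier over a cold tier."""

    def __init__(self, cold_store: dict, hot_capacity: int):
        self.cold = cold_store            # stands in for disk-backed storage
        self.hot = OrderedDict()          # most recently used keys at the end
        self.hot_capacity = hot_capacity
        self.cold_reads = 0

    def get(self, key):
        if key in self.hot:
            self.hot.move_to_end(key)     # mark as recently used
            return self.hot[key]
        self.cold_reads += 1              # miss: fetch from the cold tier
        value = self.cold[key]
        self.hot[key] = value
        if len(self.hot) > self.hot_capacity:
            self.hot.popitem(last=False)  # evict the least recently used key
        return value


store = HotColdStore({"a": 1, "b": 2, "c": 3}, hot_capacity=2)
for key in ["a", "a", "b", "a", "c", "a"]:
    store.get(key)
print(store.cold_reads)  # 3: only the first access to each key hits the cold tier
```

The frequently read key "a" stays in the hot tier throughout, while the less used "b" is evicted when "c" arrives.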
GaussDB(for Redis) also has an advantage that open source Redis lacks: it is not limited by memory capacity and supports PB-level cache clusters. The larger the scale, the more pronounced its performance advantage, and it can easily reach millions of QPS.
Thanks to the load-balancing architecture of GaussDB(for Redis), the entire cluster can continue to provide service as long as one node in the cluster is available. Under normal conditions, every node accepts writes. This architecture delivers high performance in normal operation and tolerates the failure of n-1 nodes in abnormal cases. In both performance and disaster recovery, GaussDB(for Redis) is ahead of open source Redis.
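A minimal sketch of hash-based routing with n-1 fault tolerance, assuming (as the text says) that every node can serve any key because the storage is shared. The node names and the `route` helper are hypothetical, and `zlib.crc32` is used only as a convenient stable hash, not as the product's actual algorithm.

```python
import zlib


def route(key: str, nodes: list, alive: set) -> str:
    """Pick a node for `key` by hash; skip unavailable nodes in ring order."""
    if not alive:
        raise RuntimeError("no node available")
    start = zlib.crc32(key.encode("utf-8")) % len(nodes)
    for offset in range(len(nodes)):
        candidate = nodes[(start + offset) % len(nodes)]
        if candidate in alive:
            return candidate
    raise RuntimeError("no node available")


nodes = ["node-0", "node-1", "node-2"]
print(route("user:42", nodes, alive={"node-0", "node-1", "node-2"}))
# With only one node left, every key is still served by that node.
print(route("user:42", nodes, alive={"node-2"}))
```

In an architecture where each node owns its own data, a fallback like this would return a node that does not hold the key. It works here only under the shared-storage assumption.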
4. Excellent cost-effectiveness
To compare the cost-effectiveness of the two, we purchased 3 ECS cloud servers in the 2C/16G and 4C/64G specifications from a certain vendor, built an open source Redis cluster on them, and compared the price with that of GaussDB(for Redis) at the same specifications on Huawei Cloud. The performance comparison is covered in the previous section and is not repeated here.
We can see that GaussDB(for Redis) is nearly twice as cost-effective per unit of data as open source Redis.
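One way to read this figure is through usable capacity. The summary at the end of this article notes that open source Redis reaches only 50% memory utilization, while all purchased GaussDB(for Redis) capacity is available. Assuming, purely for illustration, the same hypothetical price for the same nominal capacity, the cost per usable gigabyte works out as follows:

```python
def cost_per_usable_gb(price: float, nominal_gb: float, utilization: float) -> float:
    """Price divided by the capacity that can actually hold data."""
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    return price / (nominal_gb * utilization)


PRICE = 100.0      # hypothetical price unit, not a real quote
NOMINAL_GB = 16.0  # nominal capacity of the instance

open_source = cost_per_usable_gb(PRICE, NOMINAL_GB, utilization=0.5)
gaussdb = cost_per_usable_gb(PRICE, NOMINAL_GB, utilization=1.0)
print(open_source / gaussdb)  # 2.0
```

The ratio depends only on the utilization figures, not on the assumed price, so actual prices would shift the result only to the extent that the two offerings are priced differently.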
The evaluation shows that GaussDB(for Redis) is an excellent product that surpasses the open source version in almost all aspects. The results are summarized as follows:
GaussDB(for Redis) is built on Huawei's high-performance distributed shared storage pool. It avoids many problems of open source Redis, such as primary/secondary replication backlog, primary/secondary inconsistency, fork jitter, only 50% memory utilization, large key blocking, and gossip-based cluster management. All purchased capacity is usable, and the cost per unit of data is half that of the open source version. Developers, if not now, when will you give GaussDB(for Redis) a try?