Author: Zhao Zhiqiang, back-end development engineer at Gepush


Gepush has been providing push notification services to developers for years. Through the push SDK, a mobile device establishes a long-lived connection with the server to maintain its online status. When the network fails, however, messages cannot be delivered to the end user in real time, so the push server keeps a list of offline messages for each user and delivers them when the user reconnects. This data is stored in a push Redis cluster. The whole cluster has more than 100 master and slave instances, the number of keys is at the 1 billion level, and the storage space is at the terabyte level, which brings considerable maintenance costs and operational challenges. As back-end development engineers, we are always looking for more cost-effective solutions.

The QPS of the whole cluster is at the million level. In our own benchmarks of Aerospike, we found that a single physical machine equipped with a single Intel SSD 4600 can achieve close to 100,000 QPS. In other words, a few dozen machines can meet existing demand and support business growth for a long time to come.
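The sizing argument above can be written out as a rough calculation. This is only a sketch: the 2x headroom factor is an assumption of ours, not a figure from the benchmark, and replication and storage capacity are ignored.

```python
import math


def machines_needed(cluster_qps, per_machine_qps, headroom=2.0):
    """Machines required to serve cluster_qps, padded by an assumed headroom factor."""
    return math.ceil(cluster_qps * headroom / per_machine_qps)


# Figures from the text: cluster QPS at the million level, ~100,000 QPS per machine.
print(machines_needed(1_000_000, 100_000))        # with the assumed 2x headroom
print(machines_needed(1_000_000, 100_000, 1.0))   # bare minimum, no headroom
```

Raising the headroom factor or accounting for replicas moves the result toward the "dozens of machines" range mentioned above.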

Advantages of Aerospike

Aerospike is a high-performance, scalable, and reliable NoSQL solution. It supports RAM and SSDs as storage media, is optimized specifically for SSDs, and is widely used in real-time computing applications such as real-time bidding. It provides automatic cluster Rebalance and cluster-aware clients, and it supports very large data sets (at the 100 TB level).

As a KV store, Aerospike provides multiple data types and is operated in a way similar to Redis. Beyond the basic functions, Aerospike supports several monitoring options, such as the AMC console and a monitoring API, which expose cluster QPS, health, and load, so it is O&M friendly. It also supports automatic Rebalance of data within a cluster, which reduces maintenance costs compared with Redis cluster solutions.

This article shares some of our experience with the grayscale rollout and use of Aerospike, in the hope of providing a reference for readers who are evaluating or preparing to use Aerospike. The grayscale approach is not specific to Aerospike; it can also serve as a reference for migrating and planning other infrastructure components.

Data model

Aerospike uses schemaless storage with an RDBMS-like data model, which makes it relatively easy to understand and use:

Each namespace contains multiple sets, each set contains multiple records, and each record contains multiple bins (analogous to database columns). Records are looked up by key. Different services can use different namespaces in the same cluster to isolate resources while making full use of the cluster.
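To make the hierarchy concrete, here is a toy in-memory model of it. It is not the Aerospike client: `ToyStore`, the namespace `push`, the set `offline_msgs`, and the bin names are all made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Tuple

Key = Tuple[str, str, str]  # (namespace, set, user key)


@dataclass
class ToyStore:
    """In-memory stand-in: one dict of bins per (namespace, set, key) record."""
    records: Dict[Key, Dict[str, Any]] = field(default_factory=dict)

    def put(self, key: Key, bins: Dict[str, Any]) -> None:
        # Schemaless: any record may carry any bins; named bins are upserted.
        self.records.setdefault(key, {}).update(bins)

    def get(self, key: Key) -> Dict[str, Any]:
        return dict(self.records.get(key, {}))


store = ToyStore()
key = ("push", "offline_msgs", "user:42")
store.put(key, {"msgs": ["m1"], "updated": 1700000000})
store.put(key, {"msgs": ["m1", "m2"]})  # only the "msgs" bin changes
print(store.get(key))
```

A second service would use a different namespace in the first position of the key, which is how the resource isolation described above shows up in the addressing scheme.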

Grayscale rollout process

Gepush uses a large Redis cluster to store offline message lists. We also investigated Redis-compatible disk-based stores such as SSDB and Pika; on overall cost, Aerospike turned out to be the most cost-effective.

Early on, we ran stress tests that simulated the actual online read/write ratio (analyzing the online business, we found that the write/read ratio was roughly between 1:1 and 1:2) to evaluate and verify feasibility, and then carried out production planning.

The online business is quite complicated, so cutting over to Aerospike directly is impractical and risky, and simulation in a test network rarely exposes the problems that can occur in production. We therefore divided the rollout into an observation phase and a grayscale phase. In the observation phase, as the name implies, the original Redis cluster still serves all online reads and writes, but a copy of the same traffic is sent to Aerospike to verify it under real load. In the grayscale phase, online business is gradually switched to the Aerospike cluster, and the grayscale scope is expanded step by step while the cluster is confirmed to run stably, until the business is completely switched to Aerospike. The two phases work as follows:

Observation phase: After a Redis operation succeeds, the same read or write is replayed asynchronously against Aerospike; Aerospike does not serve any business traffic. The next step is to double-write data to both Redis and Aerospike. In this phase we mainly observe whether the data on both sides is consistent, the load on Aerospike, and so on. The observation phase is also the time to perform O&M operations such as node restarts and cluster expansion, evaluate O&M costs, and optimize configurations. The AMC console and the monitoring API can be used to monitor the cluster status, and the client call sites should record the necessary logs and monitoring information.
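A minimal sketch of the asynchronous mirroring described for the observation phase, under stated assumptions: plain dicts stand in for Redis and Aerospike, and `ShadowWriter` and its parameters are hypothetical names rather than anything from our production code.

```python
import queue
import threading


class ShadowWriter:
    """Mirror successful primary writes to a secondary store via a bounded queue."""

    def __init__(self, primary, secondary, max_pending=1000):
        self.primary = primary
        self.secondary = secondary
        self.pending = queue.Queue(maxsize=max_pending)
        self.dropped = 0  # mirrored writes discarded because the queue was full
        threading.Thread(target=self._drain, daemon=True).start()

    def set(self, key, value):
        self.primary[key] = value  # the primary still serves the business
        try:
            # Enqueue only after the primary write has succeeded.
            self.pending.put_nowait((key, value))
        except queue.Full:
            self.dropped += 1  # never block the online path on the mirror

    def _drain(self):
        while True:
            key, value = self.pending.get()
            try:
                self.secondary[key] = value
            finally:
                self.pending.task_done()

    def flush(self):
        """Block until every queued write has been applied (for tests and shutdown)."""
        self.pending.join()


redis_like, aerospike_like = {}, {}
writer = ShadowWriter(redis_like, aerospike_like)
writer.set("user:42", ["m1"])
writer.flush()
print(aerospike_like == redis_like, writer.dropped)
```

Dropping on a full queue keeps the online path fast at the price of a known source of inconsistency, so the `dropped` counter is worth exporting as a metric.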

Grayscale phase: Aerospike begins to serve offline message list storage for some applications and tasks. During this phase, data is both written to and cleared from Redis and Aerospike together, so the two stay in a hot-backup state until the business is completely switched from Redis to Aerospike and has run stably for a period of time.
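One possible way to pick which applications read from Aerospike during the grayscale phase is a stable hash bucket per application. The sketch below assumes that approach; the function names, the percentage knob, and the dict stand-ins are illustrative only.

```python
import zlib


def use_aerospike(app_id, gray_percent):
    """Stable routing: an app always falls in the same bucket (0 to 99)."""
    bucket = zlib.crc32(app_id.encode("utf-8")) % 100
    return bucket < gray_percent


def write_offline_list(key, msgs, redis_like, aerospike_like):
    # Double write keeps both stores usable as hot backups of each other.
    redis_like[key] = list(msgs)
    aerospike_like[key] = list(msgs)


def clear_offline_list(key, redis_like, aerospike_like):
    # Double clear, so a delivered list does not reappear after a switch.
    redis_like.pop(key, None)
    aerospike_like.pop(key, None)


def read_offline_list(app_id, key, redis_like, aerospike_like, gray_percent):
    store = aerospike_like if use_aerospike(app_id, gray_percent) else redis_like
    return store.get(key, [])


redis_like, aerospike_like = {}, {}
write_offline_list("user:42", ["m1", "m2"], redis_like, aerospike_like)
print(read_offline_list("app-a", "user:42", redis_like, aerospike_like, 10))
```

Because the bucket depends only on the application ID, raising the percentage only adds applications to the Aerospike side and never moves one back, and setting it to 0 is an immediate rollback to Redis.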

The observation phase is very important: it is essentially an online assessment of the feasibility of the whole scheme. The observation points fall into two parts, the client (AS-Client) and the server (AS-Server). Main observations on the client:

1. Use metrics to monitor client request and response times, and measure the system SLA by request-time percentiles (50%, 90%, 99%, 99.9%) over a period of time.

2. Monitor the count of read and write success and failure.

3. Set the slow log threshold to 50ms to collect statistics on slow logs during peak periods and normal periods.

4. Monitor the queue used for asynchronous writes to Aerospike and adjust the queue size appropriately.
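The percentile and slow-log items in the list above can be sketched as follows. The nearest-rank percentile method and the function names are our choices for illustration; a metrics library would normally do this with streaming histograms instead of sorting raw samples.

```python
import math

SLOW_LOG_MS = 50.0  # slow-log threshold from the list above


def percentile(sorted_ms, pct):
    """Nearest-rank percentile of an ascending, non-empty list of latencies."""
    rank = max(1, math.ceil(pct * len(sorted_ms) / 100))
    return sorted_ms[rank - 1]


def summarize(latencies_ms):
    """Return SLA percentiles and the slow-request count for one time window."""
    data = sorted(latencies_ms)
    report = {f"p{pct}": percentile(data, pct) for pct in (50, 90, 99, 99.9)}
    report["slow"] = sum(1 for ms in data if ms > SLOW_LOG_MS)
    return report


# Hypothetical window of samples: 1 ms, 2 ms, ..., 100 ms.
print(summarize([float(ms) for ms in range(1, 101)]))
```

Running the same summary separately for peak and normal periods gives the comparison called for in item 3.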


Main observations on the server:

1. Cluster health.

2. Disk and memory usage and memory/disk space ratio.

3. Information about machine I/O load, CPU load, and disk fragmentation.

4. Cluster throughput: whether read and write TPS is comparable to that of the online Redis cluster.

5. Data consistency. How can the consistency of the two data stores be checked during the observation and grayscale phases? Comparing differences key by key cannot meet the performance requirements. If the data is consistent, a query against Redis should return the same result as the same query against Aerospike. We therefore sample the query results from Redis and Aerospike, record them in the log, and compare and analyze the ratio of inconsistent data over 1 minute, 5 minutes, 30 minutes, and 1 hour. With the number of online keys at the billion level, even a difference of one in 10,000 is a significant inconsistency, and its cause needs to be identified and resolved.
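The sampling check in item 5 might look like the sketch below. Dicts stand in for the two stores, and the function name, sample rate, and test data are hypothetical; in production the comparison would be driven by logged query results, bucketed into the 1-minute to 1-hour windows mentioned above.

```python
import random


def mismatch_ratio(keys, redis_like, aerospike_like, sample_rate, rng):
    """Compare a random sample of keys; return (mismatch ratio, sample size)."""
    sampled = mismatched = 0
    for key in keys:
        if rng.random() >= sample_rate:
            continue  # key not selected for this sampling pass
        sampled += 1
        if redis_like.get(key) != aerospike_like.get(key):
            mismatched += 1
    return (mismatched / sampled if sampled else 0.0), sampled


# Hypothetical data set: 10,000 keys with exactly one divergent record.
redis_like = {f"user:{i}": [i] for i in range(10_000)}
aerospike_like = dict(redis_like)
aerospike_like["user:7"] = []

ratio, sampled = mismatch_ratio(redis_like, redis_like, aerospike_like,
                                sample_rate=1.0, rng=random.Random(0))
projected = round(ratio * 1_000_000_000)  # scale the ratio to a billion keys
print(ratio, sampled, projected)
```

Scaled to a billion keys, a one-in-10,000 ratio corresponds to about 100,000 inconsistent keys, which is why even a small sampled ratio deserves investigation.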

Because automatic cluster Rebalance can degrade performance, we drew on our experience to simulate some typical operation and maintenance scenarios:

1. Simulate the impact of the cluster Rebalance on system performance caused by a single node failure.

2. Simulate the impact of the cluster Rebalance on system performance caused by cluster capacity expansion.

3. Determine the Rebalance speed to use during the day and at night based on the impact on online services, and update it with a cron job.

4. Restart a node.

5. Mount an SSD.

6. Optimize related configurations.
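For the day/night Rebalance speed in item 3, a cron job could run a small script like the one below. This is a sketch built on assumptions: it assumes the speed is tuned through the `migrate-threads` service setting via `asinfo`, and the night window and thread counts are placeholders that must come from your own measurements.

```python
def migrate_threads_for(hour, night_start=1, night_end=6,
                        day_threads=1, night_threads=4):
    """Pick a migration thread count: faster at night, gentler by day (assumed values)."""
    return night_threads if night_start <= hour < night_end else day_threads


def asinfo_command(threads):
    """Build the command a cron job would run on each node (assumed setting name)."""
    return f'asinfo -v "set-config:context=service;migrate-threads={threads}"'


# What the job would emit at 03:00 and at 14:00 with the placeholder values.
print(asinfo_command(migrate_threads_for(3)))
print(asinfo_command(migrate_threads_for(14)))
```

Keeping the schedule in one function makes it easy to review how aggressive recovery is allowed to be at each hour.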

To sum up, the complete rollout process is divided into the following steps:

0. Simulate the online load in a test environment and verify feasibility.

1. Encapsulate the Aerospike client in a Redis-like interface, and add the necessary logs, monitoring items, and bin validity checks.

2. The message service integrates with the Aerospike client. The required functions include Aerospike asynchronous read/write, service data source switching, and traffic filtering.

3. QA functional verification.

4. Apply for resources and deploy the Aerospike cluster online.

5. Bring the message service with Aerospike integrated online.

6. After passing verification in the observation phase, enter the grayscale phase, which ends either with the full switchover or with a rollback midway.
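Step 1, wrapping the client in a Redis-like interface with a bin validity check, could be sketched like this. A dict stands in for the real client, and `RedisLikeList`, `check_bin_name`, and the default bin name are hypothetical; the 14-byte bin-name limit is the one quoted in the Experience section.

```python
MAX_BIN_NAME_BYTES = 14  # bin-name limit quoted in this article


def check_bin_name(name):
    """Reject bin names the store would refuse, before any request is sent."""
    if not name or len(name.encode("utf-8")) > MAX_BIN_NAME_BYTES:
        raise ValueError(f"invalid bin name: {name!r}")
    return name


class RedisLikeList:
    """Redis-style list commands on top of a record store (a dict stand-in here)."""

    def __init__(self, backend, namespace, set_name, bin_name="msgs"):
        self._backend = backend
        self._prefix = (namespace, set_name)
        self._bin = check_bin_name(bin_name)
        self.calls = {"rpush": 0, "lrange": 0, "delete": 0}  # monitoring counters

    def rpush(self, key, *values):
        self.calls["rpush"] += 1
        record = self._backend.setdefault(self._prefix + (key,), {})
        items = record.setdefault(self._bin, [])
        items.extend(values)
        return len(items)  # like Redis, return the new list length

    def lrange(self, key, start, stop):
        self.calls["lrange"] += 1
        items = self._backend.get(self._prefix + (key,), {}).get(self._bin, [])
        end = None if stop == -1 else stop + 1  # the Redis stop index is inclusive
        return list(items[start:end])

    def delete(self, key):
        self.calls["delete"] += 1
        removed = self._backend.pop(self._prefix + (key,), None)
        return 1 if removed is not None else 0


offline = RedisLikeList({}, "push", "offline_msgs")
offline.rpush("user:42", "m1", "m2", "m3")
print(offline.lrange("user:42", 0, -1), offline.calls)
```

Keeping the Redis command names means the message service can switch data sources behind one interface, which is what step 2 relies on.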

Experience

Some of the problems and challenges we encountered during the use of Aerospike are summarized as follows:

1. Use Aerospike's single-bin mode to save space.

2. Aerospike does not store the original key. Instead, it indexes a 20-byte hash of the original key. Even when the key and value are only a few bytes each, the key hash itself occupies 20 bytes, so the actual space used is relatively large.

3. Aerospike rebalances data when a node goes down or when nodes are added or removed, which affects service quality. The Rebalance speed can be controlled, so you need to balance quality of service against rapid cluster recovery.

4. The Community Edition has to rebuild the index and load it into memory every time it restarts, which makes restarts slow. Namespaces must also be specified in the configuration file, so it is advisable to allocate in advance any namespaces that services may need in the future, to avoid unnecessary restarts.

5. Because SSDs suffer from fragmentation and write amplification, we found in practice that performance degrades seriously once disk space usage reaches around 50%. Defragmentation-related parameters can therefore be tuned to the actual workload.

6. Aerospike limits hot keys: when a single key is read or written too frequently, it returns a HotKey error (error code 14). On the server, the number of concurrent operations on the same key can be raised through the transaction-pending-limit configuration. The default value is 20, and a value of 0 means no limit. Raising it may reduce performance to some extent. The client may need to retry to handle this error, but retries can put even more pressure on the hot key.

7. A change to a fundamental component like this should be stress-tested with online traffic whenever possible, to expose potential problems as early as possible.

8. In the observation phase, operation and maintenance costs should also be evaluated, to avoid jumping out of one pit and into another.

9. Pay attention to Aerospike's inherent limits as well. For example, a namespace can contain up to 1023 sets, a bin name can contain up to 14 single-byte characters, and a namespace can use up to 64 SSDs. For details, see: Aerospike Known Limitations.
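For the client-side retry mentioned in item 6, a bounded retry with backoff and jitter limits the extra load that retries add to an already hot key. The sketch below is generic: `StoreError` is a hypothetical exception standing in for whatever the real client raises, and the attempt count and delays are placeholders.

```python
import random
import time

KEY_BUSY = 14  # HotKey error code quoted in item 6


class StoreError(Exception):
    """Hypothetical client exception that carries the server error code."""

    def __init__(self, code):
        super().__init__(f"store error {code}")
        self.code = code


def with_hotkey_retry(op, max_attempts=3, base_delay_s=0.005,
                      sleep=time.sleep, rng=random.random):
    """Run op(); retry only HotKey errors, with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return op()
        except StoreError as exc:
            if exc.code != KEY_BUSY or attempt == max_attempts:
                raise  # other errors, or retries exhausted: surface to the caller
            # Spread retries out so they do not pile onto the hot key at once.
            sleep(base_delay_s * (2 ** (attempt - 1)) * (0.5 + rng()))


attempts = []


def flaky_read():
    attempts.append(1)
    if len(attempts) < 3:
        raise StoreError(KEY_BUSY)
    return ["m1"]


print(with_hotkey_retry(flaky_read, sleep=lambda s: None), len(attempts))
```

The small attempt cap is deliberate: once retries are exhausted it is usually better to fail the request than to keep hammering the same key.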

Conclusion

Aerospike is a high-capacity NoSQL solution that is not yet widely used in production domestically. It suits scenarios with large capacity requirements and relatively low QPS, where it can reduce TCO to some extent. Aerospike's commands are similar to those of Redis, which makes business migration easier. It supports cluster deployment natively and is friendly to monitoring and O&M. Despite these strengths, technology selection should still be done carefully: assess in advance whether it fits your business scenario and whether its performance and cost meet your requirements. In some official test scenarios it outperformed Redis, but in practice, because of the limitations of SSDs themselves, it lagged behind Redis in QPS in most cases. Finally, verify it with online traffic before going live, and migrate the actual online business in grayscale mode to minimize the impact on user experience.