About the authors

Biao Yang is a technical expert at Ant Financial and the author of Distributed Service Architecture: Principles, Design and Practice and Scalable Service Architecture: Frameworks and Middleware. With nearly 10 years of experience in the Internet and game industries, he has held core R&D positions at Kuwo Music Box, Renren Games, Zhangqu Technology, and other listed companies, working on projects with 10 million daily active users and on several games with monthly revenue of more than 10 million.

This article is adapted from the forthcoming book Scalable Service Architecture: Frameworks and Middleware by Yanpeng Li, Biao Yang, Hailiang Li, Boyan Jia, and Hao Liu.

Caching solutions on the market are maturing. In this article I compare some of them, including Redis, Memcached, and Tair, to help you choose the right technology in production practice.

1. A comparison of common distributed caches

Commonly used distributed caches include Redis, Memcached, and Alibaba’s Tair (see the table below). Redis is the most widely used because of its rich, easy-to-use data structures.

[Table: a comparison of Redis, Memcached, and Tair]

Below, the two most commonly used, Redis and Memcached, are compared along nine dimensions.

1. Data types

Redis supports five data types, each backed by its own underlying data structures, including the simple dynamic string, the compressed list (ziplist), the dictionary, and the skip list. The skip list is a relatively new data structure often used for high-performance lookups: it achieves O(log n) query time, and compared with a red-black tree it modifies fewer nodes during an update, which makes concurrent operations easier to implement.

Memcached supports only the storage of key-value pairs and no other data structures.
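
To make the contrast concrete, here is a minimal sketch that touches each of the five Redis types, using the Jedis client discussed later in this article. It assumes a Redis instance at 127.0.0.1:6379, and the keys and values are only examples:

import redis.clients.jedis.Jedis;

public class DataTypesDemo {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("127.0.0.1", 6379)) {
            jedis.set("title", "banana");             // String
            jedis.lpush("recent", "a", "b");          // List
            jedis.hset("commodity:1", "price", "9");  // Hash
            jedis.sadd("tags", "fruit", "fresh");     // Set
            jedis.zadd("rank", 100, "banana");        // Sorted set, backed by a skip list
        }
    }
}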

2. Threading model

Redis is implemented with a single thread, while Memcached and others use multiple threads. It is therefore not recommended to store overly large values in Redis, because a slow operation will block other requests.

Because cache operations are memory operations with very little computation, they perform well even on a single thread. Redis implements a single-threaded, non-blocking network I/O model, which suits fast operations, but a complex, long-running piece of logic can hurt performance. For long logic, multiple instances should be deployed to improve multi-core CPU utilization; that is, a single machine can run multiple instances on different ports. The official recommendation is 8 instances per machine.

The non-blocking I/O model is based on the two epoll-related files from the libevent library together with an event-notification model implemented by the author. It is simple and compact; the author’s intent was to keep the implementation simple and the dependencies few. Because the server runs a single thread, pipelining is provided to consolidate requests and execute them in batches, reducing the time spent on communication.
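
As a sketch of this batching idea, here is how pipelining looks from the Jedis client covered below; the counter key and loop size are illustrative, and a local Redis at the default port is assumed:

import redis.clients.jedis.Jedis;
import redis.clients.jedis.Pipeline;
import redis.clients.jedis.Response;

public class PipelineDemo {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("127.0.0.1", 6379)) {
            Pipeline p = jedis.pipelined();
            for (int i = 0; i < 100; i++) {
                p.incr("counter");          // queued locally, nothing sent yet
            }
            Response<Long> last = p.incr("counter");
            p.sync();                       // one round trip flushes the whole batch
            System.out.println(last.get()); // responses are readable only after sync()
        }
    }
}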

Memcached also uses a non-blocking I/O model, but with multiple threads, so it fits a wider variety of scenarios: a request’s logic can be large or small, long or short, and one logically complex request will never block the response to another. It depends directly on the libevent library, which is more complex and loses some performance in certain specific environments.

3. Persistence mechanisms

Redis provides two persistence mechanisms: RDB, a point-in-time snapshot taken on a schedule, which may lose data in the event of an outage; and AOF, which is based on a log of write operations.
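
For reference, both mechanisms are enabled in redis.conf; the sketch below uses the conventional default thresholds, which should be tuned rather than taken as a recommendation:

# RDB: snapshot if at least 1 key changed in 900s, or 10 keys in 300s
save 900 1
save 300 10
# AOF: log every write command, fsync at most once per second
appendonly yes
appendfsync everysec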

Memcached provides no persistence mechanism. Its design philosophy is to be a simple cache: cached data is temporary and should not be persisted, nor should Memcached be a big-data store; on a cache miss, going back to the source database is the natural thing to do. Persistence can, however, be added through the third-party project MemcacheDB.

4. Clients

Jedis, the common Redis Java client, uses blocking I/O but supports connection pooling and provides consistent-hash sharding logic; there is also the open-source client-side sharding framework Redic.
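
A minimal sketch of Jedis with a connection pool follows; the host, port, and pool size are illustrative assumptions:

import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPool;
import redis.clients.jedis.JedisPoolConfig;

public class PoolDemo {
    public static void main(String[] args) {
        JedisPoolConfig config = new JedisPoolConfig();
        config.setMaxTotal(32);                   // cap concurrent connections
        try (JedisPool pool = new JedisPool(config, "127.0.0.1", 6379);
             Jedis jedis = pool.getResource()) {  // borrow one pooled connection
            jedis.set("greeting", "hello");
        }                                         // close() returns it to the pool
    }
}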

Memcached clients include the Memcached Java Client, the Spy client (spymemcached), and XMemcached. The Memcached Java Client uses blocking I/O, while spymemcached and XMemcached use non-blocking I/O.

As we know, blocking I/O requires no additional threads, while non-blocking I/O opens extra accepting threads (in a boss thread pool) to listen on the port and hands each request to a worker thread pool, where the thread is released once processing completes and the response has been written back to the network socket. In theory, therefore, the throughput and responsiveness of non-blocking I/O should be higher.

5. High availability

Redis supports master-slave replication between primary and secondary nodes, which can synchronize and recover using RDB snapshots plus the buffered write commands. Redis also supports high-availability cluster solutions such as Sentinel and, as of version 3.0, Cluster.

Memcached does not support high availability natively; it can be approximated with the third-party Megagent proxy, which switches to another instance when one goes down.

6. Support for queues

Redis itself supports queues through lpush/brpop, and a subscription model through publish/subscribe/psubscribe.
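
A small Jedis sketch of the queue pattern, assuming a local Redis; the queue name and timeout are arbitrary:

import java.util.List;
import redis.clients.jedis.Jedis;

public class QueueDemo {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("127.0.0.1", 6379)) {
            jedis.lpush("tasks", "job-1");            // producer pushes
            List<String> r = jedis.brpop(5, "tasks"); // consumer blocks up to 5s
            if (r != null) {
                System.out.println(r.get(0) + " -> " + r.get(1)); // key -> value
            }
        }
    }
}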

Memcached does not support queues; queueing can be added with the third-party MemcacheQ.

7. Transactions

Redis provides several commands that support thread safety and, to a certain extent, transactions, such as MULTI/EXEC, WATCH, and INCR. Because the Redis server is single-threaded, any single command within a request is atomic, but operations across clients are not: atomicity is not guaranteed for a sequence of operations, even on the same connection, unless they are wrapped in a transaction.
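
The following Jedis sketch shows WATCH combined with MULTI/EXEC; the key names are illustrative, and exec() returning null reflects how Jedis reports a transaction aborted by a conflicting concurrent write:

import java.util.List;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.Transaction;

public class TxDemo {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("127.0.0.1", 6379)) {
            jedis.watch("balance");           // abort if it changes before EXEC
            Transaction tx = jedis.multi();
            tx.incrBy("balance", 100);
            tx.set("last-op", "deposit");
            List<Object> results = tx.exec(); // null if the watched key changed
            System.out.println(results);
        }
    }
}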

Memcached’s individual commands are likewise atomic, but a sequence of commands on one connection is not. Memcached also provides atomic increment commands such as incr, plus gets/cas for thread-safe check-and-set updates.

8. Data eviction policies

Redis offers a rich set of eviction policies, configured through maxmemory and maxmemory-policy: volatile-lru, allkeys-lru, volatile-random, allkeys-random, volatile-ttl, and noeviction (return an error).
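
As a sketch, an eviction policy is chosen in redis.conf; the values here are illustrative:

maxmemory 2gb
maxmemory-policy allkeys-lru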

When Memcached reaches the specified capacity, it automatically deletes unused cache entries according to the Least Recently Used (LRU) algorithm. In some cases the LRU mechanism can be a nuisance, evicting data from memory that is still wanted. In that case, you can disable LRU by starting Memcached with the -M flag.

9. Memory allocation

To hide the differences between platforms and to track memory usage, Redis wraps the memory-allocation functions in a thin layer and uses the zmalloc/zfree family of functions throughout the program. These functions live in zmalloc.h/zmalloc.c. The wrapper hides the differences of the underlying platforms and makes it easy to implement the related statistics. Concretely:

  • If Google’s tcmalloc library is present on the system, the tcmalloc family of functions is used in place of the malloc family.

  • If the current system is macOS, the system’s memory-allocation functions are used.

  • Otherwise, an extra fixed-size header is allocated in front of each block to record the size of the allocation, which is a simple and efficient way to allocate memory.

Memcached allocates memory with a slab allocator: available memory is first divided into size classes, and each request is served from the class closest to the requested size, which reduces memory fragmentation. This does, however, require sensible configuration to be effective.

As the comparison above shows, Redis is simpler to implement and use, yet more powerful, more efficient, and more widely adopted. The following is a brief introduction to Redis, intended to give beginners a guided first experience.

2. A first look at Redis

Redis is an open-source key-value storage system capable of storing a variety of data objects. Written in ANSI C, Redis can be used purely as an in-memory database or as a log-structured persistent database, and it provides APIs for many languages.

1. Usage scenarios

Redis is most often used as a non-local cache, with little use made of its advanced features. One of the most problematic uses is storing JSON data, because Redis does not support JSON as well as Elasticsearch or PostgreSQL do. The usual approach has been to serialize the JSON into one big string and store it in Redis, which means every update must fetch the entire JSON document, change a single field, and write the whole thing back.

A common Java object definition for JSON data is as follows:

public class Commodity {
    private long price;
    private String title;
}

Under heavy request volume, every update of a single field in Redis, such as a sales count, generates a large amount of traffic. In practice JSON strings can be complex enough to reach hundreds of kilobytes, so frequent updates can saturate network I/O and even cause system timeouts and crashes.

For this reason, Redis officially recommends using hashes to store objects. For example, given three commodity objects with IDs 123, 124, and 12345, we can store each one as a Redis hash and update individual fields like this:

HSET commodity:123 price 100
HSET commodity:124 price 101
HSET commodity:12345 price 101
HSET commodity:123 title banana
HSET commodity:124 title apple
HSET commodity:12345 title orange

That is, the object’s type name and ID form the key of a Redis hash. Getting a single attribute is just as simple: HGET commodity:12345 price.
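
In Java, the same hash access might look like the following Jedis sketch (a local Redis is assumed; the IDs and values come from the example above):

import java.util.Map;
import redis.clients.jedis.Jedis;

public class HashDemo {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("127.0.0.1", 6379)) {
            jedis.hset("commodity:12345", "price", "101");
            jedis.hset("commodity:12345", "title", "orange");
            String price = jedis.hget("commodity:12345", "price"); // one field only
            Map<String, String> all = jedis.hgetAll("commodity:12345");
            System.out.println(price + " / " + all);
        }
    }
}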

2. Redis’s high-availability solution: Sentinel

Redis officially provides a cluster-management tool called Sentinel, which elects the master among the nodes and, following the distributed cluster-management approach, handles bringing cluster nodes online and offline, monitoring, notification, and automatic failover (master/standby switchover). It implements the well-known Raft leader-election protocol, which guarantees the consistency of the election.

A common Sentinel deployment scheme is presented here: at least three Sentinels are deployed, and each may run on the same machine as a monitored node. A typical deployment is shown in the figure.

[Figure: a typical three-Sentinel deployment]

In the initial state of this system, machine A is the master node, and machines B and C are slave nodes.

Since there are three Sentinels, one per machine, quorum is set to 2: once at least two Sentinels can no longer communicate with the master node, the master is considered down, and a new master is elected from the slave nodes.
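
For reference, the quorum of 2 corresponds to the last argument of the monitor directive in each Sentinel’s sentinel.conf; the master name, address, and timeouts below are illustrative:

sentinel monitor mymaster 10.0.0.1 6379 2
sentinel down-after-milliseconds mymaster 30000
sentinel failover-timeout mymaster 180000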

When a network partition occurs and machine A’s network becomes unavailable, the two Sentinel instances on machines B and C start a failover and elect machine B as the new master.

Sentinel’s cluster feature ensures that the Sentinel instances on machines B and C agree on the new master. The Sentinel on machine A, however, still holds the old configuration, because it is cut off from the outside world.

After the network recovers, the Sentinel instance on machine A will update its configuration. However, if the client connected to the old master was isolated together with it, the client can still write data to the Redis node on machine A during the partition; once the network recovers, that node becomes a slave, so the data the client wrote to machine A’s Redis node during the isolation is inevitably lost.

If Redis is used as a cache, we may be able to tolerate this loss; if Redis is used as a storage system, we cannot. Because Redis uses asynchronous replication, data loss in this scenario cannot be avoided entirely.

Here, we can configure each Redis instance to limit such loss:

min-slaves-to-write 1
min-slaves-max-lag 10

With the above configuration, when a Redis node is the master, if it cannot write data to at least one slave (min-slaves-to-write specifies the number of slaves), it refuses write requests from clients. Since replication is asynchronous, failing to write to a slave means the slave has either disconnected or has not asked the master to synchronize data within the specified time (min-slaves-max-lag, in seconds).

This configuration therefore eliminates the scenario in which an isolated master keeps accepting writes after a network partition, only to lose them.

3. Redis Cluster

Redis also introduced the concept of Cluster in version 3.0 to address large data volumes and high availability. To achieve high performance, Cluster does not provide strong consistency; it uses asynchronous replication instead: after data is written to the master node, the master returns success, and the data is replicated to the slave nodes asynchronously.

First, let’s look at Redis Cluster’s sharding mechanism. Redis shards with CRC16(key) mod 16384, giving 16384 hash slots in total (a slot-calculation sketch follows the list below). For example, if the cluster has three nodes, we might allocate the hash slots according to the following rules:

  • Node A holds hash slots 0 to 5500.

  • Node B holds hash slots 5501 to 11000.

  • Node C holds hash slots 11001 to 16383.
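
Here is a minimal Java sketch of the slot calculation. The CRC16 variant below (XMODEM, polynomial 0x1021, initial value 0x0000) is the one the Redis Cluster specification documents; the key name is just an example:

public class SlotDemo {
    // CRC16-XMODEM, the checksum used by Redis Cluster for key-to-slot mapping
    static int crc16(byte[] data) {
        int crc = 0x0000;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = ((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) & 0xFFFF
                                            : (crc << 1) & 0xFFFF;
            }
        }
        return crc;
    }

    public static void main(String[] args) {
        String key = "commodity:12345";
        int slot = crc16(key.getBytes()) % 16384;  // hash slot in 0..16383
        System.out.println(key + " maps to slot " + slot);
    }
}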

Here, three master nodes and three slave nodes are deployed. The cluster sharding is shown in the figure.

[Figure: Redis Cluster sharding across three master-slave pairs]

The figure shows three master-slave replication pairs of Redis servers. Any two nodes are interconnected, and a client can connect to any one of them and then reach any node in the cluster for access and other operations.

So how does Redis do it? First, every Redis node stores the hash-slot information, which can be understood as a pair of values bounding a range between 0 and 16383. From this information we can determine which hash slots each node is responsible for, and thus find the node where a given piece of data lives.

Redis Cluster is in effect a cluster-management plug-in. When we access a key, the CRC16 algorithm produces a checksum, and dividing that result by 16384 and taking the remainder maps every key to a hash slot numbered 0 to 16383. This value identifies the node responsible for the corresponding slot, and the request then jumps directly to that node for the access operation. All of this is done internally by the cluster; we never need to do it manually.
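
Connecting through a cluster-aware client hides the routing entirely, as in this Jedis sketch; the seed address is an assumption, and any reachable cluster node would do:

import java.util.HashSet;
import java.util.Set;
import redis.clients.jedis.HostAndPort;
import redis.clients.jedis.JedisCluster;

public class ClusterDemo {
    public static void main(String[] args) throws Exception {
        Set<HostAndPort> seed = new HashSet<>();
        seed.add(new HostAndPort("127.0.0.1", 7000));  // any reachable cluster node
        try (JedisCluster cluster = new JedisCluster(seed)) {
            cluster.set("commodity:12345", "orange");  // slot routing is internal
            System.out.println(cluster.get("commodity:12345"));
        }
    }
}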