Caches appear in many places in computing. For example, to alleviate the speed mismatch between the CPU and main memory, processors add L1, L2, and L3 caches: the CPU fetches instructions from the cache first, and only on a miss does it go to memory and update the cache. Thanks to the principle of locality, most accesses hit the cache.

At present, the core data of web applications — user information, order information, transaction information, and so on — is usually stored in a database. The database is also language-independent: programs written in Java, PHP, and other languages all talk to it through SQL, execute business logic, and present results to users. However, the database has some limitations: 1. Database connections are very "expensive" resources; connection pooling is commonly used to reuse them. 2. The number of connections in the pool is limited, so if there are too many users, requests must wait.

From the above, we can see that the database is often the bottleneck in a large system: we cannot afford to hit the database on every request, especially with many users and high concurrency. What is the usual approach to this situation? As the saying goes, every problem in computing can be solved by adding a layer of indirection. For the database bottleneck, we can add a layer between the application layer and the database layer: the cache layer.

How to implement caching

If you were the lead architect at a large company and the company needed to develop its own caching system, how would you design it? I think the following questions should be considered before designing:

  1. What format of data is stored in the cache?
  2. How does the application (client) access the cache?
  3. What if the application runs out of cache space?
  4. Do you want to support distributed storage (data sharding) and how?

1. What format of data is stored in the cache?

At present, common data formats include serialized objects, XML, JSON, (key, value) strings, and basic data structures. Among them, Java's built-in object serialization is language-specific, whereas Protobuf, developed by Google, is language-independent: for example, an object serialized by a Python program can be deserialized by a Java program.
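As a minimal sketch of the language-neutral option, the value can be serialized to a JSON string before it goes into the cache and parsed back on the way out; the `user` record and helper names here are illustrative, not part of any real caching API:

```python
import json

# Hypothetical user record to cache; JSON keeps it language-neutral,
# so a value written by a Python service can be read by a Java client.
user = {"id": 42, "name": "mukedada"}

def to_cache_value(obj):
    """Serialize an object to a JSON string before storing it in the cache."""
    return json.dumps(obj)

def from_cache_value(s):
    """Deserialize a JSON string fetched from the cache."""
    return json.loads(s)

stored = to_cache_value(user)
assert from_cache_value(stored) == user
```

The same round-trip works for any client language with a JSON library, which is the point of choosing a language-independent format.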

2. How does the application access the cache?

Since the company has many back-end teams using different programming languages, the caching system we develop must be language-neutral, so we need to design a wire protocol that every language can implement. How does a client use this protocol? The most common arrangement is the client/server model over sockets: the server listens for requests, the client sends a request — whose format is defined by the protocol — and receives a response. For example: set name mukedada, get name.
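A rough sketch of such a text protocol, with the socket transport left out so only the request encoding and server-side dispatch are shown; the command names follow the article's `set`/`get` example, and everything else (function names, the `\r\n` terminator) is an assumption:

```python
def encode_request(*parts):
    """Join command parts into one protocol line, e.g. ('set', 'name', 'mukedada')."""
    return " ".join(parts) + "\r\n"

def handle_request(store, line):
    """Dispatch one request line against an in-memory store (the server side)."""
    parts = line.strip().split()
    cmd = parts[0].lower()
    if cmd == "set":
        store[parts[1]] = parts[2]
        return "OK"
    if cmd == "get":
        return store.get(parts[1], "NOT_FOUND")
    return "ERR unknown command"

store = {}
handle_request(store, encode_request("set", "name", "mukedada"))
print(handle_request(store, encode_request("get", "name")))  # mukedada
```

Because the requests are plain text lines, a client in any language only needs string formatting and a socket to speak the protocol.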

3. What if the cache space is used up?

The cache server should set a maximum cache size at startup and evict entries with the LRU (least recently used) algorithm when the cache is full.
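A minimal LRU cache can be sketched with an ordered dictionary, where the insertion order doubles as the recency order; this is an illustration of the eviction policy, not the server's actual implementation:

```python
from collections import OrderedDict

class LRUCache:
    """Fixed-capacity cache; evicts the least recently used entry when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)         # mark as most recently used
        return self.data[key]

    def set(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the least recently used entry

cache = LRUCache(2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")     # touching "a" makes "b" the least recently used
cache.set("c", 3)  # over capacity: "b" is evicted
```

After the last line, `"b"` is gone while `"a"` and `"c"` remain, which is exactly the behaviour the server needs when its configured size is exceeded.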

4. Implement distributed storage

For a large application, a single cache server cannot support the load, so the cache tier must scale horizontally (adding servers before a traffic spike and removing them after the activity is over). Two questions follow: what algorithm distributes the data relatively evenly across the servers, and should that algorithm live on the client side or the server side?

  • Client-side implementation. Note that the client here is the web application service; it obtains the server list from a configuration file. When the number of nodes changes, every client must be notified.

A typical example is Memcached, which uses a consistent hash algorithm. Before introducing it, let's look at the remainder algorithm: for each (key, value) the user wants to store, the storage server is chosen by computing an integer hash of the key and taking it modulo the number of servers. This approach has a fatal problem: when the number of servers changes, the remainders change, and the data must be fetched from the database again. To deepen your understanding, a concrete example: suppose there are 3 servers numbered 0, 1, and 2, and key1 and key2 have hash values 100 and 99. Their remainders are 1 and 0, so they are stored on servers 1 and 0. If a new server 3 is added, the remainders change: 100 % 4 = 0 and 99 % 4 = 3, but the data is not on servers 0 and 3, so the lookups miss.
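The article's numbers can be replayed directly to show the remapping; the hash values 100 and 99 are taken from the example above:

```python
# Reproducing the article's example: two keys with hash values 100 and 99.
hashes = {"key1": 100, "key2": 99}

def server_for(h, n):
    """Remainder algorithm: pick a server by hash value mod server count."""
    return h % n

before = {k: server_for(h, 3) for k, h in hashes.items()}  # 3 servers
after = {k: server_for(h, 4) for k, h in hashes.items()}   # a 4th server added
print(before)  # {'key1': 1, 'key2': 0}
print(after)   # {'key1': 0, 'key2': 3} -- both keys moved
```

Adding one server moved every key, which is why the remainder algorithm forces a near-total cache reload on any resize.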

To solve the problem of the remainder algorithm, our predecessors proposed the distributed consistent hash algorithm. The idea is to minimize the impact when the number of servers changes: for example, when node5 is added, only the keys in one local segment of the ring are affected, whereas the remainder algorithm affects keys globally.

However, consistent hashing can still distribute data unevenly, leaving some servers with much more data than others. One remedy is virtual nodes: each physical server is represented by many virtual nodes spread around the ring. Memcached clients take this approach.
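A sketch of a hash ring with virtual nodes, assuming MD5 as the hash and 100 virtual points per server (both arbitrary choices for illustration, not Memcached's actual client code):

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Consistent hash ring with virtual nodes (an illustrative sketch)."""

    def __init__(self, nodes, replicas=100):
        self.replicas = replicas
        self.ring = []  # sorted list of (hash, node) pairs
        for node in nodes:
            self.add(node)

    def _hash(self, key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add(self, node):
        # Each physical node contributes `replicas` virtual points on the ring.
        for i in range(self.replicas):
            bisect.insort(self.ring, (self._hash(f"{node}#{i}"), node))

    def node_for(self, key):
        # A key belongs to the first virtual node clockwise from its position.
        h = self._hash(key)
        i = bisect.bisect(self.ring, (h, ""))
        if i == len(self.ring):
            i = 0  # wrap around the ring
        return self.ring[i][1]

ring = ConsistentHashRing(["node1", "node2", "node3"])
owner = ring.node_for("name")
```

The useful property is that when a node is added, the only keys that change owner are the ones it steals; every other key stays where it was.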

  • Proxy implementation. The proxy sits on the server side; typical examples are Twemproxy and Codis. The basic idea is that the application sends a request to the proxy, the proxy uses some algorithm to work out which node holds the data, fetches it, and returns the response to the client. To avoid a single point of failure, the proxy itself can be deployed as a highly available cluster.

  • Routing implementation. The typical example is Redis. The basic idea is that an application can send a request to any node: when the node holds the requested data, it responds directly; when it does not, it tells the client to redirect to another node for the data, and the client library must parse the corresponding instruction. For example, when node1 does not have the data, the client program is told to access node3. This is similar to an HTTP redirect on the web. The disadvantage is that node1 must know which data the other nodes hold, i.e. the nodes must communicate with each other.

    First, Redis Cluster has 16,384 hash slots, and each node manages a subset of them. For every request, the slot is computed as CRC16(key) % 16384, which yields a slot number between 0 and 16,383.
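The slot computation can be sketched in a few lines; `binascii.crc_hqx` computes the CRC-16/XMODEM checksum, which matches the CRC16 variant described in the Redis Cluster specification (check value 0x31C3 for b"123456789"), though this sketch ignores Redis's hash-tag rule for keys containing `{...}`:

```python
import binascii

def key_slot(key):
    """Map a key to one of the 16,384 Redis Cluster hash slots."""
    # crc_hqx is CRC-16/XMODEM, the variant the cluster spec uses.
    return binascii.crc_hqx(key.encode(), 0) % 16384

slot = key_slot("name")
assert 0 <= slot < 16384
```

Every client and node computes the same slot for the same key, so any node can tell whether a request belongs to it or must be redirected.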

However, every time a new node is added, it must take over hash slots from the original nodes, which involves data migration. What happens to user requests during the migration? A common workaround is to have node1 and node4 both hold the slot, but in different states: node1's slot is marked migrating, while node4's is marked importing. A request is sent to node1 first; if node1 still has the data, it responds directly, and otherwise the request is forwarded to node4. As shown in the figure below.
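The read path during migration reduces to a two-step lookup; this sketch models each node's slot contents as a plain dict and leaves out the actual migrating/importing state machine:

```python
def lookup_during_migration(node1_data, node4_data, key):
    """Serve a read while a slot migrates from node1 to node4.

    node1's copy of the slot is in the 'migrating' state and node4's is
    'importing': the request goes to node1 first, and only falls through
    to node4 when node1 no longer holds the key.
    """
    if key in node1_data:
        return node1_data[key]      # still on the source node
    return node4_data.get(key)      # already moved to the target node

node1 = {"a": 1}  # entry not yet migrated
node4 = {"b": 2}  # entry already migrated
assert lookup_during_migration(node1, node4, "a") == 1
assert lookup_during_migration(node1, node4, "b") == 2
```

Checking the source node first keeps reads correct no matter how far the migration has progressed.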

We also noticed that node1, node2, and the other nodes are each a single point of failure. To increase availability, each node runs in master-slave mode: data is written to the master first, and then there are two replication strategies. In the first, the result is returned to the client immediately and the master synchronizes the data to the slave afterwards; the response time is short, but the data may be inconsistent — if the master fails before the data reaches the slave, the data is lost. In the second, the master synchronizes the data to the slave before returning the result to the client; this guarantees strong consistency, but users wait longer.

Cache breakdown problem

If every user request misses the cache, each request falls through to the database. This is the cache breakdown problem: it defeats the cache and increases load on the system. To address it, for events such as Singles' Day, user information is loaded into the cache before the event starts.


Welcome to follow the WeChat public account Mukeda; all articles are synchronized there.