In-Depth Understanding of Cache Architecture in Distributed Systems (Part 2)

The previous article, “Understanding the Cache Architecture in Distributed Systems (PART 1)”, introduced the theory of caching in large distributed systems, common cache components, and application scenarios. This article covers the common problems in cache architecture design and their solutions, followed by an industry case.

Request flow in a layered cache architecture

The common problems are mainly:

Data consistency
Cache penetration
Cache avalanche
Cache high availability
Cache hotspots

The following sections describe each of these problems and its solutions.

Data consistency

Because the cache is a copy of persistent data, some data inconsistency is inevitable. It shows up as dirty reads or as data that cannot be read, and it is usually caused by network instability or node faults.

There are three common problem scenarios, each with its own solution:

Data consistency problem scenarios and solutions

Cache penetration

A cache generally stores data as key-value pairs. When a key is not in the cache, the request falls through to the database. If the key never exists, every request for it goes to the database, putting access pressure on the database.

Main solutions:

Cache empty results as well, and clear the cached entry once data exists for the key
Use a Bloom filter: build a large bitmap from the keys that exist, and filter out requests for keys that do not exist before they reach the database
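The two solutions above can be sketched together as follows. This is a minimal illustration rather than a production design: the `BloomFilter` and `PenetrationGuard` classes, their parameters, and the plain dicts standing in for the cache and the database are all assumptions made for the example. A real deployment would normally also give cached empty results a short expiry time.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: several hash positions over a fixed-size bit array."""

    def __init__(self, size_bits=1 << 16, num_hashes=4):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, key):
        # Derive each position by salting the key with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


_MISSING = object()  # sentinel cached for keys the database does not have


class PenetrationGuard:
    """Hypothetical read path combining a Bloom filter with empty-result caching."""

    def __init__(self, database):
        self.database = database  # stand-in for the real database (a dict)
        self.cache = {}           # stand-in for Redis/Memcached
        self.bloom = BloomFilter()
        self.db_queries = 0       # counts how often the database is hit
        for key in database:
            self.bloom.add(key)

    def get(self, key):
        if not self.bloom.might_contain(key):
            return None           # filtered out before cache or database
        if key in self.cache:
            value = self.cache[key]
            return None if value is _MISSING else value
        self.db_queries += 1
        value = self.database.get(key, _MISSING)
        self.cache[key] = value   # empty results are cached too
        return None if value is _MISSING else value

    def put(self, key, value):
        self.database[key] = value
        self.bloom.add(key)
        self.cache.pop(key, None)  # clear any cached empty result


guard = PenetrationGuard({"user:1": "alice"})
print(guard.get("user:1"), guard.get("user:999"), guard.db_queries)
```

The Bloom filter can report a key as present when it is not (a false positive), so the empty-result cache still matters as a second line of defense; it never reports an existing key as absent.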

Cache avalanche

Cache high availability

Whether a cache needs to be highly available depends on the actual scenario; not every service requires it. Design the solution around the specific service, for example by considering whether a cache failure would affect the back-end database.

Main solutions:

Distribution: spread a massive volume of cached data across multiple nodes
Replication: keep copies of the data so that cache nodes are highly available
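As a rough illustration of how distribution and replication combine, the sketch below shards keys across nodes by hash and writes each key to more than one node, so a read can fall back to a replica when a node is down. The class name, the `replicas` parameter, the `mark_down` helper, and the in-memory dicts standing in for cache nodes are assumptions for the example. It uses simple modulo placement, not consistent hashing, which real systems often prefer so that fewer keys move when nodes are added or removed.

```python
import hashlib


class ReplicatedShardedCache:
    """Hypothetical client that shards keys and keeps replicas on neighboring nodes."""

    def __init__(self, node_names, replicas=2):
        self.node_names = list(node_names)
        self.replicas = min(replicas, len(self.node_names))
        self.nodes = {name: {} for name in self.node_names}  # in-memory stand-ins
        self.down = set()                                    # nodes marked as failed

    def _owners(self, key):
        # Hash the key to a starting node, then take the following nodes as replicas.
        digest = hashlib.md5(key.encode()).digest()
        start = int.from_bytes(digest[:4], "big") % len(self.node_names)
        return [self.node_names[(start + i) % len(self.node_names)]
                for i in range(self.replicas)]

    def set(self, key, value):
        # Write to every owner that is currently reachable.
        for name in self._owners(key):
            if name not in self.down:
                self.nodes[name][key] = value

    def get(self, key):
        # Read from the first reachable owner that holds the key.
        for name in self._owners(key):
            if name not in self.down and key in self.nodes[name]:
                return self.nodes[name][key]
        return None

    def mark_down(self, name):
        self.down.add(name)


cache = ReplicatedShardedCache(["node-a", "node-b", "node-c"], replicas=2)
cache.set("feed:42", "cached feed")
cache.mark_down(cache._owners("feed:42")[0])  # the primary fails
print(cache.get("feed:42"))                   # still served by the replica
```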

Cache hotspots

Many requests may concurrently access the same piece of hot cached data, putting excessive pressure on the cache server that holds it.

Solution: keep multiple copies of the hot data and distribute requests across several cache servers, reducing the pressure that a hotspot places on any single cache server.
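One way to picture the multi-copy approach is sketched below, assuming a hot key is written under several suffixed names, one per server, and each read picks a copy at random. The `HotKeyReplicator` class, the `#copy` suffix, and the dicts standing in for cache servers are hypothetical choices for illustration only.

```python
import random


class HotKeyReplicator:
    """Hypothetical helper that spreads one hot key over several cache servers."""

    def __init__(self, servers, copies=3, rng=None):
        self.servers = servers            # list of dicts standing in for cache servers
        self.copies = copies
        self.rng = rng or random.Random()

    def _copy_key(self, key, index):
        return f"{key}#copy{index}"

    def _server_for(self, index):
        # Copy i lives on server i modulo the number of servers.
        return self.servers[index % len(self.servers)]

    def set_hot(self, key, value):
        # Write every copy so that any of them can serve a read.
        for i in range(self.copies):
            self._server_for(i)[self._copy_key(key, i)] = value

    def get_hot(self, key):
        # Pick one copy at random, so reads spread across servers.
        i = self.rng.randrange(self.copies)
        return self._server_for(i).get(self._copy_key(key, i))


servers = [{}, {}, {}]
replicator = HotKeyReplicator(servers, copies=3)
replicator.set_hot("feed:trending", "payload")
print(replicator.get_hot("feed:trending"))
```

The trade-off is that every update to the hot data must refresh all copies, so this suits data that is read far more often than it is written.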

The case below is drawn mainly from Chen Bo’s technical talk on Sina Weibo.

Technical challenges

Feed cache architecture diagram

Architectural features

Sina Weibo applies SSDs in distributed cache scenarios, extending the traditional Redis/MC + MySQL model to Redis/MC + SSD Cache + MySQL. The SSD Cache serves as an L2 cache. This first addresses the high cost and small capacity of MC/Redis, and it also relieves the database access pressure caused by requests penetrating to the DB.

Optimizations and enhancements were made mainly in data architecture, performance, storage cost, and service orientation.
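The layered read path described above can be sketched roughly as follows. This is not Weibo's implementation: the `TieredCache` class, its capacity parameter, the promotion and eviction rules, and the dicts standing in for Redis/MC, the SSD cache, and MySQL are all assumptions chosen to show the idea of an L2 tier absorbing misses from a small L1.

```python
class TieredCache:
    """Hypothetical L1 (memory) + L2 (SSD) + database read path."""

    def __init__(self, database, l1_capacity=2):
        self.l1 = {}                  # small, fast tier (stand-in for Redis/MC)
        self.l2 = {}                  # larger, cheaper tier (stand-in for the SSD cache)
        self.database = database      # stand-in for MySQL
        self.l1_capacity = l1_capacity
        self.db_reads = 0             # counts reads that penetrate to the database

    def get(self, key):
        if key in self.l1:
            return self.l1[key]
        if key in self.l2:
            value = self.l2[key]      # L1 miss absorbed by L2
        else:
            self.db_reads += 1
            value = self.database.get(key)
            if value is None:
                return None
            self.l2[key] = value      # fill L2 on the way back
        self._promote(key, value)
        return value

    def _promote(self, key, value):
        # Evict the oldest-inserted L1 entry when full (dicts keep insertion order).
        if len(self.l1) >= self.l1_capacity:
            del self.l1[next(iter(self.l1))]
        self.l1[key] = value


cache = TieredCache({"a": 1, "b": 2, "c": 3}, l1_capacity=2)
for key in ("a", "b", "c", "a"):
    cache.get(key)
print(cache.db_reads)  # the second read of "a" is served from L2, not the database
```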

References:

Learning Architecture from Scratch, by Li Yunhua (Alibaba)

36 Lectures on Java Core Technology, by Yang Xiaofeng (Oracle)

Design Practice of the Weibo Cache Architecture, by Chen Bo

Best Practices for Caching in Large Distributed Systems, by Hou Zhonghao

Caching and the Pitfalls of Concurrent Updates, by Shen Jian (58)

Distributed Cache Design, by crossoverJie

Reposted from:

https://juejin.im/post/5b45cf3fe51d4519721b7974

