The previous article, “Understanding the Cache Architecture in Distributed Systems (PART 1)”, introduced the theory of caching in large distributed systems, common cache components, and application scenarios. This article covers the common problems in cache architecture design and their solutions, along with an industry case.
Common problems mainly include:
- Data consistency
- Cache penetration
- Cache avalanche
- Cache high availability
- Cache hotspots

The following describes each of these problems and their solutions.
Data consistency
Because the cache holds a copy of persistent data, some inconsistency is inevitable. It shows up as dirty reads or as data that cannot be read, and is usually caused by network instability or node failures.
There are three common scenarios for problems and their solutions:
Cache penetration
A cache generally stores data as key-value pairs. When a key is not found in the cache, the database is queried instead. If the key never exists, every request for it falls through to the database, which puts access pressure on the database.
Main solutions:
- Cache empty results as well, and clear the cached entry once data exists for the key
- Filter out keys that do not exist: use a Bloom filter to build a large bitmap and check each key against it
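The two solutions above can be combined in one read path. The following is a minimal in-memory sketch, not a production design: `BloomFilter`, `PenetrationGuardedCache`, the `_MISSING` sentinel, and the `db_queries` counter are all hypothetical names introduced for illustration, and a plain `dict` stands in for the database.

```python
import hashlib


class BloomFilter:
    """Bitmap-backed Bloom filter: false positives are possible, false negatives are not."""

    def __init__(self, size_bits=1 << 16, num_hashes=4):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bitmap = bytearray(size_bits // 8 + 1)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting a SHA-256 hash of the key.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bitmap[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        return all(self.bitmap[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


_MISSING = object()  # sentinel cached for keys that have no data in the database


class PenetrationGuardedCache:
    def __init__(self, database):
        self.database = database  # assumption: a dict standing in for the database
        self.cache = {}
        self.db_queries = 0       # counts how often the database is actually hit
        self.bloom = BloomFilter()
        for key in database:
            self.bloom.add(key)

    def get(self, key):
        if not self.bloom.might_contain(key):
            return None           # the key definitely does not exist
        if key in self.cache:
            value = self.cache[key]
            return None if value is _MISSING else value
        self.db_queries += 1
        value = self.database.get(key, _MISSING)
        self.cache[key] = value   # empty results are cached too
        return None if value is _MISSING else value

    def put(self, key, value):
        self.database[key] = value
        self.bloom.add(key)
        self.cache.pop(key, None)  # clear any cached empty result for this key
```

The Bloom filter rejects most nonexistent keys before they reach the cache or the database; the cached sentinel covers the false positives that slip through. In a real deployment the cached empty result would normally carry a short expiry time, which this sketch leaves out.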
Cache avalanche
Cache high availability
How much cache availability is needed depends on the actual scenario. Not every service requires a highly available cache. Design the solution around the specific service, for example by checking whether the back-end database is affected.
Main solutions:
- Distribution: spread massive amounts of cached data across multiple nodes
- Replication: keep copies of cached data so that cache data nodes are highly available
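As a rough illustration of how distribution and replication fit together, the sketch below shards keys across several primary/replica pairs and falls back to the replica when the primary is down. `CacheNode`, `ShardedReplicatedCache`, and the `alive` flag are hypothetical names for this example; real cache clusters handle failover and resynchronization in far more detail.

```python
import hashlib


class CacheNode:
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.alive = True  # flipped to False to simulate a node fault

    def get(self, key):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return self.data.get(key)

    def set(self, key, value):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        self.data[key] = value


class ShardedReplicatedCache:
    def __init__(self, num_shards=3):
        # Distribution: each shard holds a slice of the key space.
        # Replication: each shard is a (primary, replica) pair.
        self.shards = [
            (CacheNode(f"shard{i}-primary"), CacheNode(f"shard{i}-replica"))
            for i in range(num_shards)
        ]

    def _shard_for(self, key):
        digest = hashlib.md5(key.encode()).digest()
        return self.shards[int.from_bytes(digest[:4], "big") % len(self.shards)]

    def set(self, key, value):
        for node in self._shard_for(key):
            try:
                node.set(key, value)
            except ConnectionError:
                pass  # a real system would repair the lagging copy later

    def get(self, key):
        # Try the primary first, then the replica.
        for node in self._shard_for(key):
            try:
                return node.get(key)
            except ConnectionError:
                continue
        return None  # both copies unavailable: the caller falls back to the database
```

Simple modulo sharding is used here only to keep the sketch short; it remaps most keys whenever the number of shards changes.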
Cache hotspots
Hot data can attract many concurrent requests for the same cache entry, putting excessive pressure on the cache server that holds it.
Solution: keep multiple copies of the hot data and distribute requests across several cache servers, reducing the pressure that a hotspot puts on any single cache server.
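One way to picture the multi-copy approach is the sketch below, which writes a hot key to several servers and lets each read pick one copy at random. `HotKeyCache`, the `hot` flag, and the per-server `reads` counters are assumptions made for the example; how hot keys are detected is outside its scope.

```python
import random
import zlib


class HotKeyCache:
    def __init__(self, num_servers=4, copies=4):
        self.servers = [{} for _ in range(num_servers)]  # dicts stand in for cache servers
        self.copies = min(copies, num_servers)           # at most one copy per server
        self.hot_keys = set()
        self.reads = [0] * num_servers                   # read count per server

    def _home(self, key):
        return zlib.crc32(key.encode()) % len(self.servers)

    def _server_index(self, key, copy):
        # Copy i of a key lives i servers after its home server,
        # so the copies of one key always land on distinct servers.
        return (self._home(key) + copy) % len(self.servers)

    def _copy_count(self, key):
        return self.copies if key in self.hot_keys else 1

    def set(self, key, value, hot=False):
        if hot:
            self.hot_keys.add(key)
        for copy in range(self._copy_count(key)):
            self.servers[self._server_index(key, copy)][key] = value

    def get(self, key):
        # Reads of a hot key are spread across its copies at random.
        copy = random.randrange(self._copy_count(key))
        index = self._server_index(key, copy)
        self.reads[index] += 1
        return self.servers[index].get(key)
```

Ordinary keys keep a single copy on their home server, so the extra memory and write cost is paid only for keys marked as hot.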
The industry case is based mainly on Chen Bo’s technical talk about Sina Weibo.
Technical challenges
Feed cache architecture diagram
Architectural features
Sina Weibo uses SSDs in its distributed cache. The traditional Redis/MC + MySQL setup is extended to Redis/MC + SSD Cache + MySQL, with the SSD cache serving as an L2 cache. This addresses the high cost and small capacity of MC/Redis, and also relieves the database access pressure caused by requests penetrating to the DB.
The optimizations and enhancements are mainly in data architecture, performance, storage cost, and service orientation.
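The layered read path can be sketched as follows. This is only an illustration of the L1 + L2 + database idea, not Weibo's implementation: `LRUTier`, `TieredCache`, the capacities, and the choice to fill L2 on a database read are all assumptions, with in-memory structures standing in for Redis/MC, the SSD cache, and MySQL.

```python
from collections import OrderedDict


class LRUTier:
    """A fixed-capacity cache tier with least-recently-used eviction."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used entry


class TieredCache:
    def __init__(self, database, l1_capacity=2, l2_capacity=100):
        self.l1 = LRUTier(l1_capacity)  # small and expensive, like Redis/MC
        self.l2 = LRUTier(l2_capacity)  # larger and cheaper, like an SSD cache
        self.database = database        # assumption: a dict standing in for MySQL
        self.db_reads = 0

    def get(self, key):
        value = self.l1.get(key)
        if value is not None:
            return value
        value = self.l2.get(key)
        if value is None:
            self.db_reads += 1          # only a miss in both tiers reaches the database
            value = self.database.get(key)
            if value is None:
                return None
            self.l2.put(key, value)
        self.l1.put(key, value)         # promote to L1 for subsequent reads
        return value
```

With a small L1 and a much larger L2, a key evicted from L1 is still served from L2 on the next read, so the database sees only requests that miss both tiers.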
References:
Learning Architecture from Scratch, Li Yunhua (Alibaba)
Java Core Technology, 36 Lectures, Yang Xiaofeng (Oracle)
Design Practice of Microblog Cache Architecture, Chen Bo
The Best Application of Caching in Large Distributed Systems, Hou Zhonghao
Cache and the Concurrent Update Pitfall?, Shen Jian (58)
Distributed Cache Design, crossoverJie
Reprint notice:
https://juej