A high-concurrency system cannot do without caching, and making greater use of local caches is a natural way to improve throughput and stability. **The biggest difficulty is keeping distributed local cache data real-time and consistent.** Otherwise, local caches cannot be used more widely for frequently changing data.
There is no perfect solution, only a more suitable one. This article describes in detail the distributed real-time local cache scheme used in iQiyi's TV backend, as a reference for solving high-concurrency problems.
I. Background
Most Internet systems today are read-heavy and write-light. Such systems are usually split into a read system and a write system to improve stability and throughput. The caching discussed here is aimed mainly at the read system.
iQiyi has a large amount of copyright metadata that must be returned as basic data in different business scenarios. To speed up business iteration and reduce system coupling, **we split the assembly of this basic data into an independent service. The service encapsulates both the generic and the personalization logic, and this microservice is called wherever the basic data is used.** Each record ranges from a few kilobytes to tens of kilobytes, and the number of records keeps growing. The service acts as the base service for every sub-business service.
How do we solve the high-concurrency problems this service creates for the centralized cache, which must support millions of QPS at peak times? **Examples include intranet bandwidth consumption on the centralized cache and timeouts when the cache network fails.** What follows is one of many possible solutions.
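To make the bandwidth concern concrete, here is a rough back-of-the-envelope calculation. The figures are illustrative assumptions (1,000,000 QPS and 10 KB per record), not measured iQiyi numbers, and the helper name `required_bandwidth_gbps` is hypothetical.

```python
def required_bandwidth_gbps(qps: int, payload_bytes: int) -> float:
    """Payload bandwidth in gigabits per second (1 Gbit = 10**9 bits)."""
    return qps * payload_bytes * 8 / 1e9


# Assumed figures for illustration only.
PEAK_QPS = 1_000_000        # "millions of QPS" at peak, lower bound
RECORD_BYTES = 10 * 1024    # 10 KB, within "a few KB to tens of KB"

print(f"{required_bandwidth_gbps(PEAK_QPS, RECORD_BYTES):.1f} Gbit/s")
```

Under these assumptions the centralized cache would have to serve roughly 82 Gbit/s of payload alone if every read reached it, which is why serving hot records from local memory is attractive.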
II. Approach and Solution
First, compare the advantages and disadvantages of a local cache and a centralized cache:
1. Local cache
Its advantages are:
(1) Hotspot cache: scaling out an instance is equivalent to scaling out the hotspot data store;
(2) High hit rate;
(3) Supports expiration policies;
(4) Fast business logic execution with low machine overhead;
(5) Strong resilience to failures.
Its disadvantages are:
(1) Usually a passive cache, so real-time freshness is poor;
(2) Limited storage capacity: typically 2 GB to 4 GB is allocated, which is enough for the current hotspot data in most scenarios.
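The properties listed above (expiration policy, bounded capacity, hit rate) can be sketched as a small in-process cache. This is a minimal illustration, not the implementation described in the article: the class name `LocalCache` is hypothetical, and capacity is bounded by entry count rather than by bytes to keep the example short.

```python
import time
from collections import OrderedDict


class LocalCache:
    """Hypothetical in-process cache with TTL expiry and LRU eviction."""

    def __init__(self, max_entries, ttl_seconds, clock=time.monotonic):
        self._data = OrderedDict()  # key -> (value, expires_at)
        self._max = max_entries     # assumption: bound by count, not bytes
        self._ttl = ttl_seconds
        self._clock = clock
        self.hits = 0
        self.misses = 0

    def get(self, key):
        entry = self._data.get(key)
        if entry is not None and entry[1] > self._clock():
            self._data.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return entry[0]
        self._data.pop(key, None)        # drop the expired entry, if any
        self.misses += 1
        return None

    def put(self, key, value):
        self._data[key] = (value, self._clock() + self._ttl)
        self._data.move_to_end(key)
        while len(self._data) > self._max:
            self._data.popitem(last=False)  # evict least recently used

    def invalidate(self, key):
        self._data.pop(key, None)

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

Because entries are refreshed only when they expire or are evicted, a cache like this is passive: a change at the data source is not visible until the TTL runs out, which is the real-time weakness noted above.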
2. Centralized cache
Its advantages are:
(1) Cache updates are easy to apply in real time;
(2) Strong cache consistency.
Its disadvantages are:
(1) The cluster grows very large, and services depend on it too heavily;
(2) Under high concurrency, the I/O load is too heavy;
(3) Vulnerable to network jitter between the application machine and the cache machine;
(4) When traffic to hotspot keys is too high, bandwidth is easily saturated, and multiple cache clusters are needed to solve the problem.
Weighing these advantages and disadvantages, **most people use a local hotspot cache.** About 4 GB of local storage generally covers the hotspot data of common services, but the real-time freshness of a local cache is poor. How can this be solved? The solution is as follows:
3. Solution
**A unified message mechanism triggers real-time updates of the local cache. A message filtering mechanism is also provided so that the business side can handle messages with its own logic, which allows customized updates of the local cache.** The scheme is structured as follows:
4. Scheme description
**(1) Management console:** Manages all application instances that use the local cache, together with their cache policies;
**(2) Data change:** The source of data changes;
**(3) Message bus:** The collection and distribution center for messages;
**(4) Business filter:** Lets the business side process certain messages itself;
**(5) Monitoring and statistics:** Uses iQiyi's unified log collection system, which supports statistical analysis of hotspot data and provides data support for other hotspot schemes. Unified monitoring tracks indicators such as the cache hit rate of each instance.
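The interaction between the data change source, the message bus, the business filter, and each instance's local cache can be sketched in-process as follows. This is a simplified illustration under stated assumptions: `ChangeMessage`, `MessageBus`, and `CacheInstance` are hypothetical names, a real deployment would use a networked message queue rather than direct function calls, and the choice to refresh only keys that are already cached is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class ChangeMessage:
    key: str
    kind: str                       # assumed values: "update" or "delete"
    payload: Optional[dict] = None


class MessageBus:
    """Collects change messages and distributes them to every subscriber."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[ChangeMessage], None]] = []

    def subscribe(self, handler: Callable[[ChangeMessage], None]) -> None:
        self._subscribers.append(handler)

    def publish(self, message: ChangeMessage) -> None:
        for handler in self._subscribers:
            handler(message)


class CacheInstance:
    """One application instance holding its own local cache."""

    def __init__(self, name: str,
                 business_filter: Optional[Callable[[ChangeMessage], bool]] = None) -> None:
        self.name = name
        self.store: Dict[str, dict] = {}
        # The business filter decides which messages this service cares about.
        self._filter = business_filter or (lambda message: True)

    def on_message(self, message: ChangeMessage) -> None:
        if not self._filter(message):
            return
        if message.kind == "delete" or message.payload is None:
            self.store.pop(message.key, None)
        elif message.key in self.store:
            # Assumption: refresh only keys already cached, so cold data
            # does not consume local memory.
            self.store[message.key] = message.payload


bus = MessageBus()
tv = CacheInstance("tv", business_filter=lambda m: m.key.startswith("album:"))
bus.subscribe(tv.on_message)

tv.store["album:1"] = {"title": "old"}
bus.publish(ChangeMessage("album:1", "update", {"title": "new"}))
bus.publish(ChangeMessage("user:9", "update", {"name": "ignored"}))
```

After the two messages are published, the instance holds the refreshed `album:1` record, while the `user:9` message is dropped by the filter.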
III. Extension
If the local cache hit rate of a cluster falls below the acceptable threshold (for example, 70%) and local cache storage cannot be expanded because of memory limits, you can split the cluster into lightweight logical data shards to improve the hit rate.
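One way to picture logical sharding is a stable hash that routes each key to a fixed subgroup of instances, so every subgroup caches only its share of the key space. The sketch below is an assumption about how such routing could work, not the article's actual mechanism; `shard_for` is a hypothetical helper.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index using a hash that is stable across processes."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


# Each shard sees only part of the key space, so the same amount of
# local memory covers a larger fraction of that shard's hot data.
keys = [f"album:{i}" for i in range(1000)]
per_shard = {}
for key in keys:
    per_shard.setdefault(shard_for(key, 4), []).append(key)
```

A hash from `hashlib` is used instead of Python's built-in `hash()` because the built-in string hash is randomized per process, which would route the same key to different shards on different machines.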
IV. Summary of Results
(1) Effectively reduces the risk of a cluster avalanche;
(2) Solves the problem of highly concurrent reads;
(3) Reduces the hotspot-data traffic that crosses the network to the centralized cache, lightening its load.