An overview of LRU

LRU stands for Least Recently Used. Its theoretical basis is the principle of locality: data that has been accessed recently is likely to be accessed again in the near future, while data that has not been accessed for a long time will probably remain unused for a long time. Because this assumption fits many business scenarios and solves many problems in real-world development, we often build caches around the LRU idea; such a cache is commonly called an LRU cache, or the LRU caching mechanism.

The principle

When implementing LRU, we need to pay attention to read and write performance. An ideal LRU cache should be able to read or update a piece of data in O(1) time; that is, both reads and writes have O(1) time complexity.

At this point, it’s easy to think of a hash table, which can access data by key in O(1) time. However, a hash table alone cannot update the cache in O(1) time, because eviction requires finding the entry with the earliest access time, which means traversing the whole cache.

Therefore, we need a data structure that keeps data ordered by access time and still supports random access in constant time.

This can be done with a hash table plus a doubly linked list:

  • The hash table guarantees O(1) access to data by key.
  • The doubly linked list keeps the entries ordered by access time.

The reason for choosing a doubly linked list rather than a singly linked list is that a doubly linked list can be modified at any intermediate node without traversing from the head.
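As a small sketch of that point (the names here are illustrative, not taken from the solution further below): with both prev and next pointers, a node can be unlinked given only a pointer to itself.

struct ListNode {
    int key, value;
    ListNode *prev, *next;  // the prev pointer is what a singly linked list lacks
};

// Both neighbors are reachable from the node itself, so it can be
// unlinked in O(1) without scanning the list from the head.
void unlink(ListNode *node) {
    node->prev->next = node->next;
    node->next->prev = node->prev;
}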

An illustrated example

Suppose we design an LRU cache of size 4.

1. Add four elements first

The top row shows the keys of the hash table; we add k1, k2, k3, and k4. The bottom row shows the nodes of the doubly linked list, each holding a (key, value) pair: N1, N2, N3, N4. The most recently added node sits at the head of the list, indicating recent use.

After we add the fourth element, the list from head to tail is N4, N3, N2, N1.

2. Add k5

We put k5 into the hash table and insert N5 at the head of the list. Since the hash table now exceeds the capacity of 4, we must delete one element from the tail of the list, which represents the least recently used entry. At the same time, we remove k1, the key corresponding to the tail node N1, from the hash table.

3. Visit k3

Since k3 was just accessed, we remove N3 from its current position and insert it at the head of the list.

Conclusion:

  • Each new element is added at the head of the list.
  • Evictions remove the element at the tail of the list.
  • When a node is evicted, the corresponding key is also deleted from the hash table.
  • A revisited element is moved to the head of the list.

Leetcode

Leetcode has the classic LRU Cache problem, which is not difficult, and it is also a question I often ask when interviewing candidates. It is a simplified version of an LRU implementation and works well as an introductory exercise.

Leetcode.com/problems/lr…

146. LRU caching mechanism

Using your knowledge of data structures, design and implement an LRU (Least Recently Used) cache. Implement the LRUCache class:

  • LRUCache(int capacity) initializes the LRU cache with a positive integer capacity.
  • int get(int key) returns the value of the key if it exists in the cache; otherwise it returns -1.
  • void put(int key, int value) updates the value if the key already exists; if the key does not exist, it inserts the key-value pair. When the cache reaches its maximum capacity, it should evict the least recently used entry before writing the new data, to make room.

Advanced: Can you do both operations in O(1) time?

Answer (C++):

#include <iostream>
#include <unordered_map>
using namespace std;

class LRUCache {
public:
    // Doubly linked list node holding a (key, value) pair.
    struct Node {
        Node(int key, int val) {
            key_ = key;
            val_ = val;
            left_ = right_ = nullptr;
        }
        int key_, val_;
        Node *left_, *right_;
    };

    LRUCache(int capacity) {
        capacity_ = capacity;
        // Sentinel head and tail nodes simplify insertion and removal.
        head_ = new Node(-1, -1);
        tail_ = new Node(-1, -1);
        head_->right_ = tail_;
        tail_->left_ = head_;
    }

    ~LRUCache() {
        Node *node = head_;
        while (node != nullptr) {
            Node *next = node->right_;
            delete node;
            node = next;
        }
    }

    int get(int key) {
        if (hash_.count(key) == 0) return -1;
        Node *node = hash_[key];
        // Move the accessed node to the head unless it is already there.
        if (node->left_ != head_) {
            unlink(node);
            insertToHead(node);
        }
        return node->val_;
    }

    void put(int key, int value) {
        if (hash_.count(key) == 0) {
            // Cache full: evict the least recently used node at the tail.
            if ((int)hash_.size() >= capacity_) {
                Node *node = tail_->left_;
                unlink(node);
                hash_.erase(node->key_);
                delete node;
            }
            Node *node = new Node(key, value);
            hash_[key] = node;
            insertToHead(node);
        } else {
            // Key exists: update the value and move the node to the head.
            Node *node = hash_[key];
            node->val_ = value;
            if (node->left_ != head_) {
                unlink(node);
                insertToHead(node);
            }
        }
    }

    // Insert a node right after the sentinel head.
    void insertToHead(Node *node) {
        node->right_ = head_->right_;
        head_->right_ = node;
        node->left_ = head_;
        node->right_->left_ = node;
    }

    // Remove a node from its current position in the list.
    void unlink(Node *node) {
        node->left_->right_ = node->right_;
        node->right_->left_ = node->left_;
    }

private:
    int capacity_;
    unordered_map<int, Node*> hash_;
    Node *head_, *tail_;
};
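A minimal usage sketch, appended after the class above (the expected outputs follow the official LeetCode 146 example; the main function itself is just for illustration):

int main() {
    LRUCache cache(2);
    cache.put(1, 1);
    cache.put(2, 2);
    cout << cache.get(1) << endl;  // 1
    cache.put(3, 3);               // evicts key 2
    cout << cache.get(2) << endl;  // -1 (not found)
    cache.put(4, 4);               // evicts key 1
    cout << cache.get(1) << endl;  // -1 (not found)
    cout << cache.get(3) << endl;  // 3
    cout << cache.get(4) << endl;  // 4
    return 0;
}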

Open Source project applications

Here I describe how LRU is used in several common open-source projects. I will keep each summary as short as possible while still including the key information.

The LRU eviction strategy in Redis

When Redis is used as a cache, memory consumption needs to be considered in some scenarios. Redis offers several eviction policies, two of which are based on LRU:

  • allkeys-lru: apply LRU eviction across all keys.
  • volatile-lru: apply LRU eviction only to keys that have an expiration time set.

As described above, a full LRU implementation needs a doubly linked list to record the order in which data was most recently accessed; to save memory, however, Redis does not implement LRU exactly.

Instead of always evicting the single least recently used key, Redis runs an approximate LRU algorithm: it samples a small number of keys and evicts the least recently used key among the samples. The accuracy of the algorithm can be tuned via maxmemory-samples. Redis 3.0 also introduced an eviction pool, which is essentially a max-heap ordered by idle time: newly sampled keys are added to the pool, and the keys idle the longest are evicted from it.
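A rough sketch of the sampling idea (this is not Redis source code, just a hypothetical illustration of sampling a few keys and evicting the least recently used among them):

#include <cstdint>
#include <iterator>
#include <random>
#include <string>
#include <unordered_map>

// Each cached entry remembers when it was last accessed (a logical clock).
struct Entry {
    std::string value;
    uint64_t last_access;
};

// Sample `samples` random entries (playing the role of maxmemory-samples,
// default 5 in Redis) and return the least recently used one among them.
std::string pickEvictionVictim(const std::unordered_map<std::string, Entry>& dict,
                               int samples) {
    static std::mt19937 rng{std::random_device{}()};
    std::string victim;
    uint64_t oldest = UINT64_MAX;
    for (int i = 0; i < samples && !dict.empty(); ++i) {
        auto it = dict.begin();
        std::advance(it, rng() % dict.size());  // random entry (O(n) walk; fine for a sketch)
        if (it->second.last_access < oldest) {
            oldest = it->second.last_access;
            victim = it->first;
        }
    }
    return victim;  // the key to evict
}

The trade-off is clear: the larger the sample, the closer the result is to true LRU, at the cost of more work per eviction.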

A separate blog post could be devoted to analyzing Redis eviction policies and the LRU source code, so I will not go into more detail here.

The LRU algorithm in the MySQL InnoDB buffer pool

The InnoDB buffer pool is, simply put, an area of memory where InnoDB caches the data and index pages stored on disk. The buffer pool serves two purposes: it improves the efficiency of bulk read operations, and it improves the efficiency of cache management. When MySQL's InnoDB engine loads data pages from disk into the buffer pool, it usually also loads the pages adjacent to the target pages; this is MySQL's read-ahead (prefetch) mechanism. Why does MySQL have read-ahead? Simply to improve performance.

On the other hand, the pre-read pages may turn out to be almost never accessed; yet under a naive LRU, those pages would sit near the head of the list after being pre-read, and some pages that are actually accessed frequently might be evicted instead.

So MySQL adopts an LRU strategy based on hot/cold separation. The LRU list is split into two parts: one stores hot data and the other stores cold data. When a page is first loaded into the buffer pool, it is placed at the head of the cold section. Only if it is accessed again after innodb_old_blocks_time (default 1s) is the page moved to the head of the hot section.
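A simplified sketch of that midpoint idea (purely illustrative; InnoDB's real buffer pool is far more involved, and eviction from the cold tail is omitted here for brevity). The hot and cold regions are modeled as two lists:

#include <chrono>
#include <list>
#include <unordered_map>

class MidpointLRU {
public:
    explicit MidpointLRU(std::chrono::milliseconds promote_after)
        : promote_after_(promote_after) {}

    void access(int page_id) {
        auto now = std::chrono::steady_clock::now();
        auto it = index_.find(page_id);
        if (it == index_.end()) {
            // First load (e.g. via read-ahead): enter at the head of the
            // cold region, not the hot head.
            cold_.push_front({page_id, false, now});
            index_[page_id] = cold_.begin();
            return;
        }
        if (it->second->hot) {
            // Already hot: just move to the front of the hot region.
            hot_.splice(hot_.begin(), hot_, it->second);
        } else if (now - it->second->first_seen >= promote_after_) {
            // Survived the time window (akin to innodb_old_blocks_time): promote.
            it->second->hot = true;
            hot_.splice(hot_.begin(), cold_, it->second);
        }
    }

private:
    struct Page {
        int id;
        bool hot;
        std::chrono::steady_clock::time_point first_seen;
    };
    std::chrono::milliseconds promote_after_;
    std::list<Page> hot_, cold_;
    std::unordered_map<int, std::list<Page>::iterator> index_;
};

This way, a burst of pre-read pages can only churn the cold region; frequently accessed pages in the hot region stay protected.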

Actual business usage

In real business systems, LRU caches are mainly used for hotspot data, to reduce the pressure on the backend database or the Redis cache.

E-commerce big promotions

During big promotions, flash-sale (seckill) items or popular shops generate hotspot traffic.

We use LRU caching in real business mainly to solve the problem of hotspot data access.

Some of our data is stored in MySQL, with a Redis cluster in front of it as a cache. Usually, thanks to the high hit ratio, the read QPS reaching MySQL is relatively low and everything runs smoothly.

However, during some promotional events (such as Double 11), the page views of certain popular shops become extremely high, producing hotspot data; the CPU utilization of several Redis nodes shot straight up to 100%. So for hotspot data, we route it through a local LRU cache, which greatly relieves the pressure on Redis and MySQL.

There are two ways to identify hot data:

  • We can set rules on the server side: if the requested data matches certain rules, it is considered hot data.
  • We can also add a flag to an API request telling the server to treat the requested data as hotspot data; for example, a request from a flash-sale page automatically carries this flag.

When a request arrives, we check whether it carries the hotspot flag. If it is hotspot data, we first query the local LRU cache and return on a hit. On a miss, we fall back to Redis and write the result into the LRU cache. If Redis also misses, we read from MySQL and write the result into Redis. This layered approach greatly reduces the pressure on Redis and MySQL.
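A sketch of that lookup flow (all the client calls below are hypothetical stubs standing in for a real local LRU, a Redis client, and a MySQL client):

#include <optional>
#include <string>

// Hypothetical stubs for the three layers; real implementations would wrap
// an in-process LRU cache (such as the LRUCache class above, adapted to
// string keys/values), a Redis client, and a MySQL client.
std::optional<std::string> lruGet(const std::string&) { return std::nullopt; }
void lruPut(const std::string&, const std::string&) {}
std::optional<std::string> redisGet(const std::string&) { return std::nullopt; }
void redisSet(const std::string&, const std::string&) {}
std::optional<std::string> mysqlQuery(const std::string&) { return std::nullopt; }

// Layered lookup: local LRU -> Redis -> MySQL, backfilling each layer on a hit.
std::optional<std::string> fetch(const std::string& key, bool is_hot) {
    if (is_hot) {
        if (auto v = lruGet(key)) return v;  // 1. local LRU hit
    }
    if (auto v = redisGet(key)) {            // 2. Redis hit
        if (is_hot) lruPut(key, *v);         //    backfill the local LRU
        return v;
    }
    if (auto v = mysqlQuery(key)) {          // 3. MySQL hit
        redisSet(key, *v);                   //    backfill Redis
        if (is_hot) lruPut(key, *v);
        return v;
    }
    return std::nullopt;                     // not found anywhere
}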

Weibo hot search

A similar scenario is Weibo's trending searches. Trending topics are also bursty hotspot data with huge traffic. We often hear that celebrity gossip has brought Weibo down outright; that is the risk of not handling hotspot data properly. Generally, requests coming from trending pages, or for events marked as hot by the server, are treated as hotspot data and served from an LRU cache.

Conclusion

LRU is a practical algorithm for production systems that must use memory efficiently: it makes a deliberate trade-off by evicting the least recently accessed data first. It is a very useful algorithm, and also one that comes up frequently in job interviews, so it is well worth learning.


Welcome to follow my WeChat official account [Miannong in Singapore] for more technical content.