This is Why's 81st original article


Have you ever run into LRU in an interview?

LRU is short for Least Recently Used.

In other words, it is the "least recently used" eviction algorithm.

The idea is that if a piece of data has not been accessed recently, it is unlikely to be accessed in the near future. So when the available space is full, the data that has gone unaccessed the longest should be evicted.
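For example, suppose the cache holds three entries and the access sequence is A, B, C, A, D. When D arrives the cache is full, and B is the entry that has gone unaccessed the longest, so B is the one evicted.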

As you can tell from the description, it's an eviction algorithm.

This algorithm is also a popular interview question.

Some interviewers even ask you to hand-write an LRU implementation on the spot.

Honestly, I think there's no need to panic in that case. Just write one out according to your own ideas.

I bet the interviewer couldn't whip up a bug-free LRU on the spot either. He's just checking a few key points, looking at your code style, and observing how you approach the problem.

But for the most part, the interview scenario looks like this:

Interviewer: Do you know the LRU algorithm?

Me: Yes, it stands for Least Recently Used. The idea is (as described above, not repeated here)……

Interviewer: Can you tell me how you would implement the LRU algorithm?

What is this question really asking?

We all know the idea behind the algorithm; now give a concrete solution that can actually be built from that idea.

No hand-waving allowed.


Scheme 1: Array

If you have never been exposed to the LRU algorithm before, you only know the idea.

The first time someone asks you for an implementation, the array scheme is probably the easiest one to think of.

Suppose we have a fixed-length array, and each element in it carries a marker. The marker can be a timestamp or an incrementing number.

Let's say we use incrementing numbers.

Each time an element is added, the markers of the data already in the array are incremented. When the array is full, the element with the largest number is evicted.

Each time an element is accessed, its marker is reset to 0.

Isn’t this an implementation of the LRU algorithm?

Following this idea, hammering out a rough first version shouldn't be a problem, right? Something like the sketch below.
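As a concrete illustration, here is a minimal Java sketch of the array scheme, assuming integer keys and a fixed capacity; the class and method names are made up for this example:

```java
// A minimal sketch of the array scheme. Every operation ages all entries,
// which is exactly where the O(n) cost comes from.
class ArrayLru {
    private final int[] keys;
    private final int[] age;   // larger age = unused for longer; 0 = just used
    private int size = 0;

    ArrayLru(int capacity) {
        keys = new int[capacity];
        age = new int[capacity];
    }

    void access(int key) {
        for (int i = 0; i < size; i++) {
            age[i]++;                          // age every resident entry
        }
        for (int i = 0; i < size; i++) {
            if (keys[i] == key) {
                age[i] = 0;                    // hit: mark as just used
                return;
            }
        }
        if (size < keys.length) {              // miss with room left: append
            keys[size] = key;
            age[size] = 0;
            size++;
            return;
        }
        int oldest = 0;                        // miss when full: evict max age
        for (int i = 1; i < size; i++) {
            if (age[i] > age[oldest]) {
                oldest = i;
            }
        }
        keys[oldest] = key;
        age[oldest] = 0;
    }
}
```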

But the downside of this approach is obvious: you have to constantly maintain the markers of every element in the array.

So what do you think the time complexity is?

Yes. Every operation comes with a pass over the array to update the markers, so the time complexity is O(n).

But this answer is definitely not going to satisfy the interviewer, because it isn't the standard answer he has in mind.

He may even be thinking: how did you come up with a solution like this?

But he won't say that out loud. He'll just ask, softly: is there an alternative?


Scheme 2: Linked list

So you rack your brain and think it over. Least recently used... it feels like some kind of ordered structure is needed.

Every time an element is inserted, append it to the end of the list.

Every time an element is accessed, move it to the end of the list.

That way, the most recently used elements must be at the back, and the head is the least recently used.

When the list reaches its specified length, remove the head element.

What is this structure?

Isn’t it just a linked list?

So we maintain an ordered singly linked list, where nodes closer to the head were accessed longer ago.

When a piece of data is accessed, we traverse the list sequentially, starting from the head.

If the data is already cached in the list, the traversal finds its node; we delete it from its original position and insert it at the end of the list.

What if this data is not in the cache linked list?

There are two cases:

  • If the cache is not full, a new node can be inserted directly into the end of the linked list to store the data.
  • If the cache is full at this point, the head node of the list is removed and a new node is inserted at the end of the list.

See? Isn't this also an implementation of the LRU algorithm?

Following this idea, the code you write will probably be nine-tenths of the way there, right? Something like the sketch below.
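A rough Java sketch of the list scheme might look like this (again, the names are made up; the head is the least recently used element, the tail the most recent):

```java
// Sketch of the singly linked list scheme: head = least recently used, tail = most recent.
class ListLru {
    private static class Node {
        int key;
        Node next;
        Node(int k) { key = k; }
    }

    private Node head, tail;
    private int size = 0;
    private final int capacity;

    ListLru(int capacity) { this.capacity = capacity; }

    void access(int key) {
        // O(n) scan: if the key is present, unlink it so it can be re-appended.
        Node prev = null;
        for (Node cur = head; cur != null; prev = cur, cur = cur.next) {
            if (cur.key == key) {
                if (prev == null) head = cur.next; else prev.next = cur.next;
                if (cur == tail) tail = prev;
                size--;
                break;
            }
        }
        if (size == capacity) {          // still full means a miss: evict the head
            head = head.next;
            if (head == null) tail = null;
            size--;
        }
        Node fresh = new Node(key);      // append as the most recently used
        if (tail == null) head = fresh; else tail.next = fresh;
        tail = fresh;
        size++;
    }
}
```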

How is this scheme better than the array scheme?

Honestly, not by much. It just looks a little more advanced than the array.


In terms of time complexity it is still O(n), because both inserts and lookups have to traverse the list to check whether the data exists.

Anyway, that’s not the answer the interviewer is looking for.

After you answer this question, the interviewer may say, “Can you give me a solution with O(1) time complexity for both queries and inserts?”

At this point, it's all about talent.

To be fair, if I had never been exposed to the LRU algorithm before, I can confidently say: I wouldn't be able to come up with it on the spot.


Scheme 3: Doubly linked list + hash table

If we want the time complexity of both query and insert to be O(1), then we need a data structure that meets the following three conditions:

  • 1. The data structure must be ordered, so that recently used data can be distinguished from data that has not been used for a long time; when capacity is full, the element unused the longest is deleted.
  • 2. It must support quickly finding whether a key exists and returning its value.
  • 3. Every time a key is accessed, that element must become the most recently used one. In other words, the structure must support fast insertion and deletion of elements at any position.

So, what kind of data structure satisfies all three of these conditions?

Fast lookup? That suggests a hash table. But a hash table is unordered.

Ordered? That suggests a linked list: inserts and deletes are fast, but lookups are slow.

So we combine the two into a new data structure: a hash-linked list, which is exactly what Java's LinkedHashMap is.


The structure looks something like this:


With the help of this structure, let’s analyze the above three conditions:

  • 1. If we always add elements at the end of the list, then obviously the elements closer to the end are the most recently used, and the closer an element is to the head, the longer it has gone unused.
  • 2. For any key, the hash table lets us quickly locate the corresponding node in the linked list and fetch its value.
  • 3. A linked list obviously supports fast insertion and deletion at any position: just rewire the pointers. What it cannot do is jump to a given position by index; that takes a full traversal. So with the hash table's help, we map a key straight to its linked list node and then insert or delete in O(1).

This is the answer the interviewer is looking for about LRU. A sketch of it in code follows.
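Here is a compact Java sketch of this scheme. The names are mine, not a standard API; the tail side holds the most recently used node, matching the convention above. Two details are worth noticing now, because the interviewer will poke at them next: the list is doubly linked, and every node stores its own key.

```java
import java.util.HashMap;

// Sketch of Scheme 3: hash table for O(1) lookup, doubly linked list for O(1) reorder.
// Sentinel head/tail nodes keep the pointer surgery free of null checks.
class LruCache {
    private static class Node {
        int key, value;
        Node prev, next;
        Node(int k, int v) { key = k; value = v; }
    }

    private final HashMap<Integer, Node> map = new HashMap<>();
    private final Node head = new Node(-1, -1); // sentinel on the least recently used side
    private final Node tail = new Node(-1, -1); // sentinel on the most recently used side
    private final int capacity;

    LruCache(int capacity) {
        this.capacity = capacity;
        head.next = tail;
        tail.prev = head;
    }

    int get(int key) {
        Node n = map.get(key);
        if (n == null) return -1;      // miss
        moveToTail(n);                 // accessed: now the most recently used
        return n.value;
    }

    void put(int key, int value) {
        Node n = map.get(key);
        if (n != null) {               // existing key: update value, refresh position
            n.value = value;
            moveToTail(n);
            return;
        }
        if (map.size() == capacity) {  // full: evict the least recently used node
            Node lru = head.next;
            unlink(lru);
            map.remove(lru.key);       // the node carries its key for exactly this step
        }
        Node fresh = new Node(key, value);
        map.put(key, fresh);
        insertBeforeTail(fresh);
    }

    private void unlink(Node n) {      // O(1) because n knows its own predecessor
        n.prev.next = n.next;
        n.next.prev = n.prev;
    }

    private void insertBeforeTail(Node n) {
        n.prev = tail.prev;
        n.next = tail;
        tail.prev.next = n;
        tail.prev = n;
    }

    private void moveToTail(Node n) {
        unlink(n);
        insertBeforeTail(n);
    }
}
```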

But you think that’s the end of the answer?

The interviewer will keep asking follow-up questions to gauge how deep your understanding really goes.

So here comes the question: why do we use a doubly linked list here? Why not a singly linked list?

You panic inside: damn, I didn't memorize this part. I'll have to reason it out on the spot.


So, don’t just memorize the answer, understand it.

Think about it: doesn't this involve deleting elements?

To delete an element from a linked list, what do you need besides the pointer to the node itself?

You also need a pointer to its predecessor node, right?

And we require O(1) time here, so how do we get that predecessor pointer directly?

Doesn't that force us onto a doubly linked list?

And just like that, a wave of soul-searching questions leads you straight to the answer.

Next: the hash table already holds the keys, so why do the linked list nodes store both key and value instead of just the value?

Don't panic. Walk through it step by step.

We said earlier that deleting a node in O(1) requires a doubly linked list.

After the node is deleted from the linked list, is that the end of it?

Did we forget something?

Didn't we forget to operate on the hash table?

The entry in the hash table has to be deleted too, right?

And what do you need to delete an entry from a hash table?

You need the key to delete the value, right?

So where does that key come from?

It can only come from the node in the linked list, right?

If the linked list nodes held only values and no keys, we would have no way to delete the corresponding hash table entries. Wouldn't that be a complete mess?

Another wave of soul-searching questions.

So, do you know the answer now? It's exactly why the sketch above stores the key in each node and removes that key from the hash table on eviction.

On the other hand, some people may directly answer LinkedHashMap.

I guess you could say that if you really don’t know.

However, this is probably the last thing an interviewer wants to hear.

He’ll think you’re cutting corners.

However, in real development, when we genuinely need an LRU cache, LinkedHashMap is indeed what we reach for, as shown below.

Is that so hard to say?
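For reference, a minimal version built on LinkedHashMap could look like this. Passing accessOrder = true makes the iteration order follow access recency, and overriding removeEldestEntry takes care of eviction:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// LRU cache built on LinkedHashMap: the eldest entry in access order
// is evicted once the map grows past its capacity.
class LinkedHashMapLru<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LinkedHashMapLru(int capacity) {
        super(capacity, 0.75f, true);    // true = order by access, not insertion
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;        // evict once we exceed capacity
    }
}
```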


Okay, you think this is the end of the interview?

Application of LRU in MySQL

Interviewer: Young man, you answered the LRU questions well. Now tell me about how LRU is applied in MySQL.

In MySQL, LRU is applied in the Buffer Pool.

Its purpose is to reduce disk IO.

I won’t go into the details of what buffer pools do.

You know it’s a contiguous chunk of memory, 128MB by default, and you can change it.
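For example, you can check the current size, or resize it on the fly (online resizing exists since MySQL 5.7; the second value below, 2GB in bytes, is only an illustration):

SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
SET GLOBAL innodb_buffer_pool_size = 2147483648;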

This contiguous chunk of memory is divided into pages with a default size of 16KB.

And since it's a pool, it's bound to fill up at some point.

You have to remove some pages, right?

Which raises the question: which pages should be evicted?

As I said, the goal is to reduce disk IO. So you should evict the pages that haven't been accessed for the longest time.

Unused for a long time... isn't that exactly LRU's home turf?

But MySQL does not use the plain LRU algorithm as-is.

Because MySQL has a read-ahead (prefetch) mechanism. The intent is good, but it may pre-read pages that will never actually be used.

Those pre-read pages still get placed at the head of the LRU list, crowding out elements at the tail that may still be needed.

Oh no. There goes the hit rate. Done for.


Another scenario is a full table scan: a single such SQL statement may replace every page in the buffer pool in one go, hurting the hit ratio of everything else the pool was serving.

So how do you handle this scenario?

The LRU linked list is split into two sections: one holds hot data and the other holds cold data (in InnoDB these are commonly called the young and old sublists).

Let me stop here. Not another word.

If I keep going, it turns into a whole other article. That's enough for now.

If you don’t know, I suggest you learn.

Application of LRU in Redis

Since we're talking about memory eviction algorithms, Redis, which we use every day, must have its own implementation too.

Redis has several memory eviction policies:

  • noeviction: the default policy. New writes are rejected (DEL and a few other special requests are still allowed) while reads continue as normal. This guarantees no data is lost, but the online write traffic can no longer proceed.
  • volatile-lru: evicts the least recently used keys among those with an expiration time set. Keys without an expiration are never evicted.
  • volatile-random: randomly evicts keys among those with an expiration time set.
  • volatile-ttl: among keys with an expiration time set, evicts those with the shortest remaining TTL first.
  • allkeys-lru: like volatile-lru, except the candidate set is the entire key space.
  • allkeys-random: randomly evicts keys from the entire key space.

Redis 4.0 added two more eviction policies:

  • volatile-lfu: applies the LFU algorithm to keys with an expiration time set
  • allkeys-lfu: applies the LFU algorithm to the entire key space

As for the LRU algorithm in Redis, the official documentation has this to say:

https://github.com/redis/redis-doc/blob/master/topics/lru-cache.md


The LRU algorithm in Redis is not a strict LRU algorithm.

Redis runs an approximate LRU algorithm: it samples a small number of keys, then evicts the most suitable one among the sampled keys, that is, the one that has gone unaccessed the longest.

However, since Redis 3.0 the algorithm has been improved to track closer to true LRU. The trick is maintaining a pool of eviction candidate keys.
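To make the sampling idea concrete, here is a toy Java illustration. It is not Redis's actual implementation (that lives in C and samples its internal dict far more cheaply); it only shows the principle of evicting the oldest key among a few random samples:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Toy approximate LRU: instead of tracking a perfect usage order,
// sample a few random keys and evict whichever was accessed longest ago.
class ApproxLruPicker {
    // Assumes lastAccess is non-empty and maps each key to its last-access timestamp.
    static <K> K pickVictim(Map<K, Long> lastAccess, int samples, Random rnd) {
        List<K> keys = new ArrayList<>(lastAccess.keySet()); // toy only: O(n) copy
        K victim = null;
        for (int i = 0; i < samples; i++) {
            K candidate = keys.get(rnd.nextInt(keys.size()));
            if (victim == null || lastAccess.get(candidate) < lastAccess.get(victim)) {
                victim = candidate;   // keep the candidate idle the longest
            }
        }
        return victim;
    }
}
```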

An important point about Redis LRU algorithm is that you can adjust the accuracy of the algorithm by modifying the following parameter configuration.

maxmemory-samples 5

And I've highlighted the most important sentence:

The reason Redis doesn't use a true LRU implementation is that it would cost more memory.

The docs then give a chart comparing the approximate LRU algorithm with strict LRU:


The official docs explain the chart like this:


You can see three bands formed by three kinds of dots in the picture:

  • The light gray band is objects that were evicted by the LRU algorithm
  • The gray band is objects that were not evicted
  • The green band is objects that were newly added

In Redis 3.0, the LRU algorithm was improved and the eviction candidate pool was added.

So you can see that Redis 3.0 performs better than Redis 2.8 using the same 5 sample points.

You can also see that in Redis 3.0, with a sample size of 10, the approximation comes very close to the theoretical performance of true LRU.

Another interview question came to mind.

The database holds 30 million records, but Redis can only hold 1 million. How do you ensure that the 1 million in Redis are all hot data?

What do you think the point of this question is?

It's all about the eviction policy, folks, just asked in a roundabout way.

We set the eviction policy to allkeys-lru or volatile-lru, estimate the approximate memory footprint of 1 million entries, and cap Redis's memory usage accordingly, along the lines of the sketch below.
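In redis.conf, that comes down to something like this (the 2gb figure is only a placeholder for whatever your 1 million entries actually measure out to):

maxmemory 2gb
maxmemory-policy allkeys-lru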

Done. Problem solved.

If you spot something wrong, you can point it out in the backend messages of my official account, and I'll fix it.

Thank you for reading. I insist on writing original content, and I'm very grateful for your attention.

I am Why, a literary creator delayed by code. I'm not a big shot; I just like sharing. A warm and interesting nice guy from Sichuan.

And welcome to follow me.