LRU stands for “Least Recently Used”: it tracks when each entry was last accessed and preferentially evicts the data whose last access is furthest from the present. As an algorithm that reorders a linked list according to access time to implement cache eviction, LRU is one of the eviction algorithms adopted by Redis.
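For reference, here is a minimal LRU sketch in Java (an illustration only, not how Redis implements it): LinkedHashMap in access-order mode moves an entry to the tail of its internal linked list on every access, so overriding removeEldestEntry yields a tiny LRU cache.

import java.util.LinkedHashMap;
import java.util.Map;

// Toy LRU cache: accessOrder=true makes the map reorder entries on each access,
// and removeEldestEntry evicts the least recently used entry once capacity is exceeded.
public class SimpleLRU<K, V> extends LinkedHashMap<K, V> {

    private final int capacity;

    public SimpleLRU(int capacity) {
        super(capacity, 0.75f, true); // true = order by access, not insertion
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity; // the head entry is the least recently used
    }
}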

Redis also has a caching policy called LFU. What is LFU? Let’s analyze it in this article.

What is LFU

LFU stands for “Least Frequently Used”: the data used least frequently over a period of time is evicted.

Least Frequently Used (LFU) is a caching algorithm for managing computer memory. When the cache is full and more space is needed, the system evicts the memory block with the lowest usage count. The simplest way to implement the LFU algorithm is to assign a counter to each block loaded into the cache.

Each time the block is referenced, the counter increases by one. When the cache reaches capacity and a new block of memory is waiting to be inserted, the system searches for the block with the lowest counter and removes it from the cache.
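As a toy illustration of that counter idea (counts only, ignoring the time tiebreak that the implementation below will add; all names here are made up for the example):

import java.util.HashMap;
import java.util.Map;

// Counter-only LFU sketch: every block carries a counter; on overflow,
// the block with the lowest counter is removed (ties broken arbitrarily).
public class CounterLFU<K, V> {

    private final int capacity;
    private final Map<K, V> data = new HashMap<>();
    private final Map<K, Integer> counters = new HashMap<>();

    public CounterLFU(int capacity) {
        this.capacity = capacity;
    }

    public V get(K key) {
        if (!data.containsKey(key)) {
            return null;
        }
        counters.merge(key, 1, Integer::sum); // referenced: counter + 1
        return data.get(key);
    }

    public void put(K key, V value) {
        if (!data.containsKey(key) && data.size() >= capacity) {
            K least = null;
            for (K k : counters.keySet()) { // search for the lowest counter
                if (least == null || counters.get(k) < counters.get(least)) {
                    least = k;
                }
            }
            data.remove(least);
            counters.remove(least);
        }
        data.put(key, value);
        counters.merge(key, 1, Integer::sum);
    }
}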


Insert elements at the head of the list and count each access; reorder the list by count, falling back to insertion time when counts are equal, and select the data to evict from the tail of the list.

LFU implementation

Defining a Node

Node mainly contains a key and a value, because the core idea of LFU is to compare access counts. When counts are equal, access times must be compared: the earlier a node was added, the sooner it is evicted. Therefore we add time and count attributes to Node, recording the node’s access time and its number of accesses respectively.

Some other versions use an extra inner class to record each key’s count and time, but I find that inconvenient: it means maintaining a separate map, which costs a bit too much.

Node implements the Comparable interface and overrides the compareTo method for sorting: nodes are compared by access count, and nodes with equal counts are compared by access time. This is what lets the sort pick out the key to evict.

/**
 * node
 */
public static class Node implements Comparable<Node> {

    // key
    Object key;

    // value
    Object value;

    /** access time */
    long time;

    /** number of visits */
    int count;

    public Node(Object key, Object value, long time, int count) {
        this.key = key;
        this.value = value;
        this.time = time;
        this.count = count;
    }

    public Object getKey() {
        return key;
    }

    public void setKey(Object key) {
        this.key = key;
    }

    public Object getValue() {
        return value;
    }

    public void setValue(Object value) {
        this.value = value;
    }

    public long getTime() {
        return time;
    }

    public void setTime(long time) {
        this.time = time;
    }

    public int getCount() {
        return count;
    }

    public void setCount(int count) {
        this.count = count;
    }

    @Override
    public int compareTo(Node o) {
        int compare = Integer.compare(this.count, o.count);
        // When counts are equal, compare access times
        if (compare == 0) {
            return Long.compare(this.time, o.time);
        }
        return compare;
    }
}


Define the LFU class

Define the LFU class: it uses generics, declaring K and V, a total capacity, and a Map (caches) that maintains all the nodes.

The size is passed in through the constructor, and a LinkedHashMap is created. The main reason for using a LinkedHashMap is that it maintains the order of the keys.

import java.util.*;

/** LFU cache */
public class LFU<K, V> {

    /** total capacity */
    private int capacity;

    /** all the nodes */
    private Map<K, Node> caches;

    /** constructor */
    public LFU(int size) {
        this.capacity = size;
        caches = new LinkedHashMap<K, Node>(size);
    }
}
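A quick check of that LinkedHashMap property — unlike HashMap, its iteration order is deterministic (insertion order by default):

Map<String, Integer> ordered = new LinkedHashMap<>();
ordered.put("a", 1);
ordered.put("c", 3);
ordered.put("b", 2);
System.out.println(ordered.keySet()); // [a, c, b] - keys come back in insertion order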

Add elements

The logic of adding an element is to look up the node in the cache by key. If nothing is found, the element is new; in that case, if the cache has already reached the preset capacity, find the node with the smallest count (and, among equal counts, the oldest time) and remove it.

Within the capacity limit, a new node is then created. Note that the System.currentTimeMillis() method is not used here, because millisecond granularity cannot distinguish between insertion times; only System.nanoTime() can tell the keys’ insertion times apart. The count defaults to 1.
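A quick way to see the granularity difference (output varies by platform, but on a typical JVM the two millisecond readings collide while the nanosecond readings differ):

long ms1 = System.currentTimeMillis();
long ms2 = System.currentTimeMillis();
long ns1 = System.nanoTime();
long ns2 = System.nanoTime();
System.out.println(ms1 == ms2); // usually true: both calls fall in the same millisecond
System.out.println(ns1 == ns2); // usually false: nanoTime resolves back-to-back calls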

If the node already exists, the new value overwrites the old one, the count is incremented by 1, and the key’s time is set to the current nanosecond timestamp. Finally, we sort. As you can see, the logic of inserting an element is mainly to add the element to the cache and update its time and count.

/**
 * add element
 * @param key
 * @param value
 */
public void put(K key, V value) {
    Node node = caches.get(key);
    // If it is a new element
    if (node == null) {
        // If the capacity is exceeded
        if (caches.size() >= capacity) {
            // Remove the key with the smallest count
            K leastKey = removeLeastCount();
            caches.remove(leastKey);
        }
        // Create a new node
        node = new Node(key, value, System.nanoTime(), 1);
        caches.put(key, node);
    } else {
        // The existing element: overwrite the old value
        node.value = value;
        node.setCount(node.getCount() + 1);
        node.setTime(System.nanoTime());
    }
    sort();
}

Every put or get requires re-sorting. The point of the sort is to rearrange the keys according to each key’s count and time. The logic here is to copy the cache map’s entries into a list, sort that list by comparing the Node values, and then rebuild the whole map in the new order.

The keys and values are then put back into the original cache in the reorganized order. The difference from LRU is that this version leans on Java’s collections API, whereas LRU’s reordering is done by moving linked-list nodes.

LFU, however, is more complicated to implement, because on every put you must compare not only the count but also the time. If you prefer not to use the Java collections API, you can instead maintain a frequency linked list and move a key between frequency nodes each time it is accessed, which indirectly achieves the same ordering (a sketch of this idea appears after the sort code below).

/**
 * sort
 */
private void sort() {
    List<Map.Entry<K, Node>> list = new ArrayList<>(caches.entrySet());
    Collections.sort(list, new Comparator<Map.Entry<K, Node>>() {
        @Override
        public int compare(Map.Entry<K, Node> o1, Map.Entry<K, Node> o2) {
            // Descending: most used (and most recent) first, eviction candidates at the tail
            return o2.getValue().compareTo(o1.getValue());
        }
    });
    caches.clear();
    for (Map.Entry<K, Node> kNodeEntry : list) {
        caches.put(kNodeEntry.getKey(), kNodeEntry.getValue());
    }
}
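Re-sorting the whole map on every access costs O(n log n). For comparison, here is a minimal sketch of the frequency-list alternative mentioned above (a common O(1) LFU design, not part of this article’s implementation): keys are grouped into buckets by frequency, and within a bucket a LinkedHashSet preserves insertion order, so the time tiebreak falls out for free.

import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;

// LFU sketch with frequency buckets: each frequency maps to a LinkedHashSet of keys.
// The oldest key of the lowest non-empty frequency is the eviction candidate,
// so no global sort is ever needed.
public class BucketLFU<K, V> {

    private final int capacity;
    private final Map<K, V> values = new HashMap<>();
    private final Map<K, Integer> freqOf = new HashMap<>();
    private final Map<Integer, LinkedHashSet<K>> buckets = new HashMap<>();
    private int minFreq = 0;

    public BucketLFU(int capacity) {
        this.capacity = capacity;
    }

    public V get(K key) {
        if (!values.containsKey(key)) {
            return null;
        }
        touch(key);
        return values.get(key);
    }

    public void put(K key, V value) {
        if (capacity <= 0) {
            return;
        }
        if (values.containsKey(key)) {
            values.put(key, value);
            touch(key);
            return;
        }
        if (values.size() >= capacity) {
            // Evict the oldest key in the lowest-frequency bucket
            K evict = buckets.get(minFreq).iterator().next();
            buckets.get(minFreq).remove(evict);
            values.remove(evict);
            freqOf.remove(evict);
        }
        values.put(key, value);
        freqOf.put(key, 1);
        buckets.computeIfAbsent(1, f -> new LinkedHashSet<>()).add(key);
        minFreq = 1;
    }

    // Move the key from its current frequency bucket to the next one
    private void touch(K key) {
        int freq = freqOf.get(key);
        buckets.get(freq).remove(key);
        if (buckets.get(freq).isEmpty() && freq == minFreq) {
            minFreq++;
        }
        freqOf.put(key, freq + 1);
        buckets.computeIfAbsent(freq + 1, f -> new LinkedHashSet<>()).add(key);
    }
}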

Here the Collections.min method is called: using the node’s compareTo method, it finds the element with the smallest count (and the oldest time among equal counts), whose key is then removed from the cache by the caller.

/**
 * Find the key with the smallest count (and the oldest time when counts are equal)
 */
private K removeLeastCount() {
    Collection<Node> values = caches.values();
    Node min = Collections.min(values);
    return (K) min.getKey();
}

Get elements

The element is first fetched from the cache map; if it is absent, null is returned. Once the element is found, the node must be updated: the count is incremented by 1 and the node’s time is refreshed.

According to the LFU principle, a node that has just been accessed temporarily becomes hot, but its count may still be lower than another node’s, so it cannot simply be moved to the head of the list; a sort is needed to put it in its proper position.

/**
 * get the element
 * @param key
 * @return
 */
public V get(K key) {
    Node node = caches.get(key);
    if (node != null) {
        node.setCount(node.getCount() + 1);
        node.setTime(System.nanoTime());
        sort();
        return (V) node.value;
    }
    return null;
}

Test

First declare an LFU with a maximum size of 5, then put in the six elements A, B, C, D, E and F. On the sixth put the cache is full, so the element with the smallest count and the oldest time is evicted: that is A (every count is 1, and A was inserted earliest).

Then get element B twice, so B’s count becomes 3 and its time is updated, moving B to the top. Next get C: its count becomes 2 and its time is refreshed, but it stays behind B because its count is still lower. Then get F, so F’s count also becomes 2.

Because F’s time is now the most recent, F moves ahead of C, which has the same count. Finally, put C again: C’s count becomes 3, equal to B’s, and C’s time is now the newest, so C moves ahead of B. The final order should therefore be: C -> B -> F -> E -> D.

public static void main(String[] args) {
    LFU<Integer, String> lfuCache = new LFU<>(5);
    lfuCache.put(1, "A");
    lfuCache.put(2, "B");
    lfuCache.put(3, "C");
    lfuCache.put(4, "D");
    lfuCache.put(5, "E");
    lfuCache.put(6, "F");
    lfuCache.get(2);
    lfuCache.get(2);
    lfuCache.get(3);
    lfuCache.get(6);
    // Put node 3 again
    lfuCache.put(3, "C");
    final Map<Integer, Node> caches = lfuCache.caches;
    for (Map.Entry<Integer, Node> nodeEntry : caches.entrySet()) {
        System.out.println(nodeEntry.getValue().value);
    }
}

Running results:

C
B
F
E
D

Differences between LRU and LFU and disadvantages of LFU

LRU and LFU have different emphases: LRU is driven by when elements were last used, while LFU is driven by how often elements are used.

The drawback of LFU is that data accessed very frequently over a short period is immediately promoted to hot data and is all but guaranteed not to be evicted, so it lingers in memory.

In reality, such data may be accessed at high frequency only briefly and then not touched again for a long time. The transient burst inflates its reference count, while some newly added entries are evicted quickly because their reference counts are still very low.

Conclusion

This article gives a brief introduction to LFU, details how to implement LFU in Java, and makes a simple comparison with LRU.

As a caching strategy, LFU has its own distinct usage scenarios. How to implement LFU, and how to exploit its characteristics in day-to-day development, is worth thinking about.

In practice, LRU is used more often than LFU, but knowledge of this algorithm is a must for programmers.

From: cnblogs.com/wyq178/p/11790015.html