What is caching? Caching means preparing important data ahead of time so that it can be served quickly from fast storage instead of being fetched from a slow source each time.
Without a cache, the throughput of a system is bounded by its slowest data store, so caching is one of the most important optimizations for keeping an application performing well.
Two common tasks in a web application are caching file and video blobs and fast access to page templates. Node.js's asynchronous model lets the runtime serve other clients while slow I/O is in flight, and the operating system does have its own file cache, but that cache is shared: when multiple web applications run on the same server and one of them streams large amounts of video data, the cached content belonging to the other applications is likely to be evicted frequently, and their performance drops sharply.
An application-level LRU cache solves this problem effectively, because the application is no longer affected by the caches of other applications on the same server. Since the slowest data store bounds throughput, an LRU cache can improve system performance by a factor of 2 to 100, and an asynchronous LRU additionally hides the latency of cache misses.
Let’s take a look at the implementation.
The code
- First build a file to construct the LRU object module:
'use strict';

let Lru = function(cacheSize, callbackBackingStoreLoad, elementLifeTimeMs = 1000) {
    let me = this;
    let maxWait = elementLifeTimeMs;
    let size = parseInt(cacheSize, 10);
    let mapping = {};
    let mappingInFlightMiss = {};
    let buf = [];
    for (let i = 0; i < size; i++) {
        let rnd = Math.random();
        mapping[rnd] = i;
        buf.push({ data: "", visited: false, key: rnd, time: 0, locked: false });
    }
    let ctr = 0;
    let ctrEvict = parseInt(cacheSize / 2, 10);
    let loadData = callbackBackingStoreLoad;
    this.get = function(key, callbackPrm) {
        let callback = callbackPrm;
        if (key in mappingInFlightMiss) {
            setTimeout(function() {
                me.get(key, function(newData) {
                    callback(newData);
                });
            }, 0);
            return;
        }
        if (key in mapping) {
            // RAM speed data
            if ((Date.now() - buf[mapping[key]].time) > maxWait) {
                if (buf[mapping[key]].locked) {
                    setTimeout(function() {
                        me.get(key, function(newData) {
                            callback(newData);
                        });
                    }, 0);
                } else {
                    delete mapping[key];
                    me.get(key, function(newData) {
                        callback(newData);
                    });
                }
            } else {
                buf[mapping[key]].visited = true;
                buf[mapping[key]].time = Date.now();
                callback(buf[mapping[key]].data);
            }
        } else {
            // datastore loading + cache eviction
            let ctrFound = -1;
            while (ctrFound === -1) {
                if (!buf[ctr].locked && buf[ctr].visited) {
                    buf[ctr].visited = false;
                }
                ctr++;
                if (ctr >= size) {
                    ctr = 0;
                }
                if (!buf[ctrEvict].locked && !buf[ctrEvict].visited) {
                    // evict
                    buf[ctrEvict].locked = true;
                    ctrFound = ctrEvict;
                }
                ctrEvict++;
                if (ctrEvict >= size) {
                    ctrEvict = 0;
                }
            }
            mappingInFlightMiss[key] = true;
            let f = function(res) {
                delete mapping[buf[ctrFound].key];
                buf[ctrFound] = { data: res, visited: false, key: key, time: Date.now(), locked: false };
                mapping[key] = ctrFound;
                callback(buf[ctrFound].data);
                delete mappingInFlightMiss[key];
            };
            loadData(key, f);
        }
    };
};

exports.Lru = Lru;
- Example file cache:
let Lru = require("./lrucache.js").Lru;
let fs = require("fs");
let path = require("path");
let fileCache = new Lru(500, async function(key, callback) {
    // cache-miss data-load algorithm
    fs.readFile(path.join(__dirname, key), function(err, data) {
        if (err) {
            callback({ stat: 404, data: JSON.stringify(err) });
        } else {
            callback({ stat: 200, data: data });
        }
    });
}, 1000 /* cache element lifetime */);
The Lru constructor takes three parameters (the cache size, a cache-miss callback that loads the missing data, and the cache element lifetime) and builds a CLOCK-based cache.
- The asynchronous cache-miss callback works as follows:
1. A get() call fails to find the key in the cache.
2. The user-supplied miss callback runs and starts loading the data from the backing store.
3. When that (possibly asynchronous) work completes, its callback hands the result back, and the result is written into the LRU cache.
4. From then on, get() calls for the same key read the data straight from RAM:
fileCache.get("./test.js", function(dat){
httpResponse.writeHead(dat.stat);
httpResponse.end(dat.data);
});
Because the result is delivered through another callback, the whole operation can run asynchronously.
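The miss flow described in the steps above can be sketched with a tiny stand-in cache (the makeTinyCache helper is hypothetical and elides the slot buffer, eviction, and lifetimes; the real module is the one built earlier):

```javascript
// Minimal sketch of the get()/miss/callback flow, not the full module:
// a miss runs the user-supplied loader, stores the result, then answers.
function makeTinyCache(loadData) {
    const mapping = {};                       // key -> cached value
    return {
        get(key, callback) {
            if (key in mapping) {             // hit: served straight from RAM
                return callback(mapping[key]);
            }
            loadData(key, function(value) {   // miss: run the backing-store loader
                mapping[key] = value;         // write the result into the cache
                callback(value);              // hand the data to the caller
            });
        }
    };
}

let loads = 0;
const cache = makeTinyCache(function(key, done) {
    loads++;
    done(key + " processed");
});

const seen = [];
cache.get("./test.js", (v) => seen.push(v));  // miss: loader runs once
cache.get("./test.js", (v) => seen.push(v));  // hit: no second load
```

The loader here completes synchronously for clarity; the real module's loader may complete asynchronously.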
How it works
- Like most LRU implementations, this one keeps a "mapping" object from keys to cache slots, which gives O(1) key lookup regardless of the number of slots. In JavaScript this is particularly simple:
Mapping object:
let mapping = {};
Find a (string/integer) key in the map:
if(key in mapping)
{
// key found, get data from RAM
}
Efficient and simple
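As a standalone illustration (the slot contents below are toy values, not from the module), a plain object already provides the whole constant-time key-to-slot lookup:

```javascript
// A plain JS object maps keys to slot indices; "key in mapping" is the
// entire O(1) lookup the cache relies on.
const mapping = {};
const buf = [
    { data: "template A", visited: false },
    { data: "template B", visited: false }
];
mapping["a.html"] = 0;
mapping["b.html"] = 1;

function lookup(key) {
    if (key in mapping) {
        return buf[mapping[key]].data;   // hit: RAM-speed access
    }
    return null;                         // miss: would trigger eviction + load
}
```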
- As long as the mapping corresponds to a cache slot, data can be fetched directly from it:
buf[mapping[key]].visited=true;
buf[mapping[key]].time = Date.now();
callback(buf[mapping[key]].data);
The visited field tells the CLOCK pointers (ctr and ctrEvict) to spare the slot from eviction. The time field manages the slot's lifetime: it is refreshed on every cache hit, which keeps the entry alive in the cache.
The data that fills a cache slot is supplied to get() through the user-provided load callback.
- Before data can be returned straight from a mapped slot, its lifetime must be checked. If the lifetime has expired, the mapping is deleted and get() retries with the same key, turning the access into a cache miss:
if((Date.now() - buf[mapping[key]].time) > maxWait)
{
delete mapping[key];
me.get(key,function(newData){
callback(newData);
});
}
Once the mapping is deleted, other asynchronous accesses can no longer touch the stale slot's internal state.
- If the key is not found in the mapping object, the LRU eviction logic runs to find a victim slot:
let ctrFound = -1;
while (ctrFound === -1) {
    if (!buf[ctr].locked && buf[ctr].visited) {
        buf[ctr].visited = false;
    }
    ctr++;
    if (ctr >= size) {
        ctr = 0;
    }
    if (!buf[ctrEvict].locked && !buf[ctrEvict].visited) {
        // evict
        buf[ctrEvict].locked = true;
        ctrFound = ctrEvict;
    }
    ctrEvict++;
    if (ctrEvict >= size) {
        ctrEvict = 0;
    }
}
The first "if" block checks the slot under the "second chance" pointer (ctr): if it is unlocked and has been visited, it is marked unvisited instead of being evicted.
The third "if" block checks the slot under the ctrEvict pointer: if it is unlocked and unvisited, the slot is marked "locked" (protecting it from concurrent asynchronous get() calls) and chosen as the eviction target, and the loop ends.
As the initialization shows, ctr and ctrEvict start out of phase by 50%:
let ctr = 0;
let ctrEvict = parseInt(cacheSize / 2, 10);
Both pointers are incremented equally inside the "while" loop, so each one sweeps the slots the other has just passed. The more cache slots there are, the better this works for the victim search: each key survives at least N/2 pointer movements before it can be evicted.
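The two-pointer scan can be exercised in isolation. The sketch below (the findVictim helper is hypothetical, but it uses the same field names and loop body as the module) shows ctrEvict claiming the first unlocked, unvisited slot while ctr strips visited flags half a revolution behind:

```javascript
// Standalone second-chance scan: ctr clears visited flags, ctrEvict
// (starting N/2 ahead) locks and returns the first evictable slot.
function findVictim(buf, state) {
    let found = -1;
    while (found === -1) {
        if (!buf[state.ctr].locked && buf[state.ctr].visited) {
            buf[state.ctr].visited = false;      // second chance: spare, unmark
        }
        state.ctr = (state.ctr + 1) % buf.length;
        if (!buf[state.ctrEvict].locked && !buf[state.ctrEvict].visited) {
            buf[state.ctrEvict].locked = true;   // claim the victim slot
            found = state.ctrEvict;
        }
        state.ctrEvict = (state.ctrEvict + 1) % buf.length;
    }
    return found;
}

const buf = [
    { visited: false, locked: false },
    { visited: true,  locked: false },  // recently used: gets a second chance
    { visited: false, locked: false },
    { visited: false, locked: false }
];
const state = { ctr: 0, ctrEvict: 2 };  // pointers half a revolution apart
const victim = findVictim(buf, state);  // slot 2 is unlocked and unvisited
```

A second call on the same state would clear slot 1's visited flag and claim slot 3, showing how the visited flag buys a slot exactly one extra sweep.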
- After the target slot is found, its old mapping is removed to prevent asynchronous collisions, and a new mapping is created once the datastore load completes:
mappingInFlightMiss[key]=true;
let f = function(res){
delete mapping[buf[ctrFound].key];
buf[ctrFound] = {data: res, visited:false, key:key, time:Date.now(), locked:false};
mapping[key] = ctrFound;
callback(buf[ctrFound].data);
delete mappingInFlightMiss[key];
};
loadData(key,f);
Since the user-provided loadData callback can load asynchronously, the cache can have up to N cache misses in flight and hide up to N miss latencies at once. Latency hiding is an important throughput factor, especially in web applications. However, once more than N asynchronous misses/accesses are in flight the cache deadlocks, so a cache with 100 slots can serve at most 100 concurrent asynchronous users; alternatively, concurrency can be limited to some value M lower than N and the work spread over multiple (K) passes (where M x K = total accesses).
Cache hits run at RAM speed, and because miss latency can be hidden, overall performance looks like O(1) for both hits and misses. With few slots the clock hands may need several iterations per access, but as the number of slots grows the cost approaches O(1).
Inside the loadData callback, setting the new slot's locked field to false makes the slot available again to other asynchronous accesses.
- If there is a hit but the found slot has both expired and been locked, the access is deferred with a 0 ms setTimeout to the end of the JavaScript message queue; the locking operation (a cache miss) is certain to finish before the deferred retry runs. In terms of time complexity this still counts as O(1), with a larger latency that stays hidden behind the latency of the locking operation:
if (buf[mapping[key]].locked) {
    setTimeout(function() {
        me.get(key, function(newData) {
            callback(newData);
        });
    }, 0);
}
- Finally, if a key has an in-flight cache miss (it is in mappingInFlightMiss but not yet in mapping), the access is likewise deferred to the end of the message queue with setTimeout:
if (key in mappingInFlightMiss) {
    setTimeout(function() {
        me.get(key, function(newData) {
            callback(newData);
        });
    }, 0);
    return;
}
This way, the same data is never loaded twice concurrently.
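The deduplication idea can be isolated in a sketch (the makeCoalescingLoader helper is hypothetical; the real module defers retries with setTimeout instead of queuing callbacks, but the effect is the same): concurrent misses for one key share a single backing-store load.

```javascript
// Sketch: an in-flight table queues callbacks for a key whose load is
// already running, so the backing store is hit only once per key.
function makeCoalescingLoader(loadData) {
    const inFlight = {};   // key -> callbacks waiting on the ongoing load
    return function get(key, callback) {
        if (key in inFlight) {          // a load is already running: just queue
            inFlight[key].push(callback);
            return;
        }
        inFlight[key] = [callback];
        loadData(key, function(value) {
            const waiters = inFlight[key];
            delete inFlight[key];       // later gets start a fresh load
            waiters.forEach((cb) => cb(value));
        });
    };
}

let loads = 0;
let finishLoad;            // captured so the "backing store" finishes on demand
const get = makeCoalescingLoader(function(key, done) {
    loads++;
    finishLoad = () => done(key + " loaded");
});

const results = [];
get("video.mp4", (v) => results.push(v));
get("video.mp4", (v) => results.push(v));  // coalesced: no second load
finishLoad();                              // backing store completes once
```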
Benchmarking
- Asynchronous cache-miss benchmark:
"use strict"; // number of asynchronous accessors(1000 here) need to be equal to or less than // cache size(1000 here) or it makes dead-lock let Lru = require("./lrucache.js").Lru; let cache = new Lru(1000, async function(key,callback){ // cache-miss data-load algorithm setTimeout(function(){ callback(key+" processed"); }, 1000); },1000 /* cache element lifetime */); let ctr = 0; let t1 = Date.now(); for(let i=0; i<1000; i++) { cache.get(i,function(data){ console.log("data:"+data+" key:"+i); if(i.toString()+" processed" ! == data) { console.log("error: wrong key-data mapping."); } if(++ctr === 1000) { console.log("benchmark: "+(Date.now()-t1)+" miliseconds"); }}); }Copy the code
To avoid a deadlock, the cache size is set to 1000, matching the 1000 iterations of the for loop.
Output:
benchmark: 1127 miliseconds
Since each cache miss carries a 1000 ms latency, loading 1000 elements synchronously would take close to 17 minutes, but because the misses overlap, the whole batch finishes far faster. This is particularly useful in I/O-heavy workloads, such as streaming data from HDDs or the network.
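As a quick sanity check on that claim, here is the arithmetic (the 1000 elements and 1000 ms per-miss latency are the benchmark's own values):

```javascript
// Sequential loading pays every miss latency in full; fully overlapped
// misses pay roughly one latency for the whole batch.
const elements = 1000;
const missLatencyMs = 1000;

const sequentialMs = elements * missLatencyMs;   // 1,000,000 ms in total
const sequentialMinutes = sequentialMs / 60000;  // about 16.7 minutes

const overlappedMs = missLatencyMs;              // all misses in flight at once
```

The measured 1127 ms is close to the fully overlapped ideal, with the overhead coming from scheduling and bookkeeping.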
- Cache hit ratio benchmark
10% hit ratio: keys are generated randomly, with 1000 slots and 10,000 possible distinct values:
"use strict"; // number of asynchronous accessors(1000 here) need to be equal to or less than // cache size(1000 here) or it makes dead-lock let Lru = require("./lrucache.js").Lru; let cacheMiss = 0; let cache = new Lru(1000, async function(key,callback){ cacheMiss++; // cache-miss data-load algorithm setTimeout(function(){ callback(key+" processed"); }, 100); },100000000 /* cache element lifetime */); let ctr = 0; let t1 = Date.now(); let asynchronity = 500; let benchRepeat = 100; let access = 0; function test() { ctr = 0; for(let i=0; i<asynchronity; I++) {let key = parseInt(math.random ()*10000,10); // 10% hit ratio cache.get(key.toString(),function(data){ access++; if(key.toString()+" processed" ! == data) { console.log("error: wrong key-data mapping."); } if(++ctr === asynchronity) { console.log("benchmark: "+(Date.now()-t1)+" miliseconds"); console.log("cache hit: "+(access - cacheMiss)); console.log("cache miss: "+(cacheMiss)); console.log("cache hit ratio: "+((access - cacheMiss)/access)); if(benchRepeat>0) { benchRepeat--; test(); }}}); } } test();Copy the code
Results:
benchmark: 10498 miliseconds
cache hit: 6151
cache miss: 44349
cache hit ratio: 0.1218019801980198
Since the benchmark repeats 100 times and each cache miss adds 100 ms of latency, the total time lands near 10 seconds (close to 100 x 100 ms), and the measured hit ratio is close to the expected 10%.
50% hit ratio test
let key = parseInt(Math.random() * 2000, 10); // 50% hit ratio
Result:
benchmark: 10418 miliseconds
cache hit: 27541
cache miss: 22959
cache hit ratio: 0.5453663366336634
99% hit ratio test
let key = parseInt(Math.random() * 1010, 10); // 99% hit ratio
Result:
benchmark: 10199 miliseconds
cache hit: 49156
cache miss: 1344
cache hit ratio: 0.9733861386138614
Because the keys are random, the measured hit ratio lands at 0.9733 rather than exactly 99%.
100% hit ratio test
let key = parseInt(Math.random() * 999, 10); // 100% hit ratio
Results:
benchmark: 1463 miliseconds
cache hit: 49501
cache miss: 999
cache hit ratio: 0.9802178217821782
After the first benchmark pass (which cannot avoid cache misses while the cache is cold), everything is served from RAM and the total latency drops sharply.
Conclusion:
This article walked through an LRU (CLOCK) cache implementation in Node.js in detail; hopefully it offers some new ideas for improving system performance in your own development.