I’m sure many developers know about Redis, or at least have heard of it.

Redis is best known for providing a distributed caching mechanism for clustered applications. However, this is only one of its highlights.

Redis is a powerful, versatile in-memory database: powerful because it's very fast; versatile because it can handle caching, database-like features, session management, real-time analytics, event streaming, and more.

However, when using Redis as a regular database, you have to pay attention to what you keep in memory, because that is where all of its data lives.

In this article, we'll explore some of the most interesting nuances of Redis caching patterns, using Node.js as the environment to run some benchmarks. Let's get started!

Types of caching

You may have heard the story before: with the rapid growth of systems built on relational databases, there is often a need to relieve some of the query pressure in order to achieve better performance.

Caching implementations are seen everywhere in real-world systems: from the database layer itself, to the application service layer, and even as a remote, distributed, standalone service (like Redis).

Before we explore further, let’s take a look at these types.

Type 1: Database-integrated cache

Depending on the system design you follow, a cache integrated at the database level can help your system improve processing performance.

For example, if you use CQRS and route reads to a NoSQL database while sending writes to a relational database, the read side effectively acts as a database-level cache.
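Just to make the idea concrete, here is a minimal sketch of that routing; the stores are in-memory stand-ins, whereas in a real setup the read side would be a NoSQL replica and the write side a relational database, with replication keeping them in sync:

// Sketch of CQRS-style routing (stores are illustrative stand-ins)
const readStore = new Map(); // read-optimized store (e.g. a document DB)
const writeLog = [];         // write model (e.g. a relational DB)

function updateUser(id, data) {
  writeLog.push({ id, data, at: Date.now() }); // write side
  readStore.set(id, data);                     // "replication", simplified
}

function getUser(id) {
  return readStore.get(id); // read side, serving as the cache
}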

However, this approach is error-prone and takes a lot of effort to operate, let alone maintain.

Other databases, such as Aurora, provide built-in mechanisms to enable caching at the database level. In other words, the application layer and the client don't need to know about the cache, because the database architecture itself takes care of everything, propagating updates through internal replication.

Obviously, there are limitations in memory and in data synchronization across cluster instances. However, given the right use case, this is a powerful integration to consider.

Type 2: Local application cache

Programmatic caches are among the most common types because they are simply in-memory structures that store data.

Each programming language has its own built-in or community-driven libraries that provide local caches quickly and easily.

The big thing about it is that it's super fast. The data is in local memory, so you can access it much faster than you could through, say, a TCP request.

On the other hand, if you work in a distributed microservices environment, each node in the cluster keeps its own versioned dataset, which is not shared with the other nodes, and that data is lost entirely if the node suddenly shuts down.
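As a minimal sketch, a local cache can be as simple as a Map plus a TTL check (all names here are illustrative):

// Tiny in-process cache with a time-to-live
const localCache = new Map();

function cacheSet(key, value, ttlMs) {
  localCache.set(key, { value, expiresAt: Date.now() + ttlMs });
}

function cacheGet(key) {
  const entry = localCache.get(key);
  if (!entry) return undefined;
  if (Date.now() > entry.expiresAt) {
    localCache.delete(key); // expired: drop the stale entry
    return undefined;
  }
  return entry.value;
}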

Type 3: Remote cache (Redis)

In general, this type of cache is also called a side cache, meaning it lives as a separate service, outside both your application and your database.

Because they are accessed remotely, these caches have to perform well even under extreme conditions. That is why they can typically handle large data loads within milliseconds.

The choice of this type of cache should be discussed and weighed carefully. For example, do you have detailed information about your (or your provider's) network latency? Can you scale your Redis cluster horizontally?

Because there is now communication between the application and the outside world, you must have a plan for when fetching data fails or becomes too slow. Developers often solve this with a mix of local and remote caching strategies, which gives them a second barrier of protection for edge cases.
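One possible shape for that mixed strategy is sketched below; fetchFromDatabase is a hypothetical placeholder, and the Redis calls assume the node_redis v3 callback API, promisified:

// Sketch: local cache first, then Redis, then the source of truth
const { promisify } = require("util");

async function getWithFallback(redisClient, localCache, key, fetchFromDatabase) {
  const local = localCache.get(key);
  if (local) return local; // fastest path: in-process memory

  try {
    const getAsync = promisify(redisClient.get).bind(redisClient);
    const cached = await getAsync(key);
    if (cached) {
      const value = JSON.parse(cached);
      localCache.set(key, value); // warm the local layer
      return value;
    }
  } catch (err) {
    console.error("Remote cache failed:", err.message); // fall through
  }

  return fetchFromDatabase(key); // last barrier: the database itself
}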

Caching patterns

Again, the way you implement caching can vary depending on your system requirements.

Let’s take a moment to examine the most common caching patterns.

The cache-aside pattern

This is the most common caching pattern. As the name implies, the cache lives aside, in a part of the system architecture separate from the application itself.

It works best when the application takes care of the choreography between the cache and the database, as in the flow below:

Cache-aside execution flow:

  1. First, check the cache to see if it already holds the data you need. On a hit, the application returns the information to the client without calling the database.
  2. On a miss, the application goes to the database for the latest information.
  3. Finally, with the latest version of the data in hand, the application writes it to the cache to bring it up to date.

This strategy has many benefits, such as the flexibility to handle completely different data types in the cache and in the database. The database schema needs careful design, because changing it later can be very painful; the cache, however, gives you the freedom to use more flexible data structures.

Note that the strategy illustrated above can cause trouble when a write to the database succeeds but the cache update fails. For such cases, it is important to have a fallback, such as a TTL (time-to-live) setting, where the developer establishes a timeout after which specific data in the cache is invalidated. That way, when a cache update fails, stale data does not survive in the cache for too long.
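Translated into Redis terms, a minimal cache-aside read with that TTL safeguard could look like the sketch below; the 600-second TTL and getUserFromDb are placeholders, and the callback style matches node_redis v3:

// Cache-aside with a TTL safeguard
function getUserCacheAside(redisClient, email, getUserFromDb, callback) {
  redisClient.get(email, (err, cached) => {
    if (err) return callback(err);
    if (cached) return callback(null, JSON.parse(cached)); // cache hit

    // Cache miss: read from the database, then populate the cache.
    getUserFromDb(email).then((user) => {
      // SETEX expires the key after 600s, so a missed invalidation
      // can never leave stale data around forever.
      redisClient.setex(email, 600, JSON.stringify(user));
      callback(null, user);
    }, callback);
  });
}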

The write-through pattern

This pattern takes the opposite approach to cache-aside. Here, whenever a change happens, the application first writes to the cache and then goes to the database.

This is where the name comes from: the write passes through the cache layer before the final database write.

Write-through execution flow:

Care must be taken when modeling this strategy, especially for cases where the database write fails.

In that case, you can set up a retry policy that tries to save the data to the database at all costs, or throw a transaction error and roll back the previous cache write.

Just note the corresponding increase in overall processing time for each flow, since every write now touches two systems.
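A rough sketch of that flow, including the rollback alternative; saveToDatabase is a hypothetical placeholder, and the Redis calls again assume the node_redis v3 API:

// Write-through: cache first, then database, with rollback on failure
const { promisify } = require("util");

async function writeThrough(redisClient, key, value, saveToDatabase) {
  const setAsync = promisify(redisClient.set).bind(redisClient);
  const delAsync = promisify(redisClient.del).bind(redisClient);

  await setAsync(key, JSON.stringify(value)); // step 1: write to the cache
  try {
    await saveToDatabase(key, value);         // step 2: write to the database
  } catch (err) {
    await delAsync(key); // roll back the cache write...
    throw err;           // ...or plug in a retry policy here instead
  }
}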

Redis for Node.js

We'll run a benchmark against a Node.js application that exposes two API endpoints: one that serves cached data and one that doesn't.

Our goal is to show how quickly you can configure a project to use the cache-aside pattern, while benchmarking both endpoints to see how much Redis can improve REST API performance.

Preparing the environment

You can configure the Redis environment according to the official Quick Start:

wget http://download.redis.io/redis-stable.tar.gz
tar xvzf redis-stable.tar.gz
cd redis-stable
make
make install

Then, run the redis-server command to start the Redis server.

Next, create a project folder and cd into it.
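For example (the folder name is just a placeholder):

mkdir redis-node-caching
cd redis-node-caching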

Initialize and install dependencies:

npm init
npm install express redis axios

Finally, create the index.js file in the root directory and add the following code:

const axios = require("axios");
const express = require("express");
const redis = require("redis");

const app = express();
const redisClient = redis.createClient(6379); // Redis server started at port 6379
const MOCK_API = "https://jsonplaceholder.typicode.com/users/";

// Endpoint without caching: always calls the mock API
app.get("/users", (req, res) => {
  const email = req.query.email;

  try {
    axios.get(`${MOCK_API}?email=${email}`).then(function (response) {
      const users = response.data;

      console.log("User successfully retrieved from the API");

      res.status(200).send(users);
    });
  } catch (err) {
    res.status(500).send({ error: err.message });
  }
});

// Endpoint with caching: checks Redis first (cache-aside)
app.get("/cached-users", (req, res) => {
  const email = req.query.email;

  try {
    redisClient.get(email, (err, data) => {
      if (err) {
        console.error(err);
        throw err;
      }

      if (data) {
        console.log("User successfully retrieved from Redis");

        res.status(200).send(JSON.parse(data));
      } else {
        axios.get(`${MOCK_API}?email=${email}`).then(function (response) {
          const users = response.data;
          // Cache the result for 600 seconds (10 minutes)
          redisClient.setex(email, 600, JSON.stringify(users));

          console.log("User successfully retrieved from the API");

          res.status(200).send(users);
        });
      }
    });
  } catch (err) {
    res.status(500).send({ error: err.message });
  }
});

const PORT = process.env.PORT || 3000;
app.listen(PORT, () => {
  console.log(`Server started at port: ${PORT}`);
});

This code uses an external mock API, JSONPlaceholder, which is handy for faking API data. In this example, we search for some dummy user data.

Note that port 6379 is the Redis default, which the server announces in its startup log.

There are two endpoints. The first simply gets the data from the users API, with no caching in between. This lets you see how much overhead repeated HTTP calls add to the overall performance of your application.

The second API always checks Redis for the given data first. If the key exists, we skip the mock API call and serve the data directly from the cache; otherwise, we fetch the information and store the result in Redis.

Simple, right? Let's move on to the benchmark code. First, let's add a Node library to help us out. We will use the api-benchmark tool, because it can generate visual reports for the benchmarks.

Install api-benchmark by running:

npm install api-benchmark

Then, create another file benchmark.js in the root directory and add the following code:

var apiBenchmark = require("api-benchmark");
const fs = require("fs");

var services = { server1: "http://localhost:3000/" };
var options = { minSamples: 100 };

// Any email present in the JSONPlaceholder users dataset works here
var routeWithoutCache = { route1: "users?email=Sincere@april.biz" };
var routeWithCache = { route1: "cached-users?email=Sincere@april.biz" };

apiBenchmark.measure(services, routeWithoutCache, options, function (err, results) {
  apiBenchmark.getHtml(results, function (error, html) {
    fs.writeFile("no-cache-results.html", html, function (err) {
      if (err) return console.log(err);
    });
  });
});

apiBenchmark.measure(services, routeWithCache, options, function (err, results) {
  apiBenchmark.getHtml(results, function (error, html) {
    fs.writeFile("cache-results.html", html, function (err) {
      if (err) return console.log(err);
    });
  });
});

We will run two separate benchmarks: one against the non-cached endpoint, the other against the Redis-cached endpoint.

This is a load of 100 requests per API route, which is nowhere near a production-grade stress test but is sufficient to demonstrate Redis's capabilities. You can rerun the test later with more requests to see how the gap widens.

Run the node index.js command, and then run the node benchmark.js command from another terminal.
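That is, something like:

node index.js
# in a second terminal:
node benchmark.js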

If all goes well, you should see two new HTML files in the root directory of your project. Open them in a web browser and you will see something like the following:

  • Redis-cached API benchmark results:

  • Non-cached API benchmark results:

Take a look at the statistics generated in the right-hand panel: the non-cached API finished the test in about 11 seconds, while the Redis-cached API finished in about 0.9 seconds.

If you’re developing an application that receives a lot of requests per second, this is a great choice 😆.

Conclusion

There are other caching patterns out there, but for the sake of simplicity, we focused on the most popular and powerful ones.

Modern applications focus on performance. This requirement only becomes more stringent over time, as the complexity of applications and the number of requests they receive grow exponentially.

Redis is just one of many caching options. In fact, it is powerful, flexible and beloved by many companies and the technology community.

As an exercise, I recommend implementing the second caching pattern we covered, write-through, and comparing its performance against cache-aside.

You can see the sample code for this article on GitHub.

Reference

  • Powerful Caching with Redis for Node.js Applications