In the article summarizing the Node isomorphic rendering scheme for the Tencent News "Grab Gold Talent" activity, we got an overall picture of how isomorphic rendering is used in our project. As I said at the end of that post:

The difficulty of applying a technology lies not in overcoming the technical problems themselves, but in continually combining the technology with your own product experience, finding the experience problems that exist, and constantly adopting better technical solutions to optimize the user experience, thereby contributing to the development of the whole product.

We chose the React isomorphic rendering scheme for the sake of the product experience, so naturally we must also ensure the availability and reliability of the scheme. For example: how many simultaneous users can our service support? Can we still guarantee normal access when the number of users grows? Are CPU and memory usage under control, rather than being consumed and never released? To answer these questions, we need to understand some data about our project:

  1. The number of daily visits and peak visits of the project, that is, the level of concurrency;
  2. The maximum QPS our single-machine service can support;
  3. The service response time, and the failure rate of pages and interfaces under high concurrency;
  4. CPU and memory usage, and whether the CPU runs short or memory leaks occur.

This is what we need to know before going live, which highlights the importance of stress testing. Sufficient testing before launch lets us understand how the program and the servers actually perform, and roughly how many machines to apply for, among other things.

1. Initial stress test

Here we use autocannon to stress-test the project. Note that at this point we have not taken any optimization measures; the idea is to expose the problems first, and then optimize against them.

60 concurrent connections for 100 seconds:

autocannon -c 60 -d 100

The results of the stress test:

As the chart shows, at 60 concurrent connections the server handles about 266 requests per second on average, but 23 requests still timed out. The response time is acceptable, with 99% of requests completing within 1817 ms. As far as this data goes, the processing capacity is not ideal, and we still have plenty of room for improvement.

2. Solutions

We need to take some measures to address the unsatisfactory data above.

2.1 Memory Management

When writing pure front-end code, we pay little attention to memory usage. After all, the browser's garbage collection is fairly mature, and the life cycle of a front-end page is relatively short. If memory ever did deserve special attention, it was in early IE browsers, where interaction between JS and the DOM could produce memory leaks. And even then, a leak would only affect the user on that particular machine, not other users.

The server side is different: all users access the same running code. If the program leaks even a little memory, then under thousands of visits the leaked memory accumulates, garbage cannot be reclaimed, and eventually a serious leak crashes the program. To prevent memory leaks, we focus on three aspects of memory management:

  1. The V8 engine's garbage collection mechanism;
  2. The causes of memory leaks;
  3. How to detect memory leaks.

Node extends JavaScript's primary battlefield to the server side, which demands a different level of care than the browser: every resource requires more careful planning. In general, memory in Node cannot be used however you like, but neither is it hopelessly scarce.

2.1.1 Garbage collection mechanism of V8 engine

In V8, memory is mainly divided into two generations: the new generation and the old generation. Objects in the new generation are short-lived, while objects in the old generation are long-lived or resident in memory.

By default, the maximum size of the new generation is 32 MB on 64-bit systems and 16 MB on 32-bit systems, and V8's maximum heap size is 1464 MB on 64-bit systems and 732 MB on 32-bit systems.

Why two generations? For more optimal GC algorithms. The new generation uses the Scavenge algorithm, which is fast but not suited to large data volumes. The old generation uses the Mark-Sweep & Mark-Compact algorithms, which handle large amounts of data but run more slowly. Applying the algorithm best suited to each generation optimizes overall GC speed.
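
These defaults can be raised with V8 flags such as --max-old-space-size and --max-semi-space-size (values in MB) when starting Node. To watch how the heap actually behaves under load, one simple approach is to periodically log process.memoryUsage(); a minimal sketch (the 30-second interval is an arbitrary choice):

// Log heap statistics every 30 seconds to observe growth over time
setInterval(() => {
    const { rss, heapTotal, heapUsed, external } = process.memoryUsage();
    const mb = (n) => (n / 1024 / 1024).toFixed(1) + 'MB';
    console.log(`rss=${mb(rss)} heapTotal=${mb(heapTotal)} heapUsed=${mb(heapUsed)} external=${mb(external)}`);
}, 30000);

If heapUsed keeps climbing while traffic stays flat, that is the first hint of a leak, which leads to the causes discussed next.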

2.1.2 Causes of Memory Leaks

There are many causes of memory leaks, such as in-memory caches, queues, repeated event listeners, and so on.

Take caching as an example: it is common to cache data in a variable and keep adding data that never expires, as in the following simple example:

let cached = new Map();

server.get('*', (req, res) => {
    if (cached.has(req.url)) {
        return res.send(cached.get(req.url));
    }
    const html = app.render(req, res);
    cached.set(req.url, html); // entries are never evicted, so the Map grows forever
    res.send(html);
});

Closures are another example of the same problem. The drawback of this kind of memory usage is that there is no expiration policy available; data just keeps piling up and eventually leaks. It is better to use a third-party cache such as Redis or memcached, which come with good expiration and eviction strategies.
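
For instance, the Map cache above could be swapped for Redis with a short TTL. A minimal sketch using the ioredis client (the page: key prefix and the 60-second expiry are assumptions for illustration, and app.render is the same hypothetical render call as above):

const Redis = require('ioredis');
const redis = new Redis(); // assumes a local Redis instance on the default port

server.get('*', async (req, res) => {
    const key = 'page:' + req.url;
    const hit = await redis.get(key);
    if (hit) {
        return res.send(hit);
    }
    const html = app.render(req, res);
    // 'EX', 60 sets a 60-second expiry, so Redis evicts stale entries itself
    await redis.set(key, html, 'EX', 60);
    res.send(html);
});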

Queues can also pile up, for example in log-writing: when massive amounts of data need to be written, the queue backs up. Here we set a timeout policy and a rejection policy on the queue, so that operations are released as quickly as possible, as in the sketch below.
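
A minimal sketch of such a bounded queue (the size limit and timeout values are arbitrary choices; a production logging library would be more involved):

class BoundedQueue {
    constructor({ maxSize = 1000, timeout = 3000 } = {}) {
        this.maxSize = maxSize;
        this.timeout = timeout;
        this.items = [];
    }
    push(item) {
        // Rejection policy: drop new items once the queue is full instead of piling up
        if (this.items.length >= this.maxSize) {
            return false;
        }
        // Timeout policy: an item that sits unprocessed too long is discarded
        const entry = { item, expired: false };
        entry.timer = setTimeout(() => { entry.expired = true; }, this.timeout);
        this.items.push(entry);
        return true;
    }
    shift() {
        while (this.items.length > 0) {
            const entry = this.items.shift();
            clearTimeout(entry.timer);
            if (!entry.expired) return entry.item;
        }
        return undefined;
    }
}

With a cap and a per-item timeout, a flood of log writes degrades into dropped entries instead of unbounded memory growth.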

Another cause is repeated event listening, for example listening to the same event again and again and forgetting to remove the listener with removeListener. This easily happens when adding events to a reusable object, and repeated listeners will trigger a warning like:

Warning: Possible EventEmitter memory leak detected. 11 listeners added. Use emitter.setMaxListeners() to increase limit
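
A sketch of the pattern that triggers this warning and one way to fix it (connection and onData are hypothetical names; server.get follows the earlier examples):

const EventEmitter = require('events');
const connection = new EventEmitter(); // a long-lived, reusable object

function onData(chunk) {
    // handle chunk...
}

server.get('*', (req, res) => {
    // Leaky: connection.on('data', onData) here would add a new listener
    // on every request and never remove it.

    // Fixed: once() removes the listener automatically after it fires;
    // alternatively, call connection.removeListener('data', onData) when done.
    connection.once('data', onData);
    res.send('ok');
});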

2.1.3 Detecting Memory Leaks

As can be seen from the memory monitoring chart, while the number of users stays basically unchanged, memory keeps rising slowly, which indicates that we have a memory leak: used memory is not being released.

We can use node-heapdump or the --inspect flag to investigate:

node --inspect server.js

Then open chrome://inspect in Chrome to view memory usage.
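
With node-heapdump, a snapshot can also be written on demand and loaded into the Memory panel of Chrome DevTools; a rough sketch (the SIGUSR2 trigger and the filename pattern are choices for illustration):

const heapdump = require('heapdump');

// Write a heap snapshot whenever the process receives SIGUSR2 (kill -USR2 <pid>),
// then load the .heapsnapshot files into Chrome DevTools' Memory panel
process.on('SIGUSR2', () => {
    heapdump.writeSnapshot(`./${Date.now()}.heapsnapshot`, (err, filename) => {
        if (err) console.error(err);
        else console.log('heap snapshot written to', filename);
    });
});

Taking two snapshots a few minutes apart and comparing them in DevTools usually points straight at the objects that keep growing.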

We found that handleRequestTimeout() handles were being created constantly, and each handle held countless callbacks, so the resources could not be freed.

Tracking it down, the axios code in the version we were using was:

if (config.timeout) {
    timer = setTimeout(function handleRequestTimeout() {
        req.abort();
        reject(createError(
            'timeout of ' + config.timeout + 'ms exceeded',
            config,
            'ECONNABORTED',
            req
        ));
    }, config.timeout);
}

This code looks problem-free at first glance; it is a typical front-end timeout solution.

But in Node.js, I/O connections can block timer processing, so the setTimeout does not fire on time; it may only come back after, say, 10 seconds.
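
The effect is easy to reproduce: anything that keeps the event loop busy delays every pending timer. A self-contained sketch:

const start = Date.now();

setTimeout(() => {
    // Expected after ~100 ms, but fires only once the loop is free again (~3000 ms)
    console.log(`timer fired after ${Date.now() - start} ms`);
}, 100);

// Simulate heavy synchronous work (or a flood of I/O callbacks) blocking the loop
while (Date.now() - start < 3000) {}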

At this point the problem becomes clear: heavy traffic plus blocked connections caused requests to pile up; the server could not keep up, and the CPU could not come down.

Looking further at the current axios source code:

if (config.timeout) {
    // Sometime, the response will be very slow, and does not respond, the connect event will be block by event loop system.
    // And timer callback will be fired, and abort() will be invoked before connection, then get "socket hang up" and code ECONNRESET.
    // At this time, if we have a large number of request, nodejs will hang up some socket on background. and the number will up and up.
    // And then these socket which be hang up will devoring CPU little by little.
    // ClientRequest.setTimeout will be fired on the specify milliseconds, and can make sure that abort() will be fired after connect.
    req.setTimeout(config.timeout, function handleRequestTimeout() {
        req.abort();
        reject(
            createError(
                'timeout of ' + config.timeout + 'ms exceeded',
                config,
                'ECONNABORTED',
                req
            )
        );
    });
}

It turned out that the version we were using was older and differed from the local code, meaning the library had since been updated. Checking the change history of this file confirmed that it was fixed on September 16th:

So the fix is to update axios to the latest version. After plenty of local testing, we found that under high load both CPU and memory stay within the normal range.

2.2 Caching

Caching is indeed a great way to improve performance. However, there are many kinds of caches, and we should choose a caching strategy appropriate to the project's actual situation. Here we adopt a three-tier caching strategy.

At the Nginx layer, proxy_cache can specify which paths to cache and for how long, and proxy_cache_lock can also be enabled.

With proxy_cache_lock enabled, when multiple clients request a file that is not yet in the cache (a MISS), only the first of those requests is forwarded to the server; the remaining requests wait and then read the file from the cache once the first request completes. Without proxy_cache_lock, every request that misses the cache goes straight to the server.

However, this directive should be enabled with great care: under heavy traffic it can cause a backlog of requests, because subsequent requests must wait for the first one to finish.

proxy_cache_path /data/cached keys_zone=answer:16m levels=1:2 inactive=60m;

server {
    location / {
        proxy_cache answer;
        proxy_cache_valid 1m;
    }
}

At the business layer, requests that penetrate the Nginx cache can hit a Redis cache, which may hold whole pages, page fragments, interfaces, and so on. Using a third-party cache brings the benefits we discussed in a previous article: it can be shared across multiple processes, and it spares the project itself from implementing cache eviction algorithms.

Only when the first two caches miss does a request reach our Node service layer. Here the cache mechanism can implement different cache policies and cache granularities, and each service should choose what suits its own scenario; a sketch of such a cache follows.
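
As a minimal sketch of what this third layer might look like, here is a tiny in-process TTL cache applied at component granularity rather than whole pages (the 30-second TTL and the renderComponent function are assumptions for illustration):

class TTLCache {
    constructor(ttl = 30000) {
        this.ttl = ttl;
        this.store = new Map();
    }
    get(key) {
        const entry = this.store.get(key);
        if (!entry) return undefined;
        if (Date.now() > entry.expireAt) {   // lazily evict expired entries
            this.store.delete(key);
            return undefined;
        }
        return entry.value;
    }
    set(key, value) {
        this.store.set(key, { value, expireAt: Date.now() + this.ttl });
    }
}

// Example: cache rendered component HTML, keyed by component name and props
const componentCache = new TTLCache(30000);

function renderCachedComponent(name, props) {
    const key = name + ':' + JSON.stringify(props);
    let html = componentCache.get(key);
    if (!html) {
        html = renderComponent(name, props); // hypothetical render function
        componentCache.set(key, html);
    }
    return html;
}

Unlike the leaky Map at the start of this section, every entry here carries an expiry time, so the cache size stays bounded by the traffic within one TTL window.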

3. Results

What is the performance of our project at this point?

autocannon -c 1000 -d 100

As the chart shows, 99% of requests completed within 182 ms, and the average number of requests processed per second was around 15,707. Compared with the 200-odd requests per second at the start, performance improved by more than 60 times.

THE END

People feel lonely and try all kinds of ways to dispel it, yet in the end they cannot escape it. Loneliness is nature's curse on those who live in groups, and facing loneliness is the only way to deal with it.

One Hundred Years of Solitude

I am a front-end development engineer at Tencent. Follow along, and let's study and discuss together.