5–10 questions are updated daily
To be written
- What is the difference between Koa2 and Express?
- What do you know about processes and threads?
- What middleware have you used with Koa2? How does it work?
- How do you handle errors with Node's native APIs?
- How do processes in Node communicate with each other?
- How do you ensure the stability of a Node service after startup?
- Is there any optimization for Node interface forwarding?
- How does a Node service gracefully degrade and restart?
Completed
- What are Node's use cases, advantages, and disadvantages?
- How do I install an npm module? What happens after you type npm install and press Enter?
- How do you optimize Node performance?
- Is Node better for I/O-intensive or CPU-intensive tasks, and why?
- How is Node performance monitored?
- How do you understand using the Node middle layer for request merging and forwarding?
What are Node's use cases, advantages, and disadvantages?
Node.js is a JavaScript runtime based on the Chrome V8 engine
Node.js uses an event-driven, non-blocking I/O model that makes it lightweight and efficient, makes concurrent programming easier, and suits I/O-intensive applications centered on network programming.
Top down: at the bottom are the various libraries Node.js depends on. Chrome V8 interprets and executes JS code. Libuv provides the event loop, thread pool management, asynchronous network I/O, file system I/O, and other capabilities, and is responsible for dispatching and executing I/O tasks. c-ares (DNS resolution), crypto, http, zlib (compression), and so on provide access to low-level system facilities such as networking, encryption, and compression.
In the middle is the bridge layer between JS and C/C++. Bindings expose the interfaces of the C/C++ libraries that Node.js depends on to the JS environment, and addons are used for custom C/C++ extensions.
The topmost application layer can call the various Node.js APIs.
Applicable scenarios
1. RESTful APIs
This is the ideal scenario for Node.js: it can handle tens of thousands of connections without much logic of its own. You just take the API request, look up some values in a database, and assemble them into a response. Because both the responses and the inbound requests are small pieces of text, the traffic is low, and one machine can handle the API needs of even the busiest company.
2. Unify the UI layer of Web applications
In the current MVC architecture, Web development in a sense has two UI layers: one in the browser, which we eventually see, and one on the server side, which generates and concatenates pages. Without debating whether this architecture is good or bad, there is another practice, service-oriented architecture, that better separates front-end and back-end dependencies. If all critical business logic is encapsulated as REST calls, you only need to worry about building the concrete application on top of those REST interfaces. Back-end programmers don't care how data gets passed from one page to another, nor whether user data updates are retrieved asynchronously through Ajax or through page refreshes.
3. Applications with a large number of Ajax requests
For personalized applications, where each user sees a different page, caching is ineffective and Ajax requests must be issued as the page loads; Node.js can respond to a large number of concurrent requests.
In summary, NodeJS is suitable for high concurrency, I/O intensive, and low business logic scenarios:
User form collection, examination systems, chat rooms, web forums, live image/text feeds
Node.js can implement almost anything, but it is best suited to high-concurrency, I/O-intensive scenarios with little business logic, so we should use it only where those strengths apply.
Advantages
- A JS runtime environment, so JS can also be used to develop back-end programs
- Event-driven: a single thread maintains the event loop queue, so there is no multi-thread resource contention or context switching, giving efficient scalability and full use of system resources
- Non-blocking I/O, able to handle high concurrency
- Single-threaded (the main thread is single-threaded)
- Scalable
- Applicable on the PC web, PC clients (NW.js/Electron), mobile (Cordova, HTML5, React Native, Weex), and hardware (Ruff.io)
- A wide range of packages on npm
- Works with the front end to do interface forwarding, merge requests, and reduce JSON size; the front end can independently control routing (for SSR isomorphism), giving it more autonomy to deploy and release on its own
Disadvantages
- CPU-intensive applications pose the following challenge: because the JS thread is single-threaded, a long-running computation holds the CPU time slice and prevents subsequent I/O from being initiated. The solution is to decompose large computing tasks into several small ones, so that the CPU is released in time and I/O calls are not blocked (see the sketch after this list)
- Not suitable for large-memory applications, limited by V8's memory management (roughly 1.4 GB on 64-bit systems, roughly 0.7 GB on 32-bit)
- Not suitable for applications dominated by heavy synchronous logic
- A single process runs on a single CPU core, so a multi-core CPU cannot be fully utilized without multiple processes
- Low reliability: with a single process and a single thread, once one part of the code crashes the whole system crashes. Solutions: an Nginx reverse proxy with load balancing, running multiple processes bound to multiple ports; or running multiple processes listening on the same port with the cluster module
- Open-source components vary in quality, update quickly, and are not always backward compatible
- Debugging is inconvenient; errors thrown across asynchronous boundaries may lack a full stack trace
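For the CPU-intensive point above, here is a minimal sketch of splitting a long computation into small chunks with setImmediate, so the event loop can serve I/O between chunks (the chunk size and workload are illustrative assumptions):

```js
// Sum a large array without monopolizing the event loop
function sumLargeArray(items, callback) {
  const CHUNK_SIZE = 10000; // assumed chunk size, tune for the workload
  let total = 0;
  let index = 0;

  function processChunk() {
    const end = Math.min(index + CHUNK_SIZE, items.length);
    for (; index < end; index++) {
      total += items[index];
    }
    if (index < items.length) {
      setImmediate(processChunk); // yield so pending I/O can run
    } else {
      callback(total);
    }
  }

  processChunk();
}

sumLargeArray(new Array(1e7).fill(1), (total) => console.log(total));
```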
How do I install an npm module? What happens after you type npm install and press Enter?
The concept of npm
npm is the package management tool of the JS world and the default package manager for the Node.js platform. npm lets you install, share, and distribute code, and manage project dependencies.
The npm package manager was created to make it convenient to install functional modules during development.
When you git clone a project, there is no node_modules folder in it; that folder stores the dependency modules used during development. It can be hundreds of megabytes in size, and if you put it on GitHub it would be very slow for others to clone, so a package.json dependency configuration file solves this problem.
When someone downloads the project and sets up the environment, they simply go to the project directory and run npm install. npm uses this file to find and fetch the required libraries, that is, the dependencies.
Some modules are public, open source, and can be found and downloaded directly; some are developed in-house and require specifying the path where the package is located.
After npm install, a new node_modules folder appears in the project, where all of the dependencies can be found.
npm install execution process
After you enter the npm install command and press Enter, the following stages occur:
- Execute the project's own preinstall hook
If the current npm project defines a preinstall hook, it is executed
- Identify the first-level dependency modules
The first step is identifying the project's first-level dependencies, i.e. the modules specified directly in the dependencies and devDependencies properties (assuming npm install is run without extra arguments)
The project itself is the root node of the whole dependency tree, and each first-level dependency module is a subtree below that root; npm works from each first-level dependency module, in multiple processes, progressively down to the deeper nodes
- Acquire the modules
Acquiring a module is a recursive process with the following steps:
Get the module information: before downloading a module, you need to determine its version, because package.json often specifies a semantic version range. If the version description file already contains the module's information, it is taken directly; if not, it is fetched from the registry. For example, if a package's version in package.json is ^1.1.0, npm asks the registry for the version that satisfies that range
Get the module entity: the previous step yields the module's package address (the resolved field). npm checks the local cache using this address, takes the module from the cache if present, or downloads it from the registry
Find the module's dependencies: go back to the first step if there are dependencies, and stop if there are none
- Flatten the modules
The previous step produced a complete dependency tree, which may contain many duplicate modules. For example, module A depends on lodash and module B also depends on lodash. Before npm 3, the tree was installed strictly as resolved, resulting in module redundancy.
Starting with npm 3, a dedupe process runs by default. It traverses all nodes and places modules one by one under the root node, i.e. the first level of node_modules; when a duplicate module is found, it is discarded.
A duplicate module is defined as one with the same module name and a semantically compatible version. Each semantic version corresponds to a range of allowed versions; if the ranges of two modules overlap, a compatible version exists and they do not need identical version numbers, which lets the dedupe process remove more redundant modules.
- Install the modules
This step updates node_modules in the project and executes each module's lifecycle functions (in preinstall, install, postinstall order)
- Execute the project's own lifecycle hooks
If the current npm project defines hooks (install, postinstall, prepublish, prepare), they are executed. The last step generates or updates the version description file, and the npm install process is complete
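For reference, a minimal sketch of how such lifecycle hooks are declared in the scripts field of package.json (the script targets are hypothetical):

```json
{
  "name": "my-project",
  "version": "1.0.0",
  "scripts": {
    "preinstall": "node scripts/check-env.js",
    "postinstall": "node scripts/patch-deps.js",
    "prepare": "npm run build"
  }
}
```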
npm module installation steps
- Run the npm install command
- npm queries the node_modules directory to see whether the specified module already exists
If it exists, the module is not reinstalled; if it does not exist, npm looks up the module's tarball address in the registry, downloads the tarball into the root directory, and then decompresses it into node_modules in the current directory
package-lock.json
When developing an application, it is recommended to commit the package-lock.json file to the code repository, so that all team developers and CI sessions install exactly the same dependencies when running npm install.
When developing an npm package, your package will be depended on by other projects. Because of the flat installation mechanism, if you lock your dependencies' versions, they cannot share semantic version ranges with the consuming project's other dependencies, which causes unnecessary redundancy. So you should not publish a package-lock.json file for a library (npm does not publish it by default).
How do you optimize Node performance?
1. Optimization of web application layer
There are several ways to improve performance for Web applications:
1. Separate static and dynamic content
A Node web application can serve static files through middleware, but Node's ability to serve static files is not outstanding. Route static files such as images, fonts, style sheets, and multimedia to a professional static file server and let Node handle only dynamic requests; this can be done with Nginx or a dedicated CDN. After static and dynamic requests are separated, the server can focus on dynamic services, while a professional CDN keeps static files as close as possible to users and provides a more precise and efficient caching mechanism. Serving static files from one or more separate domains also eliminates unnecessary cookie transfers and works around the browser's limit on concurrent downloads per domain.
2. Enable caching
There are really only two ways to improve performance: speed the service up, or avoid unnecessary computation. The former eventually hits a bottleneck under massive traffic, while the latter pays off more as visits grow. The most common way to avoid unnecessary computation is caching. Synchronous I/O wastes a lot of CPU time while waiting; caching reduces that waste, and whether I/O is synchronous or asynchronous, consistently avoiding unnecessary computation brings a significant performance boost.
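A minimal sketch of the caching idea in an Express handler (the route, TTL, and the "expensive query" placeholder are illustrative assumptions):

```js
const express = require('express');
const app = express();

const cache = new Map();
const TTL = 60 * 1000; // assumed 60-second freshness window

app.get('/api/products', (req, res) => {
  const hit = cache.get(req.originalUrl);
  if (hit && Date.now() - hit.time < TTL) {
    return res.json(hit.data); // served from cache, no recomputation
  }
  const data = { products: [], generatedAt: Date.now() }; // stand-in for an expensive query
  cache.set(req.originalUrl, { data, time: Date.now() });
  res.json(data);
});

app.listen(3000);
```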
3. Multi-process architecture
A multi-process architecture not only makes full use of a multi-core CPU, it also establishes a mechanism that makes the Node processes more robust and keeps the web application continuously available.
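A minimal sketch with the built-in cluster module: one worker per CPU core, all sharing the same port, with crashed workers replaced (the port and restart policy are illustrative):

```js
const cluster = require('cluster');
const http = require('http');
const os = require('os');

if (cluster.isMaster) {
  // Fork one worker per CPU core
  for (let i = 0; i < os.cpus().length; i++) {
    cluster.fork();
  }
  // Replace crashed workers to keep the service available
  cluster.on('exit', (worker) => {
    console.log(`worker ${worker.process.pid} died, forking a new one`);
    cluster.fork();
  });
} else {
  http.createServer((req, res) => {
    res.end(`handled by pid ${process.pid}`);
  }).listen(3000);
}
```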
4. Separate reads and writes
Besides separating static and dynamic content, another important separation is between reads and writes, mainly at the database: reading data is much faster than writing it. To guarantee consistency, some databases lock tables during writes, which slows reads. To improve performance, some systems split the database into read and write replicas, so that reads are unaffected by writes and the performance impact is reduced.
2. Optimization of code layer
- Use the latest version of Node.js
Simply using the latest version of Node.js can improve performance, which comes from two aspects:
- V8 version updates
- Optimizations to Node.js's internal code
- Use fast-json-stringify to speed up JSON serialization
A generic JSON serializer must identify the type of every field at runtime: strings must be wrapped in quotes ("), and arrays must be iterated, each element serialized, joined with commas, and wrapped in [].
If the type of every field is already known in advance from a schema, there is no need to traverse values and detect their types; each field can be serialized directly, which greatly reduces the computational overhead. This is how fast-json-stringify works.
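A minimal usage sketch of fast-json-stringify (the schema fields are illustrative):

```js
const fastJson = require('fast-json-stringify');

// The schema declares each field's type up front, so serialization
// can skip runtime type detection
const stringify = fastJson({
  title: 'User',
  type: 'object',
  properties: {
    name: { type: 'string' },
    age: { type: 'integer' },
    tags: { type: 'array', items: { type: 'string' } }
  }
});

console.log(stringify({ name: 'alice', age: 30, tags: ['admin'] }));
// -> {"name":"alice","age":30,"tags":["admin"]}
```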
Is Node better for I/O-intensive or CPU-intensive tasks, and why?
Node is better suited to handling I/O-intensive tasks.
Node's I/O tasks can be invoked asynchronously, exploiting the event loop with very little resource usage; calls are made from a single thread, while internally libuv handles I/O with a thread pool, so explicit multi-threaded code is avoided.
Because JavaScript execution is single-threaded, Node is not well suited to CPU-intensive tasks: a long computation holds the CPU time slice and prevents subsequent I/O from being initiated, causing blocking. Node can use multiple processes (or worker threads) to handle some CPU-intensive work, but since JavaScript itself does not offer shared-memory multi-threading, Node is weaker here than multi-threaded languages such as Java or Go.
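As one concrete option, a minimal sketch of offloading a CPU-heavy computation to a worker thread with the built-in worker_threads module (the Fibonacci workload is illustrative):

```js
const { Worker, isMainThread, parentPort, workerData } = require('worker_threads');

if (isMainThread) {
  // Offload the heavy computation so the event loop stays responsive
  const worker = new Worker(__filename, { workerData: 40 });
  worker.on('message', (result) => console.log('fib =', result));
} else {
  const fib = (n) => (n < 2 ? n : fib(n - 1) + fib(n - 2));
  parentPort.postMessage(fib(workerData));
}
```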
How do you monitor and optimize Node performance?
Monitoring categories
- monitoring of the business logic
- monitoring of the hardware
Performance monitoring
Node performance monitoring is divided into the following aspects:
Log monitoring
By monitoring changes in the exception logs, the types and quantities of new exceptions can be seen; monitoring access logs enables PV and UV tracking, from which users' usage habits can be learned and access peaks predicted
Response time
Response time is another point to monitor. If any subsystem becomes abnormal or hits a performance bottleneck, the system's response time grows. Response time can be monitored at a reverse proxy such as Nginx, or by having the application itself generate access logs
Process monitoring
Log and response-time monitoring both observe the system's state on the assumption that it is running, so process monitoring is even more critical than the previous two. It generally checks the number of application processes running in the operating system: for example, a web application with a multi-process architecture should check the number of worker processes and raise an alarm if it falls below the expected minimum
Disk monitoring
Disk monitoring tracks disk usage. Frequent logging gradually fills the disk, and a full disk causes problems for the system, so set an upper limit on disk usage. If usage exceeds the alarm threshold, the server administrator should archive or delete logs, or otherwise clean up the disk
Memory monitoring
For Node, a memory leak is not easy to detect once it occurs, so monitor the server's memory usage. If memory only goes up and never comes down, there is likely a memory leak; normal memory usage rises and falls, rising with heavy traffic and falling as it subsides. Monitoring the number of memory anomalies is also a good way to prevent system problems: if one suddenly appears, you can trace which recent code change caused it
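A minimal sketch of sampling memory with process.memoryUsage(); a heapUsed value that only climbs and never falls back suggests a leak (the one-minute interval is an assumption):

```js
// Sample memory once a minute and log the key figures
setInterval(() => {
  const { rss, heapTotal, heapUsed } = process.memoryUsage();
  const mb = (n) => (n / 1024 / 1024).toFixed(1) + 'MB';
  console.log(`rss=${mb(rss)} heapTotal=${mb(heapTotal)} heapUsed=${mb(heapUsed)}`);
}, 60 * 1000);
```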
CPU usage monitoring
Monitoring server CPU usage is also essential. CPU usage divides into user time, kernel (system) time, and I/O wait (iowait). High user-time usage means the application itself needs a large amount of CPU; high kernel-time usage means the server spends much of its time on process scheduling or system calls; iowait indicates the CPU is waiting for disk I/O. User time below 70%, kernel time below 35%, and overall usage below 70% are within the normal range. Monitoring CPU usage helps you analyze how the application behaves under real business load, and setting reasonable thresholds gives good early warning
CPU load monitoring
CPU load, also called the CPU load average, describes how busy the operating system currently is. It is simply the average number of tasks using or waiting to use the CPU per unit of time, with three metrics: the 1-minute, 5-minute, and 15-minute load averages. A CPU load that is too high indicates too many processes, which in Node may show up as repeatedly starting new processes with the child_process module. Monitoring this value can catch such accidents
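A minimal sketch of reading the three load averages with the os module (the saturation rule of thumb in the comment is an assumption):

```js
const os = require('os');

// The three numbers are the 1-, 5- and 15-minute load averages
// (always zeros on Windows, where the concept does not apply)
const [one, five, fifteen] = os.loadavg();
console.log(`load: 1m=${one.toFixed(2)} 5m=${five.toFixed(2)} 15m=${fifteen.toFixed(2)}`);

// Rough rule of thumb (assumption): sustained load above the core count
// means the machine is saturated
if (one > os.cpus().length) {
  console.warn('CPU load is higher than the number of cores');
}
```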
I/O load
I/O load mainly refers to disk I/O and reflects read/write activity on the disk. Node applications are usually network-oriented services and are unlikely to generate high I/O load themselves; most I/O pressure comes from the database. But whether or not the Node process shares a server with a database or another I/O-intensive application, this value should be monitored to catch unexpected situations
Network monitoring
Although network traffic monitoring has lower priority than the items above, traffic should still be monitored with an upper limit set. Even if your app suddenly becomes popular, a traffic spike lets you gauge how effective the site's promotion was. Once traffic exceeds the warning level, developers should find out why it is growing; for normal growth, evaluate whether to add hardware to serve more users. The two main indicators of network traffic monitoring are inbound traffic and outbound traffic
Application status Monitoring
In addition to these mandatory metrics, the application should provide a mechanism to report its own state, and an external monitor can continuously poll the application's feedback interface to check its health, as in the sketch below.
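A minimal sketch of such a status feedback interface; the /status path and the reported fields are illustrative assumptions:

```js
const http = require('http');

http.createServer((req, res) => {
  if (req.url === '/status') {
    res.writeHead(200, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify({
      status: 'ok',
      pid: process.pid,
      uptime: process.uptime(),          // seconds since the process started
      memory: process.memoryUsage().rss  // resident set size in bytes
    }));
    return;
  }
  res.end('hello');
}).listen(3000);
```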
DNS monitoring
DNS is the foundation of networked applications, and most external products depend on domain names. DNS failures causing widespread product impact are not uncommon, and because the DNS service is usually stable, it is easy to overlook; when it does fail, the impact can be enormous. To ensure product stability, domain name and DNS status should also be monitored.
How to monitor
Performance monitoring usually requires tooling.
Easy-Monitor 2.0 is used here: a lightweight Node.js kernel-level performance monitoring and analysis tool. In the default mode you only need to require it once in the project entry file, without changing any business code, to enable kernel-level performance monitoring and analysis.
The usage method is as follows:
Import in your project entry file as follows, of course passing in your project name:
```js
const easyMonitor = require('easy-monitor');
easyMonitor('Name of your project');
```
Open your browser and visit http://localhost:12333 to see the process interface
How to optimize
There are several ways to optimize Node performance:
- Use the latest version of Node.js
- Use streams correctly
- Code level optimization
- Memory management optimization
Use the latest version of Node.js
The performance improvements in each release come from two main aspects:
- V8 version update
- Node.js internal code update optimization
Use streams correctly
In Node, many objects are streams, and a large file can be sent as a stream without first reading it entirely into memory:
```js
const http = require('http');
const fs = require('fs');

// bad: reads the whole file into memory before responding
http.createServer(function (req, res) {
  fs.readFile(__dirname + '/data.txt', function (err, data) {
    res.end(data);
  });
});

// good: pipes the file to the response as a stream
http.createServer(function (req, res) {
  const stream = fs.createReadStream(__dirname + '/data.txt');
  stream.pipe(res);
});
```
Code level optimization
Merge queries: combine multiple queries into one to reduce the number of database round trips. A sketch, assuming a MongoDB-style API:

```js
// bad: one query per user id (N database round trips)
for (const userId of userIds) {
  const account = await userAccounts.findOne({ user_id: userId });
}

// good: one batched query, then in-memory lookups
const userAccountMap = {}; // note: this object can consume a lot of memory
const accounts = await userAccounts.find({ user_id: { $in: userIds } }).toArray();
for (const account of accounts) {
  userAccountMap[account.user_id] = account;
}
for (const userId of userIds) {
  const account = userAccountMap[userId];
}
```
Memory management optimization
In V8, memory is mainly divided into two generations: the new generation and the old generation:
- New generation: objects that live for a short time, i.e. newly created objects or objects that have survived only one garbage collection
- Old generation: objects that live for a long time, having survived one or more garbage collections
If the new generation runs out of space, objects are allocated directly into the old generation.
Reducing the memory footprint improves server performance. If there is a memory leak, a large number of objects accumulate in the old generation and server performance degrades badly, as in the following example:
```js
const leak = [];
const buffer = fs.readFileSync(__dirname + '/source/index.htm');

app.use(
  mount('/', async (ctx) => {
    ctx.status = 200;
    ctx.type = 'html';
    ctx.body = buffer;
    // every request appends another copy of the file that is never released
    leak.push(fs.readFileSync(__dirname + '/source/index.htm'));
  })
);
```
The leaked memory grows very large, which should be avoided; reducing memory usage is one way to improve service performance.
The best way to save memory is to use a pool, which stores frequently used, reusable objects and reduces creation and destruction.
For example, if an image-processing interface needs a helper object on every request, it is inappropriate to new such objects each time: under heavy load, frequent creation and destruction causes memory churn.
With an object pool mechanism, objects that would otherwise be created and destroyed frequently are kept in a pool; each use takes an idle object from the pool and reinitializes it, improving the framework's performance, as sketched below.
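A minimal sketch of an object pool; ImageProcessor is a hypothetical stand-in for whatever is expensive to create and destroy:

```js
// Hypothetical class with per-use state that can be cleared
class ImageProcessor {
  reset() { /* clear per-request state */ }
  process(image) { /* ... */ }
}

class ObjectPool {
  constructor(factory, size = 10) {
    this.factory = factory;
    this.pool = Array.from({ length: size }, () => factory());
  }
  acquire() {
    // Take an idle object, or create one if the pool is empty
    return this.pool.pop() || this.factory();
  }
  release(obj) {
    obj.reset(); // reinitialize before returning it to the pool
    this.pool.push(obj);
  }
}

const pool = new ObjectPool(() => new ImageProcessor(), 20);
const processor = pool.acquire();
// ... use processor for the current request ...
pool.release(processor);
```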
How do you understand using the Node middle layer for request merging and forwarding?
1. What is the middle layer?
Front end -> Node.js -> back end -> Node.js -> data processing -> front end
The advantage of this flow is that when there is a lot of business logic, or requirements change constantly, the front end does not need to change its business logic as much and stays loosely coupled to the back end. The front end displays and renders; the back end retrieves and stores data; the middle layer processes data structures and returns them to the front end as renderable structures.
Node.js plays the role of the middle layer: it processes data or renders pages according to the different requests of the client. During processing, it may pass lightly processed data down to the underlying Java side for real persistence or updates, or fetch data from below, process it simply, and return it to the client.
We usually divide the web realm into client and server, i.e. the front end and the back end; the back end includes gateways, static resources, interfaces, caches, databases, and so on. The middle layer is a layer carved out of the back end to handle the parts of the business closest to the client, such as page rendering (SSR), data aggregation, and interface forwarding.
Taking SSR as an example, rendering the page on the server side speeds up the user's first-screen load, avoids a blank screen while requests are in flight, and helps the site with SEO; its benefits are relatively easy to understand.
2. What the middle tier can do
- Proxies: In a development environment, we can use proxies to solve the most common cross-domain problems; In an online environment, we can use proxies to forward requests to multiple servers.
- Caching: Caching is actually a requirement closer to the front end. The user’s actions trigger data updates, and the node middle tier can handle some of the caching requirements directly.
- Rate limiting: the Node middle layer can apply rate limits per interface or per route.
- Logging: Compared to other server languages, Node middle layer logging makes it easier to locate problems (whether in the browser or server).
- Monitoring: good at high concurrency request processing, monitoring is also a suitable option.
- Authentication: Having an intermediate layer to authenticate is also the realization of a single responsibility.
- Routing: The front end needs to master the permissions and logic of page routing.
- Server-side rendering: Node middle tier solutions are more flexible, such as SSR, template straight out, using some JS libraries to do pre-rendering, etc.
3. Advantages of Node forwarding APIs (the Node middle tier)
- It can process data from the Java or PHP back end into formats that are friendlier to the front end
- This solves the cross-domain problem on the front end because server-side requests do not involve cross-domain, which is caused by the same origin policy of the browser
- Multiple requests can be merged through the middle tier to reduce front-end requests
4. How to perform request merging and forwarding
- Use the express middleware multifetch to merge requests in batches
- Use express together with http-proxy-middleware to implement interface proxy forwarding, as sketched below
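A minimal sketch with express + http-proxy-middleware (the /api prefix and target host are assumptions):

```js
const express = require('express');
const { createProxyMiddleware } = require('http-proxy-middleware');

const app = express();

// Requests to /api/* are forwarded to the back-end service
app.use('/api', createProxyMiddleware({
  target: 'http://backend.example.com', // assumed back-end host
  changeOrigin: true,                   // rewrite the Host header to the target
  pathRewrite: { '^/api': '' }          // strip the /api prefix before forwarding
}));

app.listen(3000);
```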
5. Manually implementing a Node.js proxy server, without third-party modules, to achieve request merging and forwarding
- Implementation approach
Use the createServer method of Node's http module to set up the proxy server and receive the request packets sent by the client; then use the request method of the http module to forward them to the target server, receive the target server's response, and return it to the client
- Implementation steps
Step 1: HTTP server setup
```js
const http = require('http');
const server = http.createServer();

server.on('request', (req, res) => {
  res.end('hello world');
});

server.listen(3000, () => {
  console.log('running');
});
```
Step 2: Receive the request packets sent by the client to the proxy server
```js
const http = require('http');
const server = http.createServer();

server.on('request', (req, res) => {
  // Receive the data sent by the client through req's data and end events,
  // merging the chunks with Buffer.concat
  let postbody = [];
  req.on('data', chunk => {
    postbody.push(chunk);
  });
  req.on('end', () => {
    const postbodyBuffer = Buffer.concat(postbody);
    res.end(postbodyBuffer);
  });
});

server.listen(3000, () => {
  console.log('running');
});
```
Because the client streams the body to the server, this step needs Node's Buffer handling: push every received chunk of data into an array, then merge them to restore the original data. The merge must use Buffer.concat rather than the plus operator, because + implicitly converts Buffers to strings, which is not safe for binary data.
Step 3: Use the request method of the http module to send the request packet to the target server
Step 2 obtained the data uploaded by the client, but the request headers are still missing, so this step constructs the request options from the client's original request and then sends it
```js
const http = require('http');
const server = http.createServer();

server.on('request', (req, res) => {
  // Drop headers that should not be forwarded and rebuild the request options
  const { connection, host, ...originHeaders } = req.headers;
  const options = {
    method: req.method,
    // A site found for testing; change the proxied host here
    hostname: 'www.nanjingmb.com',
    port: '80',
    path: req.url,
    headers: originHeaders
  };

  // Receive the data sent by the client
  const p = new Promise((resolve, reject) => {
    let postbody = [];
    req.on('data', chunk => {
      postbody.push(chunk);
    });
    req.on('end', () => {
      resolve(Buffer.concat(postbody));
    });
  });

  // Forward the data, receive the target server's response, and relay it
  p.then((postbodyBuffer) => {
    let responsebody = [];
    const request = http.request(options, (response) => {
      response.on('data', (chunk) => {
        responsebody.push(chunk);
      });
      response.on('end', () => {
        const responsebodyBuffer = Buffer.concat(responsebody);
        res.end(responsebodyBuffer);
      });
    });
    // Pass the request body using request's write method
    request.write(postbodyBuffer);
    // Use the end method to send the request
    request.end();
  });
});

server.listen(3000, () => {
  console.log('running');
});
```