- Source: dev.to/jorge_rockr…
- Author: Jorge Ramon
- Translator: May Jun, author of "Nodejs Technology Stack"
Node.js is currently one of the most popular technologies for building scalable and efficient REST APIs. It can also be used to build hybrid mobile applications, desktop applications, and even the Internet of Things.
I really like it and have been working with Node.js for 6 years. This article is intended to be the ultimate guide to how Node.js works.
The world before Node.js
Multithreaded server
Web applications are usually written following a client/server model, in which the client requests resources from the server and the server responds with them. The server responds only when the client asks for a resource and closes the connection after each response.
This pattern works because each request to the server requires time and resources (memory, CPU, and so on). The server must complete the previous request before it can accept the next.
So, the server processes only one request at a time? Not exactly: when the server receives a new request, that request is handled by a thread.
Put simply, a thread is the time and resources the CPU allocates to execute a small unit of instructions. With that in mind, the server handles multiple requests at a time, one per thread (this is also known as the thread-per-request model).
Note: Thread-per-request means one thread per request.
To process N requests simultaneously, the server needs N threads. If an (N+1)th request arrives, it must wait until one of those N threads becomes available.
In the multi-threaded server example, the server allows up to four requests (threads) at a time; when the next three requests arrive, they must wait until one of those four threads is free.
One way around this limitation is to add more resources (memory, CPU cores, and so on) to the server, but that may not be a good idea at all…
And, of course, there will always be technological limits.
Blocking I/O
The number of threads on the server is not the only problem here. Perhaps you are wondering why a single thread cannot handle two or more requests at the same time? The answer is blocking Input/Output (I/O) operations.
Suppose you are developing an online store application and it needs a page where users can view all of your products.
When a user visits yourstore.com/products, the server has to fetch all of your products from the database and render them into an HTML page. Easy, right?
But what happens next?

1. When a user visits /products, a specific method or function has to be executed to attend the request, so a small piece of code parses the requested URL and looks up the right method or function. The thread is working. ✔️
2. The method or function is executed, starting with its first line. The thread is working. ✔️
3. Since you are a good developer, you keep all your system logs in a file, and of course, to make sure the route executes the right method/function, you log a "Method X!!" string. This is a blocking I/O operation. The thread is waiting. ❌
4. The log is saved and the next lines are executed. The thread is working. ✔️
5. Now it is time to go to the database and get all the products, a simple query such as SELECT * FROM products does the job, but guess what? It is a blocking I/O operation. The thread is waiting. ❌
6. You get a list of all the products, but to be safe you log them as well. The thread is waiting. ❌
7. With those products it is time to render a template, but before rendering it you have to read the template file first. The thread is waiting. ❌
8. The template engine does its job and the response is sent to the client. The thread is working again. ✔️
9. The thread is free (idle), like a bird. 🕊️
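To make those steps concrete, here is a minimal sketch of what such a handler could look like if it were written with synchronous (blocking) calls. The db stand-in, the renderSync helper, and the file names are hypothetical, purely for illustration; the point is that every *Sync call keeps the thread waiting instead of serving other requests.

```javascript
const fs = require('fs');

// Hypothetical stand-ins for a database client and a template engine,
// just so the sketch runs; a real app would use actual libraries.
const db = { querySync: () => [{ id: 1, name: 'Keyboard' }] };
const renderSync = (template, data) =>
  template.replace('{{products}}', JSON.stringify(data.products));

// A blocking version of the /products handler: every *Sync call makes
// the thread sit and wait.
function handleProducts(response) {
  fs.appendFileSync('system.log', 'Method X!!\n');                   // step 3: blocking log write
  const products = db.querySync('SELECT * FROM products');           // step 5: blocking DB query
  fs.appendFileSync('system.log', `${products.length} products\n`);  // step 6: blocking log write
  const template = fs.readFileSync(`${__dirname}/products.html`, 'utf8'); // step 7: blocking file read
  response.end(renderSync(template, { products }));                  // step 8: render and respond
}
```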
How slow are I/O operations? It all depends.
Let’s check the following table:
| Operation | Number of CPU clock cycles (ticks) |
| --- | --- |
| CPU registers | 3 ticks |
| L1 cache | 8 ticks |
| L2 cache | 12 ticks |
| RAM (random access memory) | 150 ticks |
| Disk | 30,000,000 ticks |
| Network | 250,000,000 ticks |
Translator's note: a clock cycle, also called a tick or clock period, is the basic unit of time into which a processor's work is divided. To compare hardware, you can run the same program on different machines and look at the cycle time and the number of cycles consumed: the longer the total time (cycle time × number of cycles), the lower the hardware's performance.
Disk and network operations are painfully slow by comparison; assuming a 3 GHz clock, 30,000,000 ticks is roughly 10 ms and 250,000,000 ticks is roughly 80 ms. How many queries or external API calls does your system make?
In summary, I/O operations make threads wait and waste resources.
C10K problem
Back in the early 2000s, server and client machines were slow, and the problem was handling 10,000 concurrent client connections on a single server machine.
Why doesn’t our traditional “Thread-per-request” model solve this problem? Now let’s do some math.
Native thread implementations allocate roughly 1 MB of memory per thread, so 10K threads would need 10 GB of RAM, and remember, this was the early 2000s!!
Today, servers and clients have far more computing power, and almost any programming language or framework can deal with C10K. In practice, the bar has been raised to handling 10 million concurrent client connections on a single server, also known as the C10M problem.
JavaScript to the rescue?
Spoiler alert 🚨🚨🚨!!
Node.js solves this C10K problem… But why?
Server-side JavaScript was nothing new back in the 2000s; there were implementations on top of the Java Virtual Machine, such as RingoJS and AppEngineJS, but they were based on the thread-per-request model.
But if those could not solve the C10K problem, why can Node.js? Well, because it is single-threaded.
Node.js and the Event Loop
Node.js
Node.js is a server-side platform built on top of Google Chrome’s JavaScript engine (V8) that compiles JavaScript code into machine code.
Node.js is based on an event-driven, non-blocking I/O model, which makes it lightweight and efficient. It’s not a framework, it’s not a library, it’s a runtime.
A simple example:
```javascript
// Importing the native http module
const http = require('http');

// Creating a server instance; on every request
// the message 'Hello World' is sent back to the client
const server = http.createServer(function (request, response) {
  response.write('Hello World');
  response.end();
});

// Listening on port 8080
server.listen(8080);
```
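To try it out, save the snippet as index.js, run node index.js, and open http://localhost:8080 in a browser (or curl it); every request gets back the text 'Hello World'.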
Non-blocking I/O
Node.js uses non-blocking I/O, which means:
- The main thread is not blocked while I/O operations are in progress.
- The server keeps attending to requests.
- We work with asynchronous code.
Let's write an example: on every request to /home the server responds with an HTML page, and for any other request it responds with the text 'Hello World'. To respond with the HTML page, the file has to be read first.
home.html
```html
<html>
  <body>
    <h1>This is home page</h1>
  </body>
</html>
```
index.js
```javascript
const http = require('http');
const fs = require('fs');

const server = http.createServer(function (request, response) {
  if (request.url === '/home') {
    fs.readFile(`${__dirname}/home.html`, function (err, content) {
      if (!err) {
        response.setHeader('Content-Type', 'text/html');
        response.write(content);
      } else {
        response.statusCode = 500;
        response.write('An error has occurred');
      }
      response.end();
    });
  } else {
    response.write('Hello World');
    response.end();
  }
});

server.listen(8080);
```
If the requested URL is /home, we use the native fs module to read the home.html file.
The functions passed to http.createServer and fs.readFile are called callbacks. They are executed at some point in the future (the first when a request arrives, the second once the file has been read and buffered).
While the file is being read, Node.js can keep handling requests, and even read the file again for another request, all concurrently on a single thread… but how?!
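A tiny experiment (my addition, not from the original article) makes this visible: the statement after the fs.readFile call runs before the callback does, because the read happens in the background while the main thread keeps going.

```javascript
const fs = require('fs');

console.log('1. before readFile');

fs.readFile(`${__dirname}/home.html`, function (err, content) {
  // Executed later, once the file has been read in the background
  console.log('3. readFile callback fired');
});

console.log('2. after readFile, the main thread did not wait');
```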
The Event Loop
The Event Loop is the magic behind Node.js. In short, it is an infinite loop, and it runs on the one and only thread available.
Libuv is a C library that implements this pattern and is part of Node.js's core modules. You can read more about libuv in its documentation.
The event loop goes through six phases; one full pass through all of them is called a tick.
- Timers: callbacks scheduled by setTimeout() and setInterval() are executed in this phase.
- Pending callbacks: executes I/O callbacks that were deferred to the next loop iteration.
- Idle, prepare: used internally only.
- Poll: retrieves new I/O events and executes their callbacks (almost all of them, except close callbacks, timer callbacks, and setImmediate() callbacks); Node will block here when appropriate.
- Check: setImmediate() callbacks are invoked here.
- Close callbacks: callbacks for close events, such as socket.on('close', …).
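A small experiment (my own, not from the article) exposes this ordering: inside an I/O callback we are in the poll phase, so on the next turn of the loop the check phase (setImmediate) always runs before the timers phase.

```javascript
const fs = require('fs');

fs.readFile(__filename, () => {
  // We are inside an I/O callback, i.e. the poll phase, so on the next
  // pass of the loop the check phase runs before the timers phase:
  setImmediate(() => console.log('immediate')); // always printed first here
  setTimeout(() => console.log('timeout'), 0);
});
```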
OK, so there is only one thread, and that thread runs the Event Loop, but then who performs the I/O operations?
Note 📢📢📢!!!
When the Event Loop needs to perform I/O operations, it uses system threads from a pool (via the Libuv library), and when the job is complete, the callbacks are queued up to be executed in the “Pending Callbacks” phase.
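One way to watch that thread pool at work, a sketch of mine rather than the author's example, is crypto.pbkdf2, a CPU-heavy call that Node.js offloads to libuv's pool (4 threads by default, configurable via UV_THREADPOOL_SIZE):

```javascript
const crypto = require('crypto');

const start = Date.now();

// Four CPU-heavy hashes run concurrently on libuv's thread pool
// (4 threads by default), not on the main thread.
for (let i = 0; i < 4; i++) {
  crypto.pbkdf2('secret', 'salt', 100000, 64, 'sha512', () => {
    console.log(`hash ${i} done after ${Date.now() - start} ms`);
  });
}

console.log('main thread is still free');
```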
Isn’t that perfect?
The problem with CPU-intensive tasks
Node.js seems perfect, and you can build anything you want with it.
Let’s build an API to compute prime numbers.
A prime number is a natural number greater than 1 that cannot be divided evenly by any natural number other than 1 and itself.
Given a number N, the API must compute the first N prime numbers and return them in an array.
primes.js
```javascript
function isPrime(n) {
  for (let i = 2, s = Math.sqrt(n); i <= s; i++) {
    if (n % i === 0) return false;
  }
  return n > 1;
}

function nthPrime(n) {
  let counter = n;
  let iterator = 2;
  let result = [];

  while (counter > 0) {
    isPrime(iterator) && result.push(iterator) && counter--;
    iterator++;
  }

  return result;
}

module.exports = { isPrime, nthPrime };
```
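A quick sanity check in the Node REPL (my own example, not from the article): require('./primes').nthPrime(5) returns [ 2, 3, 5, 7, 11 ].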
index.js
```javascript
const http = require('http');
const url = require('url');
const primes = require('./primes');

const server = http.createServer(function (request, response) {
  const { pathname, query } = url.parse(request.url, true);

  if (pathname === '/primes') {
    const result = primes.nthPrime(query.n || 0);
    response.setHeader('Content-Type', 'application/json');
    response.write(JSON.stringify(result));
    response.end();
  } else {
    response.statusCode = 404;
    response.write('Not Found');
    response.end();
  }
});

server.listen(8080);
```
primes.js implements the prime-number logic: isPrime checks whether a given number is prime, and nthPrime returns the first N primes.
index.js creates the server and calls that library on every request to /primes, with the parameter n passed through the query string.
To get the first 20 primes, we make a request to http://localhost:8080/primes?n=20.
Suppose there are three clients accessing this amazing non-blocking API:
- The first one requests the first 5 primes every second.
- The second one requests the first 1,000 primes every second.
- The third one requests the first 10,000,000,000 primes, just once, but…
When the third client sends its request, the main thread gets blocked, because the primes library is heavily CPU-intensive. The main thread becomes so busy executing that code that it cannot do anything else, including serving the other two clients.
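A quick way to observe this on your own machine (a sketch of mine, not part of the original code): schedule a timer and then call nthPrime on the main thread; the timer cannot fire until the CPU-bound call returns.

```javascript
const { nthPrime } = require('./primes');

const start = Date.now();

// This timer should fire after roughly 100 ms...
setTimeout(() => {
  console.log(`timer fired after ${Date.now() - start} ms`);
}, 100);

// ...but this CPU-bound call monopolizes the main thread,
// so the timer callback is delayed until it returns.
nthPrime(100000);
```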
But what about libuv? If you are thinking of the library that hands Node.js's I/O work off to system threads so the main thread does not block, you are right. It could help us here, but to use libuv directly we would have to write C++ code.
Fortunately, Node.js v10.5 introduced worker threads.
Worker threads
As the documentation states:
Worker threads are useful for performing CPU-intensive JavaScript operations. They do not help much with I/O-intensive work. Node.js's built-in asynchronous I/O operations are more efficient than worker threads can be.
Modify the code
Let's fix our initial code:
primes-workerthreads.js
```javascript
const { workerData, parentPort } = require('worker_threads');

function isPrime(n) {
  for (let i = 2, s = Math.sqrt(n); i <= s; i++) {
    if (n % i === 0) return false;
  }
  return n > 1;
}

function nthPrime(n) {
  let counter = n;
  let iterator = 2;
  let result = [];

  while (counter > 0) {
    isPrime(iterator) && result.push(iterator) && counter--;
    iterator++;
  }

  return result;
}

parentPort.postMessage(nthPrime(workerData.n));
```
index-workerthreads.js
```javascript
const http = require('http');
const url = require('url');
const { Worker } = require('worker_threads');

const server = http.createServer(function (request, response) {
  const { pathname, query } = url.parse(request.url, true);

  if (pathname === '/primes') {
    const worker = new Worker('./primes-workerthreads.js', {
      workerData: { n: query.n || 0 }
    });

    worker.on('error', function () {
      response.statusCode = 500;
      response.write('Oops there was an error... ');
      response.end();
    });

    let result;
    worker.on('message', function (message) {
      result = message;
    });

    worker.on('exit', function () {
      response.setHeader('Content-Type', 'application/json');
      response.write(JSON.stringify(result));
      response.end();
    });
  } else {
    response.statusCode = 404;
    response.write('Not Found');
    response.end();
  }
});

server.listen(8080);
```
index-workerthreads.js creates a Worker instance on every request, loading and executing the primes-workerthreads.js file in a worker thread. When the list of primes has been computed, the message event is emitted and its value is assigned to result. When the job is done, the exit event is emitted, and at that point the main thread sends the data back to the client.
primes-workerthreads.js changes very little: it imports workerData (the parameters passed in from the main thread) and parentPort, which is how we send messages back to the main thread.
Now let's run the three-client scenario again and see what happens:
The main thread is no longer blocked 🎉🎉🎉🎉
It works as expected, but spawning a worker thread per request is not a best practice: creating a new thread is not cheap. Always create a pool of workers first.
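Here is a minimal sketch of such a pool (my own illustration, not code from the article). It assumes the worker script is rewritten to stay alive and listen for jobs with parentPort.on('message') instead of reading workerData once and exiting, and its error handling is deliberately naive.

```javascript
// pool.js, a hypothetical, minimal worker pool (illustration only).
const { Worker } = require('worker_threads');
const os = require('os');

class WorkerPool {
  constructor(script, size = os.cpus().length) {
    this.idle = [];   // workers waiting for a job
    this.queue = [];  // jobs waiting for a worker
    for (let i = 0; i < size; i++) this.idle.push(new Worker(script));
  }

  // Returns a promise that resolves with the worker's reply.
  run(data) {
    return new Promise((resolve, reject) => {
      this.queue.push({ data, resolve, reject });
      this.dispatch();
    });
  }

  dispatch() {
    if (!this.idle.length || !this.queue.length) return;

    const worker = this.idle.pop();
    const { data, resolve, reject } = this.queue.shift();

    const finish = (callback) => (value) => {
      worker.removeAllListeners('message');
      worker.removeAllListeners('error');
      this.idle.push(worker); // a real pool would replace a worker that errored
      callback(value);
      this.dispatch();        // pick up the next queued job, if any
    };

    worker.once('message', finish(resolve));
    worker.once('error', finish(reject));
    worker.postMessage(data);
  }
}

module.exports = { WorkerPool };

// The worker script would then be long-lived, roughly:
//
//   const { parentPort } = require('worker_threads');
//   parentPort.on('message', ({ n }) => parentPort.postMessage(nthPrime(n)));
```

On every request the server would then call something like pool.run({ n: query.n || 0 }) instead of constructing a new Worker.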
Conclusion
Node.js is a powerful technology that’s worth learning.
My advice: always be curious. If you know how things work, you will make better decisions.
That's all, folks. I hope you now know a little more about how Node.js works.
Thanks for reading, and I'll see you in the next article. ❤️