Problem background
One morning, a colleague from operations reported in the group chat that many users were having trouble logging in. At first we suspected that the intranet interface service was failing, but that service had produced no error logs at all. In other words, the failing requests never reached it. So we logged in to the server and filtered the Node.js service logs:
The logs made the problem obvious: DNS resolution was failing.
Approach
Our Node.js service handles over 10 million requests a day, and each request needs to resolve N intranet interface domain names.
Under normal operation, a faulty DNS service or network jitter can make the online service unavailable even though both the Node.js service and the intranet interface service are healthy.
So we need a caching layer for DNS resolution on the Node.js side.
First of all, one thing needs to be clear: Node.js itself does not cache DNS query results.
Default DNS query scheme
Let’s first look at the default DNS lookup scheme:
Node.js's built-in http module uses dns.lookup() to resolve the hostname when http.request() is called:
- http.request() -> net.createConnection() -> dns.lookup()
function lookupAndConnect(self, options) {
  // ...
  const lookup = options.lookup || dns.lookup;
  defaultTriggerAsyncIdScope(self[async_id_symbol], function() {
    lookup(host, dnsopts, function emitLookup(err, ip, addressType) {
      self.emit('lookup', err, ip, addressType, host);
      // ...
    });
  });
}
The options.lookup option lets you replace the resolver: pass in any custom function that has the same signature as dns.lookup(), for example a wrapper around dns.resolve().
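As a minimal sketch of such a custom function, the snippet below pins one hostname to a fixed address and defers to dns.lookup() for everything else. The helper name createStaticLookup and the hostname api.intranet.example are hypothetical and exist only for illustration.

```javascript
const dns = require('dns');
const net = require('net');

// Hypothetical helper: returns a function with the same signature as dns.lookup().
function createStaticLookup(table) {
  return function staticLookup(hostname, options, callback) {
    // dns.lookup() may also be called as (hostname, callback).
    if (typeof options === 'function') {
      callback = options;
      options = {};
    }
    const address = table[hostname];
    if (address === undefined) {
      // Unknown host: defer to the default resolver.
      return dns.lookup(hostname, options, callback);
    }
    const family = net.isIP(address); // 4 or 6
    process.nextTick(() => {
      if (options && options.all) {
        // Callers that pass { all: true } expect an array of results.
        callback(null, [{ address, family }]);
      } else {
        callback(null, address, family);
      }
    });
  };
}

const lookup = createStaticLookup({ 'api.intranet.example': '127.0.0.1' });

// Usage (not executed here):
//   http.request({ host: 'api.intranet.example', lookup }, onResponse);
```

The callback is deferred with process.nextTick() so that the function stays asynchronous in both branches, as callers of dns.lookup() expect.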
The getaddrinfo function
dns.lookup() ultimately calls the underlying getaddrinfo() function.
In the C/C++ layer, getaddrinfo() is a synchronous call, so libuv runs it on its thread pool to provide Node.js with asynchronous I/O.
Note: the default size of the libuv thread pool is 4.
It can be changed with the UV_THREADPOOL_SIZE environment variable. In Node.js v14 the maximum is 1024.
Problems that might arise
When requests spend a long time in the DNS query phase, the default thread pool is too small for lookups to keep up with the incoming request rate. The longer the service runs, the larger the backlog of requests and connections grows.
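A back-of-the-envelope model makes the backlog concrete. The sketch below assumes every request performs exactly one lookup, that the pool is used for nothing else, and that every lookup takes the same time; the function names and the numbers are made up for illustration.

```javascript
// Rough model: lookups run in batches of `poolSize`, each taking `lookupMs`.
function estimateDrainMs(pendingLookups, lookupMs, poolSize = 4) {
  const batches = Math.ceil(pendingLookups / poolSize);
  return batches * lookupMs;
}

// Requests that arrive faster than the pool can resolve them pile up.
function backlogAfter(seconds, arrivalsPerSec, lookupMs, poolSize = 4) {
  const capacityPerSec = poolSize * (1000 / lookupMs);
  return Math.max(0, Math.round((arrivalsPerSec - capacityPerSec) * seconds));
}

console.log(estimateDrainMs(100, 2000)); // 25 batches * 2000 ms = 50000
console.log(backlogAfter(10, 50, 2000)); // capacity is 2/s, so (50 - 2) * 10 = 480
```

With 2-second lookups, four threads can only complete two lookups per second, so even a modest 50 requests per second leaves hundreds of lookups queued within seconds.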
About the default cache
- Node.js itself does not cache DNS query results; it queries the DNS server for every domain name request.
- When using a DNS cache, pay attention to the expiration time of cached entries.
Libraries for implementing a DNS cache
lookup-dns-cache
lookup-dns-cache is a very mature DNS cache library, although it is quite old.
Its idea is simple:
- The underlying query uses dns.resolve() instead of dns.lookup().
- Resolved hostname information is cached in a Map.
- Parallel DNS requests are avoided: only one query for the same hostname runs at a time, which is also tracked with a Map.
The difference between dns.resolve and dns.lookup
The official documentation describes it:
- dns.resolve does not use getaddrinfo().
- dns.resolve is implemented asynchronously: it always performs the DNS query over the network and does not use the libuv thread pool.
- dns.resolve does not read the local hosts file.
Details can be found at nodejs.org/dist/latest…
Sample code
const { resolve, lookup } = require('dns');

lookup('preview4.xx.xx.com', (err, address, family) => {
  console.log('address: %j address family: IPv%s', address, family);
  // address: "xxx.xxx.xx.xx" address family: IPv4
});

resolve('preview4.xx.xx.com', (err, records) => {
  console.log(records);
  // undefined
});
preview4.xx.xx.com is a domain name configured in the local hosts file.
Because dns.resolve() does not use getaddrinfo() and does not read the hosts file, the query fails and records is undefined.
Avoiding parallel requests
A Map tracks the hostnames that are currently being queried. When a query completes, its entry is removed from the Map.
let task = this._tasksManager.find(key);
if (task) {
  task.addResolvedCallback(callback);
} else {
  task = new ResolveTask(hostname, ipVersion);
  this._tasksManager.add(key, task);
  task.on('addresses', addresses => {
    this._addressCache.set(key, addresses);
  });
  task.on('done', () => {
    this._tasksManager.done(key);
  });
  task.addResolvedCallback(callback);
  task.run();
}
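The same idea can be sketched with promises instead of task objects. This is not the library's code: dedupeInFlight is a hypothetical helper, and a fake resolver stands in for a real one such as dns.promises.resolve4() so that the example runs without network access.

```javascript
// Hypothetical helper: wraps a promise-returning resolver so that concurrent
// calls for the same hostname share a single in-flight query.
function dedupeInFlight(resolveFn) {
  const inFlight = new Map(); // hostname -> pending Promise

  return function resolveOnce(hostname) {
    let pending = inFlight.get(hostname);
    if (!pending) {
      pending = Promise.resolve()
        .then(() => resolveFn(hostname))
        // Remove the entry when the query settles, whether it succeeded or failed.
        .finally(() => inFlight.delete(hostname));
      inFlight.set(hostname, pending);
    }
    return pending;
  };
}

// Fake resolver that counts how many real queries were issued.
let queries = 0;
const fakeResolve = async () => {
  queries += 1;
  return ['10.0.0.1'];
};
const resolveOnce = dedupeInFlight(fakeResolve);

const demo = Promise.all([
  resolveOnce('api.intranet.example'),
  resolveOnce('api.intranet.example'),
  resolveOnce('api.intranet.example'),
]).then((results) => {
  console.log(queries, results.length); // 1 3
  return results;
});
```

Three concurrent callers receive the same result while only one query is issued.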
TTL-based caching
- The dns.resolve methods (dns.resolve4() / dns.resolve6()) are called with ttl: true so that the DNS query result includes the TTL value.
- If the hostname is still in the cache and has not expired, the cached result is returned directly; otherwise a new query is made.
/**
 * @param {string} key
 * @returns {Address[]|undefined}
 */
find(key) {
  if (!this._cache.has(key)) {
    return;
  }
  const addresses = this._cache.get(key);
  if (this._isExpired(addresses)) {
    return;
  }
  return addresses;
}
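To show how the expiry check might fit together with the TTL values, here is a self-contained sketch. It is an illustration rather than the library's implementation: the TtlCache class, the injectable clock, and the choice to expire the whole entry at the smallest TTL are all assumptions.

```javascript
class TtlCache {
  constructor(now = Date.now) {
    this._now = now;         // injectable clock, convenient for tests
    this._cache = new Map(); // key -> { addresses, expiresAt }
  }

  // `records` has the shape produced by dns.resolve4(host, { ttl: true }, cb):
  // [{ address: '1.2.3.4', ttl: 60 }, ...] with ttl in seconds.
  set(key, records) {
    if (records.length === 0) return;
    const minTtl = Math.min(...records.map((r) => r.ttl));
    this._cache.set(key, {
      addresses: records.map((r) => r.address),
      expiresAt: this._now() + minTtl * 1000,
    });
  }

  find(key) {
    const entry = this._cache.get(key);
    if (!entry) return undefined;
    if (this._now() >= entry.expiresAt) {
      this._cache.delete(key); // expired: drop it so the caller queries again
      return undefined;
    }
    return entry.addresses;
  }
}

let clock = 0;
const cache = new TtlCache(() => clock);
cache.set('api.intranet.example', [
  { address: '10.0.0.1', ttl: 60 },
  { address: '10.0.0.2', ttl: 30 },
]);
clock = 29000;
console.log(cache.find('api.intranet.example')); // [ '10.0.0.1', '10.0.0.2' ]
clock = 30000;
console.log(cache.find('api.intranet.example')); // undefined
```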
cacheable-lookup
In practice, we found that dns.resolve() cannot resolve domain names configured in the local hosts file, so using lookup-dns-cache alone causes errors in the local development environment.
After some research, I found the cacheable-lookup library, written by szmarczak, the author of got.
As the commit history shows, the author is still keeping the library updated.
Comparison with lookup-dns-cache
async queryAndCache(hostname) {
  if (this._hostnamesToFallback.has(hostname)) {
    return this._dnsLookup(hostname, all);
  }

  let query = await this._resolve(hostname);

  if (query.entries.length === 0 && this._dnsLookup) {
    query = await this._lookup(hostname);

    if (query.entries.length !== 0 && this.fallbackDuration > 0) {
      // Use `dns.lookup(...)` for that particular hostname
      this._hostnamesToFallback.add(hostname);
    }
  }

  const cacheTtl = query.entries.length === 0 ? this.errorTtl : query.cacheTtl;
  await this._set(hostname, query.entries, cacheTtl);

  return query.entries;
}
If the resolve method returns no results, the library falls back to the lookup method. This fits our local development setup, where domain names are defined in the hosts file.
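Stripped of caching, the fallback logic can be sketched as follows. The function resolveWithFallback is hypothetical and much simpler than the library's code; the resolver and lookup functions are injectable so the demo can run offline with fakes.

```javascript
const dnsPromises = require('dns').promises;

// Hypothetical helper: try dns.resolve4() first, then fall back to dns.lookup()
// so that names defined only in the hosts file still resolve.
async function resolveWithFallback(
  hostname,
  resolver = (h) => dnsPromises.resolve4(h),
  lookup = (h) => dnsPromises.lookup(h, { all: true })
) {
  let addresses = [];
  try {
    addresses = await resolver(hostname);
  } catch (err) {
    // ENOTFOUND, ENODATA and similar: treat as "no records" and try the fallback.
  }
  if (addresses.length > 0) {
    return { addresses, source: 'resolve' };
  }
  const results = await lookup(hostname);
  return { addresses: results.map((r) => r.address), source: 'lookup' };
}

// Demo with fakes: the resolver fails, the hosts-file style lookup succeeds.
const demo = resolveWithFallback(
  'preview4.xx.xx.com',
  async () => {
    const err = new Error('queryA ENOTFOUND');
    err.code = 'ENOTFOUND';
    throw err;
  },
  async () => [{ address: '127.0.0.1', family: 4 }]
);
demo.then((result) => console.log(result));
// { addresses: [ '127.0.0.1' ], source: 'lookup' }
```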
The library also provides features such as TTL-based caching and prevention of parallel requests for the same hostname.
So it solves our problem!
The solution
We added cacheable-lookup as a DNS resolution cache to every place where the service calls an intranet interface.
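One possible way to wire this up at service startup is sketched below. The helper installDnsCache is hypothetical; it receives the CacheableLookup class as an argument so that the sketch does not depend on how the package is loaded (depending on the installed version, it may need import rather than require).

```javascript
const http = require('http');
const https = require('https');

// Hypothetical helper: creates one shared cache and installs it on the agents.
function installDnsCache(
  CacheableLookup,
  agents = [http.globalAgent, https.globalAgent]
) {
  const cacheable = new CacheableLookup();
  for (const agent of agents) {
    cacheable.install(agent); // connections made by this agent use the cached lookup
  }
  return cacheable;
}

// Typical wiring (assumes the cacheable-lookup package is installed):
//   const CacheableLookup = require('cacheable-lookup');
//   const cacheable = installDnsCache(CacheableLookup);
//
// A single request can also opt in explicitly through the lookup option:
//   http.get({ host: 'api.intranet.example', lookup: cacheable.lookup }, onResponse);
```

Installing the cache on the global agents covers every request that does not specify its own agent; requests that use a custom agent need the cache installed on that agent as well.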