1 Introduction
How to Watch for Files Changes in Node.js
If you want an off-the-shelf library, chokidar or node-watch is recommended. Read on to see how file watching works under the hood.
2 Overview
Using fs.watchFile
The built-in fs function watchFile seems to solve the problem:
const fs = require("fs");
fs.watchFile(dir, (curr, prev) => {});
However, you may notice that the callback fires with some delay. fs.watchFile detects changes by polling, so it does not respond in real time, and each call can watch only one file, which is inefficient.
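The polling behaviour can be sketched as follows. This is only an illustration: the helper names watchOneFile and wasModified are hypothetical, and the 500 ms interval is an arbitrary value. fs.watchFile accepts an options object with an interval in milliseconds and passes the current and previous fs.Stats objects to the listener; comparing mtimeMs skips polls in which the file was merely accessed.

```javascript
const fs = require("fs");

// Returns true when the modification time differs between two fs.Stats objects.
function wasModified(curr, prev) {
  return curr.mtimeMs !== prev.mtimeMs;
}

// Hypothetical helper: poll a single file and report modifications.
// The 500 ms default interval is an arbitrary value chosen for illustration.
function watchOneFile(filePath, onChange, intervalMs = 500) {
  const listener = (curr, prev) => {
    if (wasModified(curr, prev)) onChange(filePath, curr);
  };
  fs.watchFile(filePath, { interval: intervalMs }, listener);
  // Return a function that stops polling for this listener.
  return () => fs.unwatchFile(filePath, listener);
}
```

Each watched file needs its own poller, which is why this approach scales poorly to whole folders.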
Using fs.watch
A better choice is another built-in fs function, watch:
fs.watch(dir, (event, filename) => {});
fs.watch relies on the file change notification mechanisms provided by the operating system (inotify on Linux, FSEvents on macOS, and ReadDirectoryChangesW on Windows) and can also watch directories. When watching a folder, it is much more efficient than creating N fs.watchFile watchers.
$ node file-watcher.js
[2018-05-21T00:55:52.588Z] Watching for /button-presses.log
[2018-05-21T00:56:00.773Z] button-presses.log file Changed
[2018-05-21T00:56:00.793Z] button-presses.log file Changed
[2018-05-21T00:56:00.802Z] button-presses.log file Changed
[2018-05-21T00:56:00.813Z] button-presses.log file Changed
However, the log above shows that a single save fires the callback four times! The reason is that one save may trigger several writes at the system level. We do not need callbacks that sensitive: we usually think of one save as one change, and we do not care how many times the system writes the file underneath.
Therefore, we can additionally check whether the event type is change:
fs.watch(dir, (event, filename) => {
  if (filename && event === "change") {
    console.log(`${filename} file Changed`);
  }
});
This solves the problem to some extent, but the author found that Raspbian does not support rename events; if those changes are reported as change instead, this check becomes meaningless.
fs.watch relies on the APIs provided by each platform, so its behavior is not guaranteed to be consistent across platforms.
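Since the event name cannot be relied on everywhere, one option is to treat every event as a hint and inspect the file system directly. The sketch below assumes hypothetical helpers named classifyEntry and watchDirectory; it is not how any particular library does it.

```javascript
const fs = require("fs");
const path = require("path");

// Decide what happened by looking at the file system instead of the event name.
// Returns "removed" when the entry no longer exists, otherwise "present".
function classifyEntry(dir, filename) {
  const fullPath = path.join(dir, filename);
  return fs.existsSync(fullPath) ? "present" : "removed";
}

// Hypothetical helper: watch a directory and handle "rename" and "change" alike.
function watchDirectory(dir, onEvent) {
  const watcher = fs.watch(dir, (eventType, filename) => {
    // filename is not guaranteed on every platform, so guard before using it.
    if (!filename) return;
    onEvent({ eventType, filename, state: classifyEntry(dir, filename) });
  });
  return watcher; // call watcher.close() to stop watching
}
```

The callback still receives eventType for logging, but the decision is based on the state field, which is derived the same way on every platform.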
Optimization 1: Compare the file modification time
On top of fs.watch, we add a check on the modification time:
const path = require("path");

let previousMTime = new Date(0);
fs.watch(dir, (event, filename) => {
  if (filename) {
    // filename is relative to the watched directory, so resolve it first.
    const stats = fs.statSync(path.join(dir, filename));
    if (stats.mtime.valueOf() === previousMTime.valueOf()) {
      return;
    }
    previousMTime = stats.mtime;
    console.log(`${filename} file Changed`);
  }
});
We went from four log lines to three, but the problem remains. We consider a file modified only when its content changes, while the operating system takes more factors into account, so next we compare the file content itself.
A note from me: some open source editors may empty a file before writing to it, which also affects how many times the callback fires.
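When a whole folder is watched, a single previousMTime variable is shared by every file in it. A possible refinement, sketched here with a hypothetical helper named createMtimeFilter, keeps one timestamp per path and tolerates files that vanish between the event and the stat call.

```javascript
const fs = require("fs");

// Hypothetical helper: remembers the last seen modification time per path.
function createMtimeFilter() {
  const lastSeen = new Map(); // path -> mtimeMs

  // Returns true only when the file's mtime differs from the recorded one.
  return function hasNewMtime(filePath) {
    let mtimeMs;
    try {
      mtimeMs = fs.statSync(filePath).mtimeMs;
    } catch (err) {
      // The file may have been deleted or renamed after the event fired.
      lastSeen.delete(filePath);
      return false;
    }
    if (lastSeen.get(filePath) === mtimeMs) return false;
    lastSeen.set(filePath, mtimeMs);
    return true;
  };
}
```

Inside an fs.watch callback, the filter would be called with the full path of the changed file, and the event dropped whenever it returns false.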
Optimization 2: Compare the MD5 hash of the file
A change is considered to have happened only if the content of the file has changed.
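// md5 is assumed to be a hashing helper, for example the md5 npm package.
const path = require("path");

let md5Previous = null;
fs.watch(dir, (event, filename) => {
  if (filename) {
    // Hash the file that changed, resolved against the watched directory.
    const md5Current = md5(fs.readFileSync(path.join(dir, filename)));
    if (md5Current === md5Previous) {
      return;
    }
    md5Previous = md5Current;
    console.log(`${filename} file Changed`);
  }
});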
Now there are two log lines instead of three. Why is there still one extra? A likely reason is that the system fires several callbacks while the file is being saved, and one of them sees an intermediate state.
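The same content check can be written without a third-party hashing package by using Node's built-in crypto module. This is a sketch under that assumption; hashFile and createContentFilter are hypothetical names.

```javascript
const crypto = require("crypto");
const fs = require("fs");

// Hash the file content with Node's built-in crypto module.
function hashFile(filePath) {
  return crypto.createHash("md5").update(fs.readFileSync(filePath)).digest("hex");
}

// Hypothetical helper: remembers the last content hash per path.
function createContentFilter() {
  const lastHash = new Map(); // path -> hex digest

  // Returns true only when the content differs from the last recorded hash.
  return function hasNewContent(filePath) {
    const current = hashFile(filePath);
    if (lastHash.get(filePath) === current) return false;
    lastHash.set(filePath, current);
    return true;
  };
}
```

MD5 is used here only as a cheap fingerprint for change detection, not for anything security related. Reading the whole file on every event can be costly for large files.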
Optimization 3: Add a delay mechanism
We try ignoring further events for 100 ms after the first one, which may skip the intermediate states:
let fsWait = false;
fs.watch(dir, (event, filename) => {
  if (filename) {
    if (fsWait) return;
    fsWait = setTimeout(() => {
      fsWait = false;
    }, 100);
    console.log(`${filename} file Changed`);
  }
});
Now only one log line is printed. Many npm packages use a debounce function at this point to control how often the callback fires, and that is how they bring the frequency down to what the user expects.
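For comparison, a generic debounce can be sketched as below. Note the difference from the fsWait lock above: the lock reacts to the first event and ignores the rest for 100 ms, whereas a debounce waits until the burst has gone quiet and then runs once with the latest arguments. The function is a plain illustration, not taken from any specific package.

```javascript
// Generic debounce: fn runs once, waitMs after the last call in a burst.
function debounce(fn, waitMs) {
  let timer = null;
  return function debounced(...args) {
    // Every new call cancels the pending run and restarts the wait.
    if (timer !== null) clearTimeout(timer);
    timer = setTimeout(() => {
      timer = null;
      fn.apply(this, args);
    }, waitMs);
  };
}

// Usage sketch (fs and dir are assumed to be defined as in the snippets above):
// fs.watch(dir, debounce((event, filename) => {
//   if (filename) console.log(`${filename} file Changed`);
// }, 100));
```

Because the callback runs after the writes have settled, it is also less likely to observe an intermediate state of the file.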
In addition, we need to combine the MD5 check with the delay mechanism to get reasonably accurate results:
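// md5 is assumed to be a hashing helper, for example the md5 npm package.
const path = require("path");

let md5Previous = null;
let fsWait = false;
fs.watch(dir, (event, filename) => {
  if (filename) {
    if (fsWait) return;
    fsWait = setTimeout(() => {
      fsWait = false;
    }, 100);
    // Hash the file that changed, resolved against the watched directory.
    const md5Current = md5(fs.readFileSync(path.join(dir, filename)));
    if (md5Current === md5Previous) {
      return;
    }
    md5Previous = md5Current;
    console.log(`${filename} file Changed`);
  }
});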
3 Intensive Reading
The author discusses some basic ways to watch a folder. As you can see, fs.watch, which relies on each platform's native APIs, is not very reliable, but it is the only efficient built-in way to watch files, so a number of optimizations have to be built on top of it.
In real scenarios, you also need to distinguish folders from files and handle symbolic links and read and write permissions.
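A rough sketch of such a check, using only Node's fs module, might look like this. The helper names describeEntry and canAccess are hypothetical, and a real library would handle many more cases.

```javascript
const fs = require("fs");

// Returns true when the current process may access the path with the given mode.
// Note that fs.accessSync follows symbolic links.
function canAccess(entryPath, mode) {
  try {
    fs.accessSync(entryPath, mode);
    return true;
  } catch (err) {
    return false;
  }
}

// Hypothetical helper: report the type and permissions of a path.
function describeEntry(entryPath) {
  // lstat does not follow symbolic links, so the link itself can be detected.
  const stats = fs.lstatSync(entryPath);
  let type = "other";
  if (stats.isSymbolicLink()) type = "symlink";
  else if (stats.isDirectory()) type = "directory";
  else if (stats.isFile()) type = "file";

  return {
    type,
    readable: canAccess(entryPath, fs.constants.R_OK),
    writable: canAccess(entryPath, fs.constants.W_OK),
  };
}
```

A watcher could call this before subscribing, for example to skip unreadable entries or to decide whether a symbolic link should be followed.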
In addition, libraries used in production generally apply a delay of 50 to 100 milliseconds to deal with repeated firing.
Both chokidar and node-watch therefore make extensive use of the techniques described in this article, and add handling for edge cases, symbolic links, permissions, and so on, taking as many situations as possible into account to deliver more accurate callbacks.
For example, polling is also needed to determine whether a file write has finished:
function awaitWriteFinish(prevStat) {
  // ... omitted
  fs.stat(
    fullPath,
    function(err, curStat) {
      // ... omitted
      // The size changed since the last poll, so the write is still in progress.
      if (prevStat && curStat.size != prevStat.size) {
        this._pendingWrites[path].lastChange = now;
      }
      // The size has been stable for long enough: report the change.
      if (now - this._pendingWrites[path].lastChange >= threshold) {
        delete this._pendingWrites[path];
        awfEmit(null, curStat);
      } else {
        // Otherwise poll again, passing the current stat as the previous one.
        timeoutHandler = setTimeout(
          awaitWriteFinish.bind(this, curStat),
          this.options.awaitWriteFinish.pollInterval
        );
      }
    }.bind(this)
  );
  // ... omitted
}
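The idea in the excerpt above can be reduced to a standalone sketch: poll the file size and report only after it has stopped changing for a while. The function name waitForStableSize is hypothetical, the option names threshold and pollInterval merely mirror the excerpt, and the default values are arbitrary assumptions.

```javascript
const fs = require("fs");

// Hypothetical helper: resolves with the file's stats once its size has stayed
// the same for `threshold` ms, polling fs.stat every `pollInterval` ms.
function waitForStableSize(filePath, { threshold = 2000, pollInterval = 100 } = {}) {
  return new Promise((resolve, reject) => {
    let lastChange = Date.now();
    let prevSize = null;

    function poll() {
      fs.stat(filePath, (err, stats) => {
        if (err) return reject(err);
        const now = Date.now();
        if (prevSize !== null && stats.size !== prevSize) {
          lastChange = now; // still being written: restart the quiet period
        }
        prevSize = stats.size;
        if (now - lastChange >= threshold) {
          resolve(stats);
        } else {
          setTimeout(poll, pollInterval);
        }
      });
    }

    poll();
  });
}
```

A watcher would call this when the first event for a path arrives and emit its own change notification only after the promise resolves, at the cost of reporting every change at least threshold milliseconds late.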
As you can see, third-party npm libraries do not trust the operating system's callbacks; they rewrite the detection logic entirely on the basis of file information.
If we simply trust the operating system's callbacks, we cannot smooth over the differences between operating systems. Only by reimplementing the logic for detecting file writes, deletions, and modifications in a unified way can compatibility across all platforms be ensured.
4 Summary
Watching a folder for changes in Node.js is easy, but providing accurate callbacks is hard, for two main reasons:
- Smoothing out the differences between operating systems, which requires combining fs.watch with additional checks and a delay mechanism.
- Distinguishing what the operating system reports from what the user expects. For example, extra operations by the editor and multiple reads and writes by the operating system should be ignored. The user expects fewer notifications, so repeated triggers within a very short time span should be collapsed into one.
There are other factors to consider as well, such as compatibility, permissions, and symbolic links. fs.watch is not a production-ready API that works out of the box.
5 More Discussion
How to use Nodejs to listen to folders · Issue #87 · dt-fe/weekly
If you'd like to join the discussion, the issue above is the place to do it. A new topic is published every week, on the weekend or on Monday.