I have an update method which gets called about every 16-40ms, and inside I have this code:
this.fs.writeFile("./data.json", JSON.stringify({
totalPlay
fs.writeFile is not an atomic operation. Here is an example program which I will run strace on:
#!/usr/bin/env node
const { writeFile, } = require('fs');
// nodejs won't exit until the pending writeFile operation completes.
new Promise(function (resolve, reject) {
    writeFile('file.txt', 'content\n', function (err) {
        if (err) {
            reject(err);
        } else {
            resolve();
        }
    });
});
When I run that under strace -f and tidy up the output to show just the syscalls from the writeFile operation (which actually spans multiple IO threads), I get:
open("file.txt", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 9
pwrite(9, "content\n", 8, 0) = 8
close(9) = 0
As you can see, writeFile completes in three steps. My strace had each of these steps occurring on a different node IO thread. This suggests to me that fs.writeFile() might actually be implemented in terms of fs.open(), fs.write(), and fs.close(). Thus, nodejs does not treat this complex operation as atomic at any level, because it isn't. Therefore, if your node process terminates, even gracefully, without waiting for the operation to complete, the operation could be stopped at any of the steps above. In your case, you are seeing your process exit after writeFile() finishes step 1 but before it completes step 2.
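To make that concrete, here is a rough sketch of that decomposition (writeFileSketch is a hypothetical name of mine, and error handling is kept minimal):

const { open, write, close } = require('fs');

// Roughly what fs.writeFile(path, content, cb) appears to do internally;
// the process can exit between any two of these steps.
function writeFileSketch (path, content, cb) {
    open(path, 'w', 0o666, function (err, fd) {   // step 1: open(2)
        if (err) return cb(err);
        write(fd, content, function (err) {       // step 2: pwrite(2)
            if (err) return close(fd, function () { cb(err); });
            close(fd, cb);                        // step 3: close(2)
        });
    });
}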
The common pattern for transactionally replacing a file's contents with a POSIX layer is to use these two steps:

1. Write the data to a differently named file, fsync() the file, and then close() it.
2. rename() the differently named file over the original file.

Using this algorithm, the destination file is either updated or not regardless of when your program terminates. And, even better, journalled (modern) filesystems will ensure that, as long as you fsync() the file in step 1 before proceeding to step 2, the two operations will occur in order. I.e., if your program performs step 1 and then step 2 but you pull the plug, when you boot up you will find the filesystem in one of the following states:

- Neither step was performed. The original file is intact (or, if it never existed, it does not exist). The replacement file is either nonexistent (step 1 of the writeFile() algorithm, open(), effectively never succeeded), existent but empty (step 1 of the writeFile() algorithm completed), or existent with some data (step 2 of the writeFile() algorithm partially completed).
- Step 1 completed. The original file is intact and the replacement file contains all of the data you want.
- Both steps completed. At the original file's path you now find all of the replacement data, not a blank file, and the differently named file no longer exists.

The code to use this pattern might look like the following:
const { writeFile, rename, } = require('fs');

function writeFileTransactional (path, content, cb) {
    // The replacement file must be in the same directory as the
    // destination because rename() does not work across device
    // boundaries.

    // This simple choice of replacement filename means that this
    // function must never be called concurrently with itself for the
    // same path value. Also, properly guarding against other
    // processes trying to use the same temporary path would make this
    // function more complicated. If that is a concern, a proper
    // temporary file strategy should be used. However, this
    // implementation ensures that any files left behind during an
    // unclean termination will be cleaned up on a future run.
    let temporaryPath = `${path}.new`;
    writeFile(temporaryPath, content, function (err) {
        if (err) {
            return cb(err);
        }
        rename(temporaryPath, path, cb);
    });
}
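Note that writeFileTransactional skips the fsync() from step 1. If you also want the ordering guarantee described above, a variant along the following lines should work (just a sketch: the name writeFileTransactionalSynced is made up, and error handling is abbreviated):

const { open, write, fsync, close, rename } = require('fs');

// Sketch of the same pattern with the fsync() from step 1 included.
function writeFileTransactionalSynced (path, content, cb) {
    const temporaryPath = `${path}.new`;
    open(temporaryPath, 'w', function (err, fd) {
        if (err) return cb(err);
        write(fd, content, function (err) {
            if (err) return close(fd, function () { cb(err); });
            fsync(fd, function (err) {                   // flush data to disk first
                if (err) return close(fd, function () { cb(err); });
                close(fd, function (err) {
                    if (err) return cb(err);
                    rename(temporaryPath, path, cb);     // then atomically replace
                });
            });
        });
    });
}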
This is basically the same solution you'd use for the same problem in any language/framework.
I didn't run real tests on this; I just noticed, when manually reloading in my IDE, that the file was sometimes empty. What I tried first was the rename method, and I noted the same problem, but recreating a new file was less desirable (considering file watchers etc.).
My suggestion, or what I'm doing now, is to wrap your readFileSync: if the file is missing or the data returned is empty, sleep for 100 milliseconds before giving it another try. I suppose a third try with a longer delay would really push the sigma up a notch, but for now I'm not going to do it, as the added delay is hopefully an unnecessary negative (I would consider a promise at that point). There are other recovery options you can add relative to your own code, just in case. "File not found or empty?" is basically a retry by another route.
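A minimal sketch of what I mean, assuming a synchronous retry is acceptable (readFileRetrySync is just my name for it, and the Atomics.wait call is one way to sleep without going async):

const { readFileSync } = require('fs');

// Read a file, retrying after a short delay if it is missing or empty
// (e.g. when we catch it mid-rewrite).
function readFileRetrySync (path, retries = 1, delayMs = 100) {
    for (;;) {
        try {
            const data = readFileSync(path, 'utf8');
            if (data.length > 0) return data;            // got real contents
        } catch (err) {
            if (err.code !== 'ENOENT') throw err;        // only retry "not found"
        }
        if (retries-- <= 0) throw new Error(`file missing or empty: ${path}`);
        // Block synchronously for delayMs before the next attempt.
        Atomics.wait(new Int32Array(new SharedArrayBuffer(4)), 0, 0, delayMs);
    }
}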
My custom writeFileSync has an added flag to toggle between using the rename method (with a '._new' temporary write first) and the normal direct method, since your code's needs may vary. Choosing based on file size is my recommendation.
In this use case the files are small and only updated by one node instance/server at a time. I can see adding a random file name as another rename option, to allow multiple machines to write, for later if needed, as sketched below. Maybe a retry-limit argument as well?
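Something like this sketch is what I have in mind (every name here is mine; the random suffix is the multiple-writers option and useRename is the toggle flag):

const { writeFileSync, renameSync } = require('fs');
const { randomBytes } = require('crypto');

// Toggle between the rename method and the normal direct method.
function writeFileSyncCustom (path, data, { useRename = true } = {}) {
    if (!useRename) {
        writeFileSync(path, data);   // direct write; readers may catch it empty
        return;
    }
    // Random temporary name in the same directory keeps renameSync() atomic.
    const temporaryPath = `${path}._new_${randomBytes(6).toString('hex')}`;
    writeFileSync(temporaryPath, data);
    renameSync(temporaryPath, path);
}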
I was also thinking that you could write to a local temp file and then copy to the shared target by some means (maybe also a rename on the target for the speed increase), and then clean up (unlink the local temp) of course. I guess that idea is kind of pushing things toward shell commands, so not better. Anyway, the main idea here is still to read twice if the file is found empty. I'm sure it's safe from being partially written via nodejs 8+ onto a shared Ubuntu-type NFS mount, right?
If the error is caused by bad input (the data you want to write), then make sure the data is as it should be and then do the writeFile. If the error is caused by a failure of writeFile even though the input is OK, you could check that the function is executed until the file is written. One way is to use the async library's doWhilst function (the sketch below assumes async v3, where the truth test takes a callback, and data is whatever you want written):
const async = require('async');
const { writeFile, readFile } = require('fs');
async.doWhilst(
    function (cb) {
        // your function here; ignore err so a failure just loops again
        writeFile('./data.json', data, function () { cb(); });
    },
    function (cb) {
        // check that the file is not null: loop while missing or empty
        readFile('./data.json', function (err, buf) { cb(null, !!err || !buf.length); });
    },
    function (err) { /* here the file is not null */ }
);