Question
Assume we have a file of FILE_SIZE bytes, and:
- FILE_SIZE <= min(page_size, physical_block_size);
- the file size never changes (i.e. truncate() or an appending write() is never performed);
- the file is modified only by completely overwriting its contents using:
pwrite(fd, buf, FILE_SIZE, 0);
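As a minimal sketch of the overwrite described above, the helper below issues the single pwrite() and treats a short write as a failure. The name overwrite_whole_file is hypothetical, and only the page-size half of the size premise is checked; the physical block size is a property of the underlying device and is not queried here.

```c
#define _XOPEN_SOURCE 700   /* for pwrite() under strict -std= modes */
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: overwrite the whole file with one pwrite() at offset 0.
 * Returns 0 on success, -1 on error or on a short write (errno is set).
 * Assumption: file_size must not exceed the page size; the physical block
 * size of the device is NOT checked. */
int overwrite_whole_file(int fd, const void *buf, size_t file_size)
{
    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0 || file_size > (size_t)page_size) {
        errno = EINVAL;
        return -1;
    }

    ssize_t n = pwrite(fd, buf, file_size, 0);
    if (n < 0)
        return -1;              /* errno already set by pwrite() */
    if ((size_t)n != file_size) {
        errno = EIO;            /* short write: the "complete overwrite" premise is broken */
        return -1;
    }
    return 0;
}
```

This only makes the premise explicit in code. Whether the resulting write is atomic or crash-safe is exactly what the question asks.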
Is it guaranteed on ext4 that:
1. Such writes are atomic with respect to concurrent reads?
2. Such writes are transactional with respect to a system crash? (i.e., after a crash the file's contents are entirely from some previous write, and we'll never see a partial write or an empty file)
Is the second true:
- with data=ordered?
- with data=journal, or alternatively with journaling enabled for a single file (using ioctl(fd, EXT4_IOC_SETFLAGS, EXT4_JOURNAL_DATA_FL))?
- when physical_block_size < FILE_SIZE <= page_size?
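For reference, here is a hedged sketch of how per-file data journaling might be switched on from userspace. It uses the generic FS_IOC_GETFLAGS, FS_IOC_SETFLAGS and FS_JOURNAL_DATA_FL names from <linux/fs.h>, which correspond to the EXT4_* names above. The flags ioctl takes a pointer to an int, so the current flags are read first and the journaling bit is ORed in. The helper name enable_data_journaling is hypothetical, and the call may fail without sufficient privilege or on filesystems that do not support inode flags.

```c
#include <linux/fs.h>    /* FS_IOC_GETFLAGS, FS_IOC_SETFLAGS, FS_JOURNAL_DATA_FL */
#include <sys/ioctl.h>

/* Hypothetical helper: request data journaling for the file behind fd.
 * Returns 0 on success, -1 with errno set otherwise (for example when the
 * filesystem has no inode flags or the caller lacks privilege). */
int enable_data_journaling(int fd)
{
    int flags = 0;

    /* Read the current inode flags so the other bits are preserved. */
    if (ioctl(fd, FS_IOC_GETFLAGS, &flags) < 0)
        return -1;

    flags |= FS_JOURNAL_DATA_FL;

    /* The flags are passed by pointer, not by value. */
    if (ioctl(fd, FS_IOC_SETFLAGS, &flags) < 0)
        return -1;

    return 0;
}
```

This should have the same effect as running chattr +j on the file.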
I've found a related question which links to a discussion from 2011. However:
- I didn't find an explicit answer to my question 2.
- I wonder, if the above is true, is it documented somewhere?
Answer 1:
In my experiment, it was not atomic.
The experiment used two processes, a writer and a reader. The writer overwrites the start of the file in a loop, while the reader reads it back in a loop.
Writer Process:
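/* Two 18-byte patterns; each literal fills the array exactly, so there is
   no NUL terminator. 18 visible bytes match the torn reads reported below. */
char buf[][18] = {
    "xxxxxxxxxxxxxxxxxx",
    "yyyyyyyyyyyyyyyyyy"
};
int i = 0;
while (1) {
    /* overwrite the first 18 bytes with one of the two patterns */
    pwrite(fd, buf[i], 18, 0);
    i = (i + 1) % 2;    /* alternate between the patterns */
}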
Reader Process:
while (1) {
    pread(fd, readbuf, 18, 0);
    /* check whether readbuf equals buf[0] or buf[1];
       anything else is a torn (non-atomic) read */
}
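The check left as a comment in the reader loop could be sketched as follows. The helper name is_complete_record and its parameters are hypothetical; it simply compares the bytes that were read against the two records the writer alternates between.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper for the reader's check.
 * Returns 1 if the len bytes in readbuf equal one of the two complete
 * records, 0 otherwise (a torn read, or unexpected content). */
int is_complete_record(const void *readbuf,
                       const void *rec_a, const void *rec_b, size_t len)
{
    return memcmp(readbuf, rec_a, len) == 0 ||
           memcmp(readbuf, rec_b, len) == 0;
}
```

In the reader loop this would be called as is_complete_record(readbuf, buf[0], buf[1], 18) after each pread(), reporting the buffer whenever it returns 0.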
After running both processes for a while, I could see that readbuf was sometimes xxxxxxxxxxxxxxxxyy or yyyyyyyyyyyyyyyyxx.
So it definitively shows that the writes are not atomic. In my case, 16-byte writes were always atomic.
The explanation I was given: POSIX doesn't mandate atomicity for writes/reads except for pipes. The 16-byte atomicity that I saw was kernel-specific and may change in the future.
The details are in the original post: write(2)/read(2) atomicity between processes in linux
Answer 2:
I am familiar with filesystem theory in general, but not with the implementation of ext4. Take this as an educated guess.
Yes, I believe single-sector reads and writes will be atomic, because:
- The link you provided says: "Currently concurrent reads/writes are atomic only wrt individual pages, however are not on the system call."
- Disk sector (512 bytes) writes are atomic according to Stephen Tweedie. In a private email conversation with him, he acknowledged that this guarantee is only as good as the hardware.
- Ext filesystems overwrite data in place; there is no copy-on-write and no new allocation.
- There is some effort to implement inline data, where the data of very small files can fit in the inode itself. If you only need to store a few bytes, that may have an impact.
I'm not sure about one page, but in full journaling mode it would make little sense to send less than a page to the journal before committing.
Source: https://stackoverflow.com/questions/32851672/is-overwriting-a-small-file-atomic-on-ext4