What goes on behind the curtains during disk I/O?

后端 未结 2 1124
花落未央
花落未央 2020-12-02 23:41

When I seek to some position in a file and write a small amount of data (20 bytes), what goes on behind the scenes?

My understanding

To my k

相关标签:
2条回答
  • 2020-12-03 00:32

    Indeed, at least on my system with GNU libc, it looks like stdio is reading 4kB blocks before writing back the changed portion. Seems bogus to me, but I imagine somebody thought it was a good idea at the time.

    I checked by writing a trivial C program to open a file, write a small of data once, and exit; then ran it under strace, to see which syscalls it actually triggered. Writing at an offset of 10000, I saw these syscalls:

    lseek(3, 8192, SEEK_SET)                = 8192
    read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1808) = 1808
    write(3, "hello", 5)                    = 5
    

    Seems that you'll want to stick with the low-level Unix-style I/O for this project, eh?

    0 讨论(0)
  • 2020-12-03 00:33

    The C standard library functions perform additional buffering, and are generally optimized for streaming reads, rather than random IO. On my system, I don't observe the spurious reads that Jamey Sharp saw I only see spurious reads when the offset is not aligned to a page size - it could be that the C library always tries to keep its IO buffer aligned to 4kb or something.

    In your case, if you're doing lots of random reads and writes across a reasonably small dataset, you'd likely be best served using pread/pwrite to avoid having to make seeking syscalls, or simply mmaping the dataset and writing to it in memory (likely to be the fastest, if your dataset fits in memory).

    0 讨论(0)
提交回复
热议问题