Using `read` system call on a directory

猫巷女王i 2021-01-19 22:36

I was looking at an example in K&R 2 (8.6 Example - Listing Directories). It is a stripped-down version of the Linux command ls or Windows' dir. T

2 answers
  •  野的像风
    2021-01-19 23:20

    In Version 7 UNIX, there was only one Unix filesystem, and its directories had a simple on-disk format: an array of struct direct. Reading a directory and interpreting the result was trivial, so a dedicated syscall would have been redundant.

    In modern times there are many kinds of filesystems that can be mounted by Linux and other unix-like systems (ext4, ZFS, NTFS!), some of which have complex directory formats. You can't do anything sensible with the raw bytes of an arbitrary directory. So the kernel has taken on the responsibility of providing a generic interface to directories as abstract objects. readdir is the central piece of that interface.

    Some modern unices still allow read() on a directory, because it's part of their history. Linux history began in the 1990s, when it was already obvious that read() on a directory was never going to be useful, so Linux has never allowed it.
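
    To see this refusal in practice, here is a minimal sketch that opens a directory and calls read() on it. The helper name read_dir_errno is made up for this illustration. On Linux the read should fail with EISDIR; on other systems the result may differ, as noted above.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: try read() on the directory at `path`.
   Returns 0 if the read succeeded, otherwise the errno value. */
int read_dir_errno(const char *path)
{
    char buf[512];
    int fd = open(path, O_RDONLY);   /* opening a directory read-only is allowed */
    if (fd < 0)
        return errno;

    ssize_t n = read(fd, buf, sizeof buf);
    int err = (n < 0) ? errno : 0;   /* save errno before close() can change it */
    close(fd);
    return err;
}
```

    Calling read_dir_errno("/") on a Linux machine is expected to return EISDIR ("Is a directory").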

    Linux does provide a readdir syscall, but it's not used very much anymore, because something better has come along: getdents. readdir only returns one directory entry at a time, so if you use the readdir syscall in a loop to get a list of files in a directory, you enter the kernel on every loop iteration. getdents fills a caller-supplied buffer with as many entries as fit, so one call can return many entries.
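
    As a rough illustration of the batching, the sketch below calls getdents64 directly through syscall(), bypassing the C library's directory functions. It is Linux-only. The struct layout is copied from the getdents(2) man page, and the function name count_entries_getdents and the 4 KiB buffer size are choices made for this example, not anything from K&R or glibc.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Record layout documented in getdents(2); declared here because the
   libc headers do not necessarily export it. */
struct linux_dirent64 {
    uint64_t       d_ino;
    int64_t        d_off;
    unsigned short d_reclen;   /* total size of this record */
    unsigned char  d_type;
    char           d_name[];
};

/* Hypothetical helper: count the entries in `path` using raw getdents64.
   *filled receives the number of syscalls that returned data.
   Returns the entry count, or -1 on error. */
long count_entries_getdents(const char *path, long *filled)
{
    uint64_t raw[512];               /* 4 KiB buffer, aligned for the struct */
    char *buf = (char *)raw;
    long entries = 0;

    *filled = 0;
    int fd = open(path, O_RDONLY | O_DIRECTORY);
    if (fd < 0)
        return -1;

    for (;;) {
        long n = syscall(SYS_getdents64, fd, buf, sizeof raw);
        if (n < 0) {
            close(fd);
            return -1;
        }
        if (n == 0)                  /* end of directory */
            break;
        (*filled)++;
        /* One syscall may have returned many variable-length records. */
        for (long pos = 0; pos < n; ) {
            struct linux_dirent64 *d = (struct linux_dirent64 *)(buf + pos);
            entries++;
            pos += d->d_reclen;
        }
    }
    close(fd);
    return entries;
}
```

    For a small directory, *filled will typically be 1 while the return value is the full number of entries, which is exactly the saving over a one-entry-per-call interface.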

    readdir is, however, the standard interface, so glibc provides a readdir function that calls the getdents syscall instead of the readdir syscall. In an ordinary program you'll see readdir in the source code, but getdents in the strace output (on current systems usually its 64-bit variant, getdents64). The C library is helping performance by buffering, just like it does in stdio for regular files when you call getchar() and it does a read() of a few kilobytes at a time instead of a bunch of single-byte read()s.
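
    For comparison, this is roughly what the portable version looks like in source code, using the standard opendir/readdir/closedir functions. The helper name list_dir is invented for this sketch.

```c
#include <dirent.h>
#include <stdio.h>

/* Hypothetical helper: print every entry name in `path` to `out`.
   Returns the number of entries printed, or -1 if the directory
   could not be opened. */
long list_dir(const char *path, FILE *out)
{
    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    long count = 0;
    struct dirent *entry;
    /* Each call hands back one entry; the C library refills its
       internal buffer from the kernel only when it runs out. */
    while ((entry = readdir(dir)) != NULL) {
        fprintf(out, "%s\n", entry->d_name);
        count++;
    }
    closedir(dir);
    return count;
}
```

    Running a program that calls list_dir under strace should show the batched kernel calls described above rather than one syscall per entry.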

    You'll never use the original unbuffered readdir syscall on a modern Linux system unless you run an executable that was compiled a long time ago, or go out of your way to bypass the C library.
