Using `read` system call on a directory

问题

I was looking at an example in K&R 2 (8.6 Example - Listing Directories). It is a stripped down version of Linux command ls or Windows' dir. The example shows an implementation of functions like opendir, readdir. I've tried and typed the code word-by-word but it still doesn't work. All it does is that it prints the a dot (for the current directory) and exits.

One interesting thing I found in the code (in the implementation of readdir) was that it was calling the system calls like open and read on directory. Something like -

int fd, n;
char buf[1000], *bufp;

bufp = buf;
fd = open("dirname", O_RDONLY, 0);
n = read(fd, bufp, 1000);
write(fd, bufp, n);

When I run this code I get no output even when the folder name "dirname" has some files in it.

Also, the book says, that the implementation is for Version 7 and System V UNIX systems. Is that the reason why it is not working on Linux?

Here is the code- http://ideone.com/tw8ouX.

So does Linux not allow read system calls on directories? Or something else is causing this?

回答1:

In Version 7 UNIX, there was only one unix filesystem, and its directories had a simple on-disk format: array of struct direct. Reading it and interpreting the result was trivial. A syscall would have been redundant.

In modern times there are many kinds of filesystems that can be mounted by Linux and other unix-like systems (ext4, ZFS, NTFS!), some of which have complex directory formats. You can't do anything sensible with the raw bytes of an arbitrary directory. So the kernel has taken on the responsibility of providing a generic interface to directories as abstract objects. readdir is the central piece of that interface.

Some modern unices still allow read() on a directory, because it's part of their history. Linux history began in the 90's, when it was already obvious that read() on a directory was never going to be useful, so Linux has never allowed it.

Linux does provide a readdir syscall, but it's not used very much anymore, because something better has come along: getdents. readdir only returns one directory entry at a time, so if you use the readdir syscall in a loop to get a list of files in a directory, you enter the kernel on every loop iteration. getdents returns multiple entries into a buffer.

readdir is, however, the standard interface, so glibc provides a readdir function that calls the getdents syscall instead of the readdir syscall. In an ordinary program you'll see readdir in the source code, but getdents in the strace. The C library is helping performance by buffering, just like it does in stdio for regular files when you call getchar() and it does a read() of a few kilobytes at a time instead of a bunch of single-byte read()s.

You'll never use the original unbuffered readdir syscall on a modern Linux system unless you run an executable that was compiled a long time ago, or go out of your way to bypass the C library.

回答2:

In fact Linux dosn't allow read for directories. See man page and search for errno EISDIR. You will find

The read() and pread() functions shall fail if ...

The fildes argument refers to a directory and the implementation does not allow the directory to be read using read() or pread(). The readdir() function should be used instead.

. Other UNIXes allow it nevertheless.

来源：https://stackoverflow.com/questions/17618472/using-read-system-call-on-a-directory

标签

operating-system