Race condition when using dup2

独厮守ぢ 2021-02-02 17:03

This manpage for the dup2 system call says:

EBUSY (Linux only) This may be returned by dup2() or dup3() during a race condition with open(2

2 answers
  •  慢半拍i (original poster)
    2021-02-02 17:44

    There is an explanation in fs/file.c, do_dup2():

    /*
     * We need to detect attempts to do dup2() over allocated but still
     * not finished descriptor.  NB: OpenBSD avoids that at the price of
     * extra work in their equivalent of fget() - they insert struct
     * file immediately after grabbing descriptor, mark it larval if
     * more work (e.g. actual opening) is needed and make sure that
     * fget() treats larval files as absent.  Potentially interesting,
     * but while extra work in fget() is trivial, locking implications
     * and amount of surgery on open()-related paths in VFS are not.
     * FreeBSD fails with -EBADF in the same situation, NetBSD "solution"
     * deadlocks in rather amusing ways, AFAICS.  All of that is out of
     * scope of POSIX or SUS, since neither considers shared descriptor
     * tables and this condition does not arise without those.
     */
    fdt = files_fdtable(files);
    tofree = fdt->fd[fd];
    if (!tofree && fd_is_open(fd, fdt))
        goto Ebusy;
    

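    It looks like EBUSY is returned when the target descriptor is in an incomplete state: it has already been allocated (its bit is set in open_fds, so fd_is_open() is true), but no struct file has been installed in the fd array yet because the open is still in progress.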
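To make the three possible states of a descriptor slot concrete, here is a minimal user-space sketch of the quoted check. It is only a model: `struct model_fdtable`, `model_fd_is_open()` and `model_dup2_target_check()` are hypothetical names invented for illustration, and the single 64-bit bitmap is a simplification of what the kernel really keeps.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_FDS 64

/* Hypothetical user-space stand-in for the kernel's struct fdtable. */
struct model_fdtable {
    unsigned long long open_fds;    /* bit n set: descriptor n is allocated */
    const void *fd[MODEL_MAX_FDS];  /* non-NULL: a file has been installed  */
};

/* Model of fd_is_open(): only looks at the allocation bitmap.
 * The caller must pass fd < MODEL_MAX_FDS. */
static bool model_fd_is_open(unsigned int fd, const struct model_fdtable *fdt)
{
    return ((fdt->open_fds >> fd) & 1ULL) != 0;
}

/* Mirrors only the quoted test from do_dup2().
 * Returns 0 if dup2() may proceed, -EBUSY for a half-opened target. */
static int model_dup2_target_check(const struct model_fdtable *fdt,
                                   unsigned int fd)
{
    const void *tofree = fdt->fd[fd];

    if (!tofree && model_fd_is_open(fd, fdt))
        return -EBUSY;  /* allocated, but nothing installed yet */
    return 0;           /* free slot, or a fully opened file to replace */
}
```

A slot that is free (bit clear, pointer NULL) or fully opened (bit set, pointer non-NULL) passes the check; only the combination "bit set, pointer NULL" is rejected.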

    EDIT (more info and do want bounty)

    To understand how !tofree && fd_is_open(fd, fdt) can happen, let's look at how files are opened. Here is a simplified version of do_sys_open():

    long do_sys_open(int dfd, const char __user *filename, int flags, umode_t mode)
    {
        int fd;
        struct file *f;

        /* ... irrelevant stuff */
        /* allocate the fd (sets its bit in open_fds), uses a lock */
        fd = get_unused_fd_flags(flags);
        /* HERE the race condition can arise if another thread calls dup2 on fd */
        /* ... the real VFS work that produces f is elided here ... */
        /* install f in the fd array for this fd, also uses a lock */
        fd_install(fd, f);
        /* ... irrelevant stuff again */
        return fd;
    }
    

    Basically, two very important things happen: a file descriptor is allocated, and only then is it actually opened by the VFS. Both operations modify the fdt of the process. Each of them takes a lock, so nothing bad is to be expected inside those two calls.

    To keep track of which fds have been allocated, the fdt uses a bit vector called open_fds. After get_unused_fd_flags() returns, the fd has been allocated and the corresponding bit is set in open_fds. The lock on the fdt has been released, but the real VFS work hasn't been done yet.

    At this precise moment, another thread (or another process, if the fdt is shared) can call dup2(), which will not block because the locks have been released. If dup2() took its normal path here, the fd would be replaced, but fd_install() would still run for the old file. Hence the check and the EBUSY return.
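For anyone who wants to poke at this from user space, the following is a rough sketch of how one might try to provoke the race with two threads sharing a descriptor table. The helper names `opener_thread()` and `count_dup2_ebusy()` are made up for this example, and using "/dev/null" is just an assumption about what is convenient to open. The window is very small, so the function may well report zero EBUSY results on a given machine or kernel; it shows the shape of the race, it does not guarantee hitting it.

```c
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdatomic.h>
#include <unistd.h>

static atomic_int stop_opener;

/* Keeps opening and closing a file so that the lowest free descriptor
 * repeatedly passes through the "allocated but not installed" state. */
static void *opener_thread(void *arg)
{
    (void)arg;
    while (!atomic_load(&stop_opener)) {
        int fd = open("/dev/null", O_RDONLY);
        if (fd >= 0)
            close(fd);
    }
    return NULL;
}

/* Calls dup2() `attempts` times onto the descriptor number the opener
 * thread is most likely to be allocating. Returns how many calls failed
 * with EBUSY, or -1 if the setup itself failed. */
static long count_dup2_ebusy(long attempts)
{
    pthread_t opener;
    long busy = 0;

    int source = open("/dev/null", O_RDONLY);
    if (source < 0)
        return -1;

    /* Find the lowest free descriptor number and free it again. */
    int target = open("/dev/null", O_RDONLY);
    if (target < 0) {
        close(source);
        return -1;
    }
    close(target);

    atomic_store(&stop_opener, 0);
    if (pthread_create(&opener, NULL, opener_thread, NULL) != 0) {
        close(source);
        return -1;
    }

    for (long i = 0; i < attempts; i++) {
        if (dup2(source, target) == target)
            close(target);   /* free the number again for the opener */
        else if (errno == EBUSY)
            busy++;          /* target was allocated but not yet installed */
    }

    atomic_store(&stop_opener, 1);
    pthread_join(opener, NULL);
    close(source);
    return busy;
}
```

Called as `count_dup2_ebusy(1000000)` from a small main() and built with `-pthread`, it returns the number of EBUSY results observed. Since EBUSY is documented as Linux only, other systems would be expected to report 0.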

    I found additional info on this race condition in the comments of fd_install() which confirms my explanation:

    /* The VFS is full of places where we drop the files lock between
     * setting the open_fds bitmap and installing the file in the file
     * array.  At any such point, we are vulnerable to a dup2() race
     * installing a file in the array before us.  We need to detect this and
     * fput() the struct file we are about to overwrite in this case.
     *
     * It should never happen - if we allow dup2() do it, _really_ bad things
     * will follow. */
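The "really bad things" can be illustrated with a small user-space model of what would happen if dup2() were allowed through on a reserved slot. Everything here (`struct toy_file`, `struct toy_files`, `toy_unchecked_dup2()`, `toy_fd_install()`, `toy_slots_pointing_at()`) is hypothetical and only mimics reference counting in the simplest possible way; it is not the kernel's code.

```c
#include <stddef.h>

#define TOY_MAX_FDS 8

/* Hypothetical stand-in for struct file: just a reference count. */
struct toy_file {
    int refcount;
};

/* Hypothetical stand-in for the descriptor table. */
struct toy_files {
    unsigned int open_fds;              /* allocation bitmap   */
    struct toy_file *fd[TOY_MAX_FDS];   /* installed files     */
};

/* dup2() WITHOUT the EBUSY check: takes a reference on src and stores
 * it in slot n, releasing whatever file was installed there before. */
static void toy_unchecked_dup2(struct toy_files *files,
                               struct toy_file *src, int n)
{
    struct toy_file *old = files->fd[n];

    src->refcount++;
    files->fd[n] = src;
    files->open_fds |= 1u << n;
    if (old)
        old->refcount--;
}

/* Second phase of open(): stores the file unconditionally, because it
 * assumes the slot it reserved earlier is still empty. */
static void toy_fd_install(struct toy_files *files, int n,
                           struct toy_file *file)
{
    files->fd[n] = file;
}

/* Counts how many table slots currently point at `file`. */
static int toy_slots_pointing_at(const struct toy_files *files,
                                 const struct toy_file *file)
{
    int count = 0;

    for (int n = 0; n < TOY_MAX_FDS; n++)
        if (files->fd[n] == file)
            count++;
    return count;
}
```

If a slot is reserved, `toy_unchecked_dup2()` fills it, and `toy_fd_install()` then overwrites it, the source file ends up with one more reference than there are slots pointing at it, and the caller of dup2() holds a descriptor that refers to a different file than the one it asked for.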
    
