Is there really no asynchronous block I/O on Linux?

Posted by 隐身守侯 on 2019-11-28 16:43:53

The real answer, which was indirectly pointed to by Peter Teoh, is based on io_setup() and io_submit(). Specifically, the "aio_" functions indicated by Peter are part of the glibc user-level emulation based on threads, which is not an efficient implementation. The real answer is in:

io_submit(2)
io_setup(2)
io_cancel(2)
io_destroy(2)
io_getevents(2)

Note that the man page, dated 2012-08, says that this implementation has not yet matured to the point where it can replace the glibc user-space emulation:

http://man7.org/linux/man-pages/man7/aio.7.html

this implementation hasn't yet matured to the point where the POSIX AIO implementation can be completely reimplemented using the kernel system calls.

So, according to the latest kernel documentation I can find, Linux does not yet have a mature, kernel-based asynchronous I/O model. And, if I assume that the documented model is actually mature, it still doesn't support partial I/O in the sense of recv() vs read().

Peter Teoh

As explained in:

http://code.google.com/p/kernel/wiki/AIOUserGuide

and here:

http://www.ibm.com/developerworks/library/l-async/

Linux does provide async block I/O at the kernel level, with the following APIs:

aio_read     Request an asynchronous read operation
aio_error    Check the status of an asynchronous request
aio_return   Get the return status of a completed asynchronous request
aio_write    Request an asynchronous write operation
aio_suspend  Suspend the calling process until one or more asynchronous requests have completed (or failed)
aio_cancel   Cancel an asynchronous I/O request
lio_listio   Initiate a list of I/O operations
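To make the table above concrete, here is a minimal sketch of a single read issued through aio_read(), waited on with aio_suspend(), and collected with aio_error() and aio_return(). The helper name posix_aio_read_file() is made up for this illustration and is not part of any library. On older glibc versions you may need to link with -lrt.

```c
#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not a library function): read up to len bytes from
 * the start of path using POSIX AIO. Returns the byte count, or -1 with
 * errno set on failure. */
ssize_t posix_aio_read_file(const char *path, void *buf, size_t len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct aiocb cb;
    memset(&cb, 0, sizeof cb);
    cb.aio_fildes = fd;     /* file to read from */
    cb.aio_buf    = buf;    /* destination buffer */
    cb.aio_nbytes = len;    /* maximum bytes to read */
    cb.aio_offset = 0;      /* file offset of the read */

    /* Queue the request; this returns before the data has arrived. */
    if (aio_read(&cb) != 0) {
        close(fd);
        return -1;
    }

    /* Sleep until the request is no longer in progress. A real program
     * would do other work here instead of waiting immediately. */
    const struct aiocb *const list[1] = { &cb };
    while (aio_error(&cb) == EINPROGRESS)
        aio_suspend(list, 1, NULL);

    int err = aio_error(&cb);       /* 0 on success, else an errno value */
    ssize_t n = aio_return(&cb);    /* may be called only once per request */
    close(fd);

    if (err != 0) {
        errno = err;
        return -1;
    }
    return n;
}
```

Calling posix_aio_read_file("/etc/hostname", buf, sizeof buf) would fill buf and return the number of bytes read. Waiting right after submitting defeats the purpose of asynchronous I/O, of course; it is done here only to keep the sketch short.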

And if you ask who the users of these APIs are, it is the kernel itself; just a small subset is shown here:

./drivers/net/tun.c (for network tunnelling):
static ssize_t tun_chr_aio_read(struct kiocb *iocb, const struct iovec *iv,

./drivers/usb/gadget/inode.c:
ep_aio_read(struct kiocb *iocb, const struct iovec *iov,

./net/socket.c (general socket programming):
static ssize_t sock_aio_read(struct kiocb *iocb, const struct iovec *iov,

./mm/filemap.c (mmap of files):
generic_file_aio_read(struct kiocb *iocb, const struct iovec *iov,

./mm/shmem.c:
static ssize_t shmem_file_aio_read(struct kiocb *iocb,

etc.

At the userspace level, there is also the io_submit() etc. API (from glibc), but the following article offers an alternative to using glibc:

http://www.fsl.cs.sunysb.edu/~vass/linux-aio.txt

It implements functions like io_setup() directly as syscalls (bypassing any glibc dependency), so a kernel mapping with the same "__NR_io_setup" name should exist. Searching the kernel source at:

http://lxr.free-electrons.com/source/include/linux/syscalls.h#L474 (the URL applies to version 3.13, the latest at the time of writing) you find the kernel declarations of these io_*() system calls:

474 asmlinkage long sys_io_setup(unsigned nr_reqs, aio_context_t __user *ctx);
475 asmlinkage long sys_io_destroy(aio_context_t ctx);
476 asmlinkage long sys_io_getevents(aio_context_t ctx_id,
481 asmlinkage long sys_io_submit(aio_context_t, long,
483 asmlinkage long sys_io_cancel(aio_context_t ctx_id, struct iocb __user *iocb,

Later versions of glibc should make this use of "syscall()" to call sys_io_setup() unnecessary, but without the latest glibc you can always make these calls yourself, provided your kernel is recent enough to support "sys_io_setup()".
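As a rough sketch of calling these system calls yourself through syscall(), the following reads from a file using io_setup(), io_submit(), io_getevents(), and io_destroy(). The raw_io_*() wrappers and kaio_read_file() are names invented for this example. Keep in mind that kernel AIO is generally only truly asynchronous for files opened with O_DIRECT. This sketch uses a plain buffered open for simplicity, so io_submit() itself may block until the read completes.

```c
#include <fcntl.h>
#include <linux/aio_abi.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Thin, hypothetical wrappers around the raw system calls. */
static long raw_io_setup(unsigned nr_reqs, aio_context_t *ctx)
{
    return syscall(SYS_io_setup, nr_reqs, ctx);
}

static long raw_io_submit(aio_context_t ctx, long nr, struct iocb **iocbs)
{
    return syscall(SYS_io_submit, ctx, nr, iocbs);
}

static long raw_io_getevents(aio_context_t ctx, long min_nr, long nr,
                             struct io_event *events, struct timespec *timeout)
{
    return syscall(SYS_io_getevents, ctx, min_nr, nr, events, timeout);
}

static long raw_io_destroy(aio_context_t ctx)
{
    return syscall(SYS_io_destroy, ctx);
}

/* Hypothetical helper: read up to len bytes at offset from path via kernel
 * AIO. Returns the byte count, or a negative value on failure. */
ssize_t kaio_read_file(const char *path, void *buf, size_t len, off_t offset)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    aio_context_t ctx = 0;          /* must be zero before io_setup() */
    if (raw_io_setup(1, &ctx) < 0) {
        close(fd);
        return -1;
    }

    struct iocb cb;
    memset(&cb, 0, sizeof cb);
    cb.aio_lio_opcode = IOCB_CMD_PREAD;
    cb.aio_fildes     = fd;
    cb.aio_buf        = (uint64_t)(uintptr_t)buf;
    cb.aio_nbytes     = len;
    cb.aio_offset     = offset;

    struct iocb *list[1] = { &cb };
    struct io_event ev;
    ssize_t result = -1;

    /* Submit one request, then wait for exactly one completion event. */
    if (raw_io_submit(ctx, 1, list) == 1 &&
        raw_io_getevents(ctx, 1, 1, &ev, NULL) == 1)
        result = (ssize_t)ev.res;   /* bytes read, or a negated errno */

    raw_io_destroy(ctx);
    close(fd);
    return result;
}
```

A real application would keep the context alive, submit many requests at once, and reap completions in batches rather than setting up and tearing down a context per read.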

Of course, there are other userspace options for asynchronous I/O (e.g., using signals?):

http://personal.denison.edu/~bressoud/cs375-s13/supplements/linux_altIO.pdf

or perhaps:

What is the status of POSIX asynchronous I/O (AIO)?

"io_submit" and friends are still not available in glibc (see the io_submit man page), which I have verified on my Ubuntu 14.04 system; note that this API is Linux-specific.

Others such as libuv, libev, and libevent also provide asynchronous APIs:

http://nikhilm.github.io/uvbook/filesystem.html#reading-writing-files

http://software.schmorp.de/pkg/libev.html

http://libevent.org/

All of these APIs aim to be portable across BSD, Linux, Mac OS X, and even Windows.

In terms of performance I have not seen any numbers, but I suspect libuv may be the fastest because it is so lightweight?

https://ghc.haskell.org/trac/ghc/ticket/8400

(2019) If you're using a 5.1 or above kernel you can use the io_uring interface for file-like I/O and get excellent asynchronous operation.

Compared to the existing libaio/KAIO interface, io_uring has the following advantages:

  • Works with buffered AND direct I/O
  • Easier to use
  • Can optionally work in a polled manner
  • Less bookkeeping space overhead per I/O
  • Lower CPU overhead due to fewer userspace/kernel syscall context switches (a big deal these days due to the impact of spectre/meltdown mitigations)
  • Doesn't become blocking each time the stars aren't perfectly aligned

Compared to glibc's POSIX AIO, io_uring has the following advantages:

The "Efficient IO with io_uring" document goes into far more detail as to io_uring's benefits and usage.

I'm not quite sure "support partial I/O in the sense of recv() vs read()" makes much sense for file-based I/O. Ideally you're asking for read I/O in sizes the disk can actually do, so the buffer is either completely ready or not ready at all (or the benefit of doing re-assembly yourself isn't worth the effort because the disk is so fast).

Obviously at the time of writing the io_uring interface is very new but hopefully it will usher in a better asynchronous file-based I/O story for Linux.

For network socket I/O, when the socket is "ready", the operation doesn't block. That is what O_NONBLOCK and "ready" mean.
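A small sketch of that readiness model, using names made up for the example (set_nonblocking() and read_when_ready()): with O_NONBLOCK set, read() on a descriptor that has no data fails with EAGAIN instead of blocking, and once poll() reports the descriptor readable, the read returns immediately. The helpers take any descriptor, so they should work the same way for a socket or a pipe.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: turn on O_NONBLOCK for fd. Returns 0 or -1. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Hypothetical helper: wait up to timeout_ms for fd to become readable,
 * then read from it. Returns the byte count, 0 on end of file, or -1 on
 * error; a timeout is reported as -1 with errno set to EAGAIN. */
ssize_t read_when_ready(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };

    int rc = poll(&pfd, 1, timeout_ms);
    if (rc < 0)
        return -1;              /* poll() itself failed */
    if (rc == 0) {
        errno = EAGAIN;         /* nothing became ready in time */
        return -1;
    }

    /* The descriptor is "ready", so this read does not block. */
    return read(fd, buf, len);
}
```

This readiness approach is of little use for regular files, which poll() always reports as readable, and that is why disk I/O needs the separate mechanisms discussed in this thread.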

For disk I/O, we have POSIX AIO, Linux AIO, sendfile, and friends.
