Minimizing copies when writing large data to a socket

强颜欢笑 提交于 2019-12-03 03:45:52
Rajiv

It seems like my suspicions were correct. I got my information from this article. Quoting from it:

Also these network write system calls, including sendfile, might and in many cases do return before the data sent over TCP by the method call has been acknowledged. These methods return as soon as all data is written into the socket buffers (sk buff) and is pushed to the TCP write queue, the TCP engine can manage alone from that point on. In other words at the time sendfile returns the last TCP send window is not actually sent to the remote host but queued. In cases where scatter-gather DMA is supported there is no seperate buffer which holds these bytes, rather the buffers(sk buffs) just hold pointers to the pages of OS buffer cache, where the contents of file is located. This might lead to a race condition if we modify the content of the file corresponding to the data in the last TCP send window as soon as sendfile is returned. As a result TCP engine may send newly written data to the remote host instead of what we originally intended to send.

Provided the buffer from a mmapped file is even considered "DMA-able", seems like there is no way to know when it is safe to re-use it without an explicit acknowledgement (over the network) from the actual client. I might have to stick to simple write calls and incur the extra copy. There is a paper (also from the article) with more details.

Edit: This article on the splice call also shows the problems. Quoting it:

Be aware, when splicing data from a mmap'ed buffer to a network socket, it is not possible to say when all data has been sent. Even if splice() returns, the network stack may not have sent all data yet. So reusing the buffer may overwrite unsent data.

For cases 1 and 2 - does the operation you marked as // Store image data in this buffer require any conversion? Is it just plain copy from the memory to buf?

If it's just plain copy, you can use write directly on the pointer obtained from jemalloc.

Assuming that img is a pointer obtained from jemalloc and size is a size of your image, just run following code:

int result;
int sent=0;
while(sent<size) {
    result=write(socket,img+sent,size-sent);
    if(result<0) {
        /* error handling here */
        break;
    }
    sent+=result;
}

It is working correctly for blocking I/O (the default behavior). If you need to write a data in a non-blocking manner, you should be able to rework the code on your own, but now you have the idea.

For case 3 - sendfile is for sending data from one descriptor to another. That means you can, for example, send data from file directly to tcp socket and you don't need to allocate any additional buffer. So, if the image you want to send to a client is in a file, just go for a sendfile. If you have it in memory (because you processed it somehow, or just generated), use the approach I mentioned earlier.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!