C++ POSIX sockets recv functionality

Question

I have a perhaps noobish question. I've looked around but haven't seen a direct answer to it, so I thought I might get a quick one here. In a simple TCP/IP client-server select loop using BSD sockets, if a client sends two messages that arrive simultaneously at a server, would one call to recv at the server return both messages bundled together in the buffer, or does recv force each distinct arriving message to be read separately?

I ask because I'm working in an environment where I can't tell how the client builds the messages it sends. Normally recv reports that 12 bytes are read, then 915, then 12 bytes, then 915, and so on in an alternating 12/915 pattern... but sometimes it reports 927 (which is 915+12). I was thinking that either the client is bundling some of its information together before sending it to the server, or the messages arrive before recv is invoked and recv then pulls all the pending bytes at once. So I wanted to make sure I understand recv's behavior properly. I think I'm missing something in my understanding, and I hope someone can point it out. Thanks!

Answer 1:

TCP/IP is a stream-based transport, not a datagram-based transport. In a stream, there is no 1-to-1 correlation between send() and recv(); that correlation exists only for datagrams. So, you have to be prepared to handle multiple possibilities:

a single call to send() may fit in a single TCP packet and be read in full by a single call to recv().
a single call to send() may span multiple TCP packets and need multiple calls to recv() to read everything.
multiple calls to send() may fit in a single TCP packet and be read in full by a single call to recv().
multiple calls to send() may span multiple TCP packets and require multiple calls to recv() to read everything.

To illustrate this, consider two messages being sent: send("hello", 5) and send("world", 5). The following are a few of the possible results when calling recv():

"hello" "world"
"hel" "lo" "world"
"helloworld"
"hel" "lo" "worl" "d"
"he" "llow" "or" "ld"

Get the idea? This is simply how TCP/IP works. Every application that uses TCP/IP has to account for this fragmentation.
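To observe this with the byte counts from the question, a minimal sketch along the following lines could log how many bytes each recv() call returns. The helper name recv_chunk_sizes is hypothetical, and a blocking stream socket is assumed; EINTR and other error handling are left out for brevity.

```cpp
#include <sys/socket.h>
#include <sys/types.h>
#include <cstddef>
#include <vector>

// Hypothetical helper: keep calling recv() until `expected` bytes have
// arrived, recording how many bytes each individual call returned.
// Assumes `fd` is a blocking stream socket.
std::vector<std::size_t> recv_chunk_sizes(int fd, std::size_t expected) {
    std::vector<std::size_t> sizes;
    char buf[4096];
    std::size_t total = 0;
    while (total < expected) {
        // Never ask for more than is still outstanding or than fits in buf.
        std::size_t want = expected - total;
        if (want > sizeof buf) want = sizeof buf;
        ssize_t n = recv(fd, buf, want, 0);
        if (n <= 0) break;  // 0 = peer closed the connection, -1 = error
        sizes.push_back(static_cast<std::size_t>(n));
        total += static_cast<std::size_t>(n);
    }
    return sizes;
}
```

For a 12-byte send() followed by a 915-byte send(), the returned sizes might be 12 and 915, a single 927, or any other split that adds up to 927. Only the total and the byte order are dependable.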

In order to receive data properly, there has to be a clear separation between logical messages, not individual calls to send(), as it may take multiple calls to send() to send a single message, and multiple recv() calls to receive a single message in full. So, taking the earlier example into account, let's add a separator between the messages:

send("hello\n", 6);

send("world", 5);
send("\n", 1);

On the receiving end, you would call recv() as many times as it takes until a \n character is received, then you would process everything you had received leading up to that character. If there is any read data left over when finished, save it for later processing and start calling recv() again until the next \n character, and so on.
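The delimiter approach could be sketched roughly as follows. The class name LineReader and its interface are illustrative assumptions, not part of any library; it assumes a blocking stream socket and treats any recv() failure (including EINTR) as the end of the stream.

```cpp
#include <sys/socket.h>
#include <sys/types.h>
#include <cstddef>
#include <string>

// Hypothetical helper: buffers bytes from a stream socket and hands back
// one '\n'-terminated message at a time.
class LineReader {
public:
    explicit LineReader(int fd) : fd_(fd) {}

    // Stores the next message (without its trailing '\n') in `out` and
    // returns true. Returns false if the peer closed the connection or
    // recv() failed before a complete message arrived.
    bool next(std::string& out) {
        for (;;) {
            // A complete message may already be buffered from an earlier recv().
            std::string::size_type pos = pending_.find('\n');
            if (pos != std::string::npos) {
                out = pending_.substr(0, pos);
                pending_.erase(0, pos + 1);  // keep leftover bytes for the next call
                return true;
            }
            char buf[4096];
            ssize_t n = recv(fd_, buf, sizeof buf, 0);
            if (n <= 0) return false;  // 0 = orderly close, -1 = error
            pending_.append(buf, static_cast<std::size_t>(n));
        }
    }

private:
    int fd_;
    std::string pending_;  // bytes received but not yet returned to the caller
};
```

Whether "hello\n" and "world\n" arrive in one recv() or in five, repeated calls to next() yield "hello" and then "world", because message boundaries come from the separator rather than from the recv() return sizes.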

Sometimes, it is not possible to place a unique character between messages (maybe the message body allows all characters to be used, so there is no distinct character available to use as a separator). In that case, you need to prefix each message with its length instead, as a preceding integer, a structured header, or something similar. Then you simply call recv() as many times as needed until you have received the full integer/header, then you call recv() as many times as needed to read exactly as many bytes as the length/header specifies. When finished, save any remaining data if needed, and start calling recv() all over again to read the next message length/header, and so on.
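A length-prefixed receiver might look something like the sketch below. The wire format (a 4-byte big-endian length followed by the body), the function names recv_exact and recv_message, and the max_len limit are all assumptions made for illustration; a real protocol defines its own header. A blocking stream socket is assumed.

```cpp
#include <arpa/inet.h>   // ntohl
#include <sys/socket.h>
#include <sys/types.h>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: loop on recv() until exactly `len` bytes have been
// stored in `dst`. Returns false if the peer closes or recv() fails first.
bool recv_exact(int fd, void* dst, std::size_t len) {
    char* p = static_cast<char*>(dst);
    while (len > 0) {
        ssize_t n = recv(fd, p, len, 0);
        if (n <= 0) return false;  // 0 = orderly close, -1 = error
        p += n;
        len -= static_cast<std::size_t>(n);
    }
    return true;
}

// Assumed wire format: 4-byte big-endian body length, then the body.
// `max_len` is an illustrative sanity limit, not part of any standard.
bool recv_message(int fd, std::string& out, std::uint32_t max_len = 1u << 20) {
    std::uint32_t net_len = 0;
    if (!recv_exact(fd, &net_len, sizeof net_len)) return false;
    std::uint32_t len = ntohl(net_len);  // network to host byte order
    if (len > max_len) return false;     // refuse implausibly large messages
    out.resize(len);
    return len == 0 || recv_exact(fd, &out[0], len);
}
```

Because this version never asks recv() for more bytes than the current header or body still needs, nothing is left over between messages. A buffered variant that reads larger chunks would have to keep the surplus for the next message, as described above.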

Answer 2:

It is definitely valid for both messages to be returned in a single recv call (see Nagle's Algorithm). TCP/IP guarantees order (the bytes from the messages won't be mixed). In addition to them being returned together in a single call, it is also possible for a single message to require multiple calls to recv (although it would be unlikely with packets as small as described).

Answer 3:

The only thing you can count on is the order of the bytes. You cannot count on how they are partitioned into recv calls. Sometimes things get merged, either at the endpoint or along the way. Things can also get broken up along the way and so arrive independently. It does sound like your sender is sending alternating 12-byte and 915-byte messages, but you can't count on receiving them in those sizes.

Source: https://stackoverflow.com/questions/14286558/c-posix-sockets-recv-functionality

Tags

sockets

recv