Question
I'm on a local LAN with only 8 connected computers on a Netgear 24-port gigabit switch. Network load is very low, and the send/receive buffers on all involved nodes (running Slackware 11) have been set to 16 MB. I'm also running tcpdump on each node to monitor the traffic.
A sending node sends a 10044-byte UDP datagram which, more often than not (about 3 times out of 4), never reaches the receiving application. In these cases tcpdump shows that the first x fragments are missing and only the last 3 (all with offsets > 0 and in order) are captured. The fragmented UDP datagram therefore cannot be reassembled and is most likely discarded.
I find the missing fragments strange, since I have also run a simple load test that bursts out 10000 UDP messages of the same size; the receiving application sends a response to each, and all tests so far have given 100% responses back.
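For reference, here is a rough sketch of how a datagram of this size would be split into IP fragments. It assumes a standard Ethernet MTU of 1500 bytes and a 20-byte IPv4 header without options; neither value was measured on this network, and `fragment_sizes` is just an illustrative helper name.

```python
def fragment_sizes(udp_payload_len, mtu=1500, ip_header_len=20, udp_header_len=8):
    """Return the number of IP payload bytes carried by each fragment.

    Assumptions (not taken from the captures): Ethernet MTU of 1500 and
    an IPv4 header without options.
    """
    # The UDP header is part of the IP payload and travels in the first fragment.
    total = udp_payload_len + udp_header_len
    # Every fragment except the last must carry a multiple of 8 bytes,
    # because the fragment offset field counts in 8-byte units.
    per_fragment = (mtu - ip_header_len) // 8 * 8
    sizes = []
    offset = 0
    while offset < total:
        size = min(per_fragment, total - offset)
        sizes.append(size)
        offset += size
    return sizes


sizes = fragment_sizes(10044)
print(len(sizes), sizes)  # 7 fragments: six of 1480 bytes and one of 1172 bytes
```

Under these assumptions, losing any one fragment is enough for the receiver to drop the whole datagram, since reassembly needs every piece.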
Any clues or hints?
Answer 1:
Update!
After resuming the testing of the above-mentioned software, I found a repeatable way of recreating the error.
Using windump on the sending Windows machine and tcpdump on the receiving machine, after leaving the application idle for some time (~5 minutes), I tried sending the UDP message, but only a single fragment was caught by windump and tcpdump; the 3 remaining fragments were lost. Sending the same message one more time works fine: both windump and tcpdump catch all 4 fragments and the application on the receiving side gets the message. The pattern is repeatable.
I started searching and found the following discussion, but to me it still does not give a clear answer.
http://www.eggheadcafe.com/software/aspnet/32856705/first-udp-message-to-a-sp.aspx
Re-examining the logs, I now notice the ARP request/reply being sent, which matches one of the ideas given in the link above.
NOTE! I filter windump on the sending side using: "dst host receivernode"
Capture from windump: first failed udp message, should be 4 fragments long
14:52:45.342266 arp who-has receivernode tell sendernode
14:52:45.342599 IP sendernode> receivernode : udp
Capture from windump: second udp message, exactly the same contents, all 4 fragments caught
14:52:54.132383 IP sendernode.10104 > receivernode .10113: UDP, length 6019
14:52:54.132397 IP sendernode> receivernode : udp
14:52:54.132406 IP sendernode> receivernode : udp
14:52:54.132414 IP sendernode> receivernode : udp
14:52:54.132422 IP sendernode> receivernode : udp
14:52:56.142421 arp reply sendernode is-at 00:11:11:XX:XX:fd (oui unknown)
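The ARP request in the first capture suggests one thing to try. If the sender's stack keeps only one packet queued per destination while ARP resolution is pending (an assumption based on the linked discussion, not something confirmed here), then sending a tiny datagram first would let the ARP entry resolve before the fragmented one goes out. A minimal sketch of that idea follows; `send_with_priming` and the `settle` delay are hypothetical, and the receiving application would have to ignore the 1-byte priming datagram.

```python
import socket
import time


def send_with_priming(sock, payload, addr, settle=0.05):
    """Send a 1-byte datagram, wait briefly, then send the real payload.

    Hypothetical workaround: the small datagram fits in a single packet,
    so it can trigger ARP resolution without any fragments being dropped.
    The settle delay is a guess and may need tuning.
    """
    sock.sendto(b"\x00", addr)   # priming datagram, not fragmented
    time.sleep(settle)           # give the ARP reply time to arrive
    return sock.sendto(payload, addr)  # number of payload bytes sent
```

It could be called as `send_with_priming(sock, data, ("receivernode", 10113))` with an ordinary `socket.SOCK_DGRAM` socket. This only hides the symptom; it does not explain why the fragments are lost.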
Does anyone have a good idea about what's happening? Please elaborate!
Source: https://stackoverflow.com/questions/1324819/missing-udp-fragments-when-monitoring-traffic-with-tcpdump