my (network) client sends 50 to 100 KB data packets every 200ms to my server. there\'re up to 300 clients. Server sends nothing to client. Server (dedicated) and clients are in
I did a study on this a couple of years ago wth 1700 data points. The conclusion was that the single best thing you can do is configure an enormous socket receive buffer (e.g. 512k) at the receiver. Do that to the listening socket, so it will be inherited by the accepted sockets, so it will already be set while they are handshaking. That in turn allows TCP window scaling to be negotiated during the handshake, which allows the client to know about the window size > 64k. The enormous window size basically lets the client transmit at the maximum possible rate, subject only to congestion avoidance rather than closed receive windows.
What OS? IPv4 or v6? Why so large of a dump ; why can't it be broken down?
Assuming a solid, stable, low bandwidth:delay prod, you can adjust things like inflight sizing, initial window size, mtu (depending on the data, IP version, and mode[tcp/udp].
You could also round robin or balance inputs, so you have less interrupt time from the nic .. binding is an option as well..
5MB /packet/? That's a pretty poor design .. I would think it'd lead to a lot of segment retrans's , and a LOT of kernel/stack mem being used in sequence reconstruction / retransmits (accept wait time, etc)..
(Is that even possible?)
Since all clients are in LAN, you might try enabling "jumbo frames" (need to run a netsh command for that, would need to google for the precise command, but there are plenty of how-tos).
On the application layer, you could use TransmitFile, which is the Windows sendfile equivalent and which works very well under Windows Server 2003 (it is artificially rate-limited under "non server", but that won't be a problem for you). Note that you can use a memory mapped file if you generate the data on the fly.
As for tuning parameters, increasing the send buffer will likely not give you any benefit, though increasing the receive buffer may help in some cases because it reduces the likelihood of packets being dropped if the receiving application does not handle the incoming data fast enough. A bigger TCP window size (registry setting) may help, as this allows the sender to send out more data before having to block until ACKs arrive.
Yanking up the program's working set quota may be worth a consideration, it costs you nothing and may be an advantage, since the kernel needs to lock pages when sending them. Being allowed to have more pages locked might make things faster (or might not, but it won't hurt either, the defaults are ridiculously low anyway).