I'm investigating a performance problem with Jetty 6.1.26. Jetty appears to use Transfer-Encoding: chunked, and depending on the buffer size used, this can be very
I believe I have found the answer myself, by looking through the Jetty source code. It's actually a complex interplay of the response buffer size, the size of the buffer passed to outStream.write, and whether or not outStream.flush is called (in some situations). The issue is with the way Jetty uses its internal response buffer, and how the data you write to the output is copied to that buffer, and when and how that buffer is flushed.
If the size of the buffer used with outStream.write is equal to the response buffer size (I think a multiple also works), or is smaller and outStream.flush is called after each write, then performance is fine: each write call is flushed straight to the output. However, when the write buffer is larger than the response buffer and not a multiple of it, this seems to cause some weirdness in how the flushes are handled, causing extra flushes and leading to bad performance.
In the case of chunked transfer encoding, there's an extra kink in the cable. For all but the first chunk, Jetty reserves 12 bytes of the response buffer to contain the chunk size. This means that in my original example with a 64KB write and response buffer, the actual amount of data that fit in the response buffer was only 65524 bytes, so again, parts of the write buffer were spilling into multiple flushes. Looking at a captured network trace of this scenario, I see that the first chunk is 64KB, but all subsequent chunks are 65524 bytes. In this case, outStream.flush makes no difference.
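To make the arithmetic concrete, here is a minimal sketch of how a 64KB write splits when only 65524 bytes of the buffer are usable. This is a simplified model I made up for illustration (the class and method names are hypothetical), not Jetty's actual buffering code, and it only uses the 12-byte figure observed above.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified model of the split; not Jetty's real implementation. */
final class ChunkSplitModel {
    /** Bytes assumed to be reserved for the chunk size, per the observation above. */
    static final int CHUNK_HEADER_RESERVE = 12;

    /**
     * Splits a single write of writeSize bytes into the pieces that fit
     * into a buffer that can hold usableCapacity bytes of data at a time.
     */
    static List<Integer> split(int writeSize, int usableCapacity) {
        List<Integer> pieces = new ArrayList<>();
        int remaining = writeSize;
        while (remaining > 0) {
            int piece = Math.min(remaining, usableCapacity);
            pieces.add(piece);
            remaining -= piece;
        }
        return pieces;
    }

    public static void main(String[] args) {
        int bufferSize = 64 * 1024;                        // 65536
        int usable = bufferSize - CHUNK_HEADER_RESERVE;    // 65524
        System.out.println(usable);                        // prints 65524
        System.out.println(split(bufferSize, usable));     // prints [65524, 12]
    }
}
```

So every 64KB write leaves a 12-byte remainder that does not fit, which matches the 65524-byte chunks in the network trace.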
When using a 4KB buffer I was seeing fast speeds only when outStream.flush was called. It turns out that resp.setBufferSize will only increase the buffer size, and since the default size is 24KB, resp.setBufferSize(4096) is a no-op. So I was writing 4KB pieces of data, which fit in the 24KB buffer even with the reserved 12 bytes, and are then flushed as a 4KB chunk by the outStream.flush call. When the call to flush is removed, however, Jetty lets the buffer fill up, again with 12 bytes spilling into the next chunk because 24KB is a multiple of 4KB.
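The 4KB case can be sketched the same way. The model below is my own simplification (hypothetical names, and it applies the 12-byte reserve to every chunk, including the first), so treat it as an illustration of the behaviour described above rather than what Jetty does internally.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a fixed-size response buffer; not Jetty's real code. */
final class ResponseBufferModel {
    static final int DEFAULT_SIZE = 24 * 1024; // default buffer size noted above
    static final int RESERVE = 12;             // bytes assumed reserved per chunk

    /** Models a setBufferSize that can only grow the buffer. */
    static int effectiveBufferSize(int requested) {
        return Math.max(requested, DEFAULT_SIZE);
    }

    /** Returns the sizes of the chunks emitted for a series of equal writes. */
    static List<Integer> chunkSizes(int bufferSize, int writeSize,
                                    int writes, boolean flushEachWrite) {
        int capacity = bufferSize - RESERVE;
        List<Integer> chunks = new ArrayList<>();
        int buffered = 0;
        for (int i = 0; i < writes; i++) {
            int remaining = writeSize;
            while (remaining > 0) {
                int n = Math.min(capacity - buffered, remaining);
                buffered += n;
                remaining -= n;
                if (buffered == capacity) { // buffer full: forced flush
                    chunks.add(buffered);
                    buffered = 0;
                }
            }
            if (flushEachWrite && buffered > 0) { // explicit flush()
                chunks.add(buffered);
                buffered = 0;
            }
        }
        if (buffered > 0) {
            chunks.add(buffered); // final flush when the response ends
        }
        return chunks;
    }

    public static void main(String[] args) {
        int size = effectiveBufferSize(4096);
        System.out.println(size);                             // prints 24576
        System.out.println(chunkSizes(size, 4096, 6, true));  // prints [4096, 4096, 4096, 4096, 4096, 4096]
        System.out.println(chunkSizes(size, 4096, 6, false)); // prints [24564, 12]
    }
}
```

With the flush, every write goes out as its own 4KB chunk. Without it, the sixth write no longer fits and 12 bytes are left over for the next chunk.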
It seems that to get good performance with Jetty, you must either:
- Call setContentLength (no chunked transfer encoding) and use a buffer for write that's the same size as the response buffer size.
- Or write pieces that are smaller than the response buffer (leaving room for the 12 reserved bytes when chunked transfer encoding is used) and call flush after each write.
Note that the performance of the "slow" scenario is still such that you'll likely only see the difference on the local host or a very fast (1Gbps or more) network connection.
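For the flush-after-each-write approach, the copy loop could look roughly like this. It is a sketch using only java.io; the class name is hypothetical, and in a servlet I would assume out is resp.getOutputStream() and writeSize is chosen to be at most resp.getBufferSize() minus the 12 reserved bytes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Copies a stream in fixed-size pieces, flushing after every write. */
final class FlushingCopy {
    /**
     * Reads from in and writes to out in pieces of at most writeSize bytes,
     * calling flush() after each write so that no piece straddles two buffers.
     * Returns the total number of bytes copied.
     */
    static long copy(InputStream in, OutputStream out, int writeSize)
            throws IOException {
        byte[] buf = new byte[writeSize];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
            out.flush(); // push this piece out before writing the next one
            total += n;
        }
        return total;
    }
}
```

Whether this actually helps depends on the container; I have only seen the effect with the Jetty version discussed here.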
I guess I should file issue reports against Hadoop and/or Jetty for this.