I am using zlib to compress a stream of text data. The text data comes in chunks, and for each chunk deflate()
is called with flush set to Z_NO_FLUSH.
While looking at the sources for a hint, I came across:
/* =========================================================================
* Flush as much pending output as possible. All deflate() output goes
* through this function so some applications may wish to modify it
* to avoid allocating a large strm->next_out buffer and copying into it.
* (See also read_buf()).
*/
local void flush_pending(strm)
z_streamp strm;
{
unsigned len = strm->state->pending;
...
Tracing the use of flush_pending() throughout deflate() shows that an upper bound on the needed output buffer in the middle of the stream is
strm->state->pending + deflateBound(strm, strm->avail_in)
The first part accounts for data still in the pipe from previous calls to deflate(); the second part accounts for the not-yet-processed data of length avail_in.
deflateBound() is helpful only if you do all of the compression in a single step, or if you force deflate to compress all of the input data currently available to it and emit compressed data for all of that input. You would do that with a flush parameter such as Z_BLOCK, Z_PARTIAL_FLUSH, etc.
If you want to use Z_NO_FLUSH, then it becomes far more difficult, as well as inefficient, to attempt to predict the largest amount of output deflate() might emit on the next call. You don't know how much of the input had been consumed when the last burst of compressed data was emitted, so you have to assume almost none of it, and the buffer size grows unnecessarily. However you attempt to estimate the maximum output, you will end up doing a lot of needless mallocs or reallocs, which is inefficient.
There is no point in trying to avoid calling deflate() again for more output. If you simply loop on deflate() until it has no more output for you, then you can use a fixed output buffer malloced once. That is how the deflate() and inflate() interface was designed to be used. You can look at http://zlib.net/zlib_how.html for a well-documented example of how to use the interface.
By the way, zlib 1.2.6, the latest version as of this writing, adds a deflatePending() function that tells you how much output deflate() has waiting to be delivered.