I have a self-hosted WCF service (.NET Framework 4) that is exposed through an HttpTransport-based custom binding. The binding uses a custom MessageEncoder.
Buffered: the entire file is held in memory before uploading/downloading. This approach is useful for transferring small files securely.
Streamed: the file is transferred in chunks.
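The transfer mode is chosen on the binding. A minimal config sketch (the binding name and size limit here are illustrative, not from your setup):

```xml
<system.serviceModel>
  <bindings>
    <basicHttpBinding>
      <!-- hypothetical binding name; Streamed avoids buffering whole messages -->
      <binding name="streamedFileTransfer"
               transferMode="Streamed"
               maxReceivedMessageSize="1073741824" />
    </basicHttpBinding>
  </bindings>
</system.serviceModel>
```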
Since you use GZipMessageEncodingBindingElement, I assume you are using the MS GZip sample. Have a look at DecompressBuffer() in GZipMessageEncoderFactory.cs and you will understand what's going on in buffered mode.
For the sake of example, let's say you have a message with an uncompressed size of 50M and a compressed size of 25M.
DecompressBuffer will receive an 'ArraySegment<byte> buffer' parameter of (1) 25M. The method will then create a MemoryStream and uncompress the buffer into it, using (2) 50M. Then it will do a MemoryStream.ToArray(), copying the memory stream's buffer into a new (3) 50M byte array. Then it takes another byte array from the BufferManager of AT LEAST (4) 50M — in reality it can be a lot more; in my case it was always 67M for a 50M array.
At the end of DecompressBuffer, (1) will be returned to the BufferManager (which seems to never get cleared by WCF); (2) and (3) are subject to GC (which is asynchronous — if you allocate faster than the GC collects, you might get OOM exceptions even though there would be enough memory once cleaned up); (4) will presumably be given back to the BufferManager in your BinaryMessageEncodingBindingElement.ReadMessage().
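Paraphrasing the sample's buffered path (a sketch from memory, not the exact sample code — names and the inner structure are approximate), the four allocations look roughly like this:

```csharp
// Sketch of the allocation pattern in DecompressBuffer from the MS GZip sample.
// The numbered comments correspond to the four allocations described above.
static ArraySegment<byte> DecompressBuffer(
    ArraySegment<byte> buffer, BufferManager bufferManager)
{
    // (1) 'buffer' wraps the ~25M compressed input, taken from the BufferManager
    var compressedStream = new MemoryStream(buffer.Array, buffer.Offset, buffer.Count);

    // (2) decompress into a second MemoryStream: ~50M of uncompressed data
    var decompressedStream = new MemoryStream();
    using (var gzip = new GZipStream(compressedStream, CompressionMode.Decompress))
        gzip.CopyTo(decompressedStream);

    // (3) ToArray copies the stream's internal buffer into a new ~50M byte[]
    byte[] decompressedBytes = decompressedStream.ToArray();

    // (4) a pooled array of AT LEAST 50M — often larger (e.g. 67M for 50M of data)
    byte[] bufferFromManager = bufferManager.TakeBuffer(decompressedBytes.Length);
    Array.Copy(decompressedBytes, 0, bufferFromManager, 0, decompressedBytes.Length);

    // (1) goes back to the pool here; (2) and (3) are left to the GC
    bufferManager.ReturnBuffer(buffer.Array);

    return new ArraySegment<byte>(bufferFromManager, 0, decompressedBytes.Length);
}
```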
To sum up: for your 50M message, the buffered scenario will temporarily take up 25 + 50 + 50 + e.g. 65 = 190M of memory. Some of it is subject to asynchronous GC, and some of it is managed by the BufferManager, which — worst case — means it keeps lots of unused arrays in memory that are neither usable in a subsequent request (e.g. too small) nor eligible for GC. Now imagine you have multiple concurrent requests: the BufferManager will create separate buffers for all of them, which will never be cleaned up unless you manually call BufferManager.Clear() — and I don't know of a way to do that with the buffer managers used by WCF. See also this question: How can I prevent BufferManager / PooledBufferManager in my WCF client app from wasting memory?
Update: After migrating to IIS7 HTTP compression (wcf conditional compression), memory consumption, CPU load and startup time dropped (I don't have the numbers handy). After then migrating from buffered to streamed TransferMode (How can I prevent BufferManager / PooledBufferManager in my WCF client app from wasting memory?), memory consumption of my WCF client app dropped from 630M (peak) / 470M (continuous) to 270M (both peak and continuous)!
I've had some experience with WCF and streaming.
Basically, if you don't set the TransferMode to Streamed, it defaults to Buffered. So if you are sending large pieces of data, it's going to build up the data on your end in memory and then send it once all the data is loaded and ready to go. This is why you were getting out-of-memory errors: the data was very large, more than your machine's memory.
Now if you use Streamed, it'll immediately start sending chunks of data to the other endpoint instead of buffering it all up, making memory usage very minimal.
But this doesn't mean that the receiver is also set up for streaming. It could be set up to buffer, and it will experience the same problem the sender did if it does not have sufficient memory for your data.
For the best results, both endpoints should be set up to handle streaming (for large data files).
Typically, for streaming, you use MessageContracts instead of DataContracts, because it gives you more control over the SOAP structure.
See these MSDN articles on MessageContracts and DataContracts for more info. And here is more info about Buffered vs. Streamed transfer.
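To make the MessageContract point concrete, a streamed upload contract typically pairs a MessageContract with a single Stream body member, pushing metadata into SOAP headers. A sketch (the service, message and member names are illustrative):

```csharp
[ServiceContract]
public interface IFileTransferService
{
    // A streamed operation is limited to a single Stream body parameter,
    // which a MessageContract wrapper satisfies
    [OperationContract]
    void Upload(FileUploadMessage request);
}

[MessageContract]
public class FileUploadMessage
{
    // Metadata travels in the SOAP headers, so the body stays a pure stream
    [MessageHeader(MustUnderstand = true)]
    public string FileName;

    // The single body member: WCF streams this as the SOAP body
    [MessageBodyMember(Order = 1)]
    public Stream FileData;
}
```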
I think (and I might be wrong) that restricting users to a single Stream parameter in operation contracts that use Streamed transfer mode comes from the fact that WCF puts the stream data in the body section of the SOAP message and starts transferring it as the user reads the stream. So I think it would have been difficult for them to multiplex an arbitrary number of streams in a single data flow. For example, suppose you have an operation contract with three Stream parameters, and three different threads on the client start reading from these three streams. How could you do that without some algorithm and extra programming to multiplex these three different data flows (which WCF currently lacks)?
As for your other question, it's hard to tell what is actually going on without seeing your complete code. But I think that by using gzip you are actually compressing all the message data into a byte array and handing it over to WCF; on the client side, when the client asks for the SOAP message, the underlying channel opens a stream to read the message, and the WCF channel for streamed transfer starts streaming the data as if it were the body of the message.
Anyway, you should note that setting the MessageBodyMember attribute just tells WCF that this member should be streamed as the SOAP body; when you use a custom encoder and binding, what the outgoing message looks like is mostly your choice.
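For reference, the streamed path of a custom encoder goes through the Stream overload of MessageEncoder.WriteMessage. A fragment showing how a GZip wrapper might handle it (this omits the other abstract members of MessageEncoder and assumes a hypothetical wrapped inner encoder):

```csharp
public class GZipMessageEncoder : MessageEncoder
{
    // hypothetical inner encoder (e.g. the text/binary encoder being wrapped)
    private readonly MessageEncoder inner;

    public GZipMessageEncoder(MessageEncoder inner) { this.inner = inner; }

    // Streamed mode: WCF calls this overload and the bytes are written to the
    // transport as the receiver reads, so the compressed message never has to
    // sit in one large buffer
    public override void WriteMessage(Message message, Stream stream)
    {
        using (var gzip = new GZipStream(stream, CompressionMode.Compress, true))
            inner.WriteMessage(message, gzip);
    }

    // ... ContentType, MediaType, ReadMessage and the buffered WriteMessage
    // overloads are also required but omitted from this sketch
}
```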