Is there a problem with IO.Compression?

Posted by 放肆的年华 on 2019-12-10 18:33:54

Question


I've just started compressing files in VB.Net, using the following code. Since I'm targeting Fx 2.0, I can't use the Stream.CopyTo method.

My code, however, gives extremely poor results compared to the gzip Normal compression profile in 7-Zip. For example, my code turned a 630MB Outlook archive into a 740MB file, whereas 7-Zip brings it down to 490MB.

Here is the code. Is there a blatant mistake (or many)?

Using Input As New IO.FileStream(SourceFile, IO.FileMode.Open, IO.FileAccess.Read, IO.FileShare.Read)
    Using outFile As IO.FileStream = IO.File.Create(DestFile)
        Using Compress As IO.Compression.GZipStream = New IO.Compression.GZipStream(outFile, IO.Compression.CompressionMode.Compress)
            'TODO: Figure out the right buffer size.
            Dim Buffer(512 * 1024 - 1) As Byte '512 KB; a VB array bound is the last index, not the length
            Dim ReadBytes As Integer = 0

            While True
                ReadBytes = Input.Read(Buffer, 0, Buffer.Length)
                If ReadBytes <= 0 Then Exit While
                Compress.Write(Buffer, 0, ReadBytes)
            End While
        End Using
    End Using
End Using

I've tried with multiple buffer sizes, but I get similar compression times, and exactly the same compression ratio.


Answer 1:


EDIT, or actually rewrite: It looks like the BCL coders decided to phone it in.

The implementation in System.dll version 2.0 uses statically defined, hardcoded Huffman trees optimized for plain ASCII text, rather than adaptively generating the Huffman trees as other implementations do. It also doesn't support the stored-block optimization (which is how standard GZip/Deflate avoids runaway expansion on incompressible data). As a result, running anything other than plain text through their implementation can produce a file much larger than the input, and Microsoft claims this is by design!
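The expansion described above is easy to check for yourself. The following is a minimal sketch, not part of the original answer: it compresses one megabyte of pseudo-random (and therefore essentially incompressible) bytes in memory and prints both sizes. The module name, the GZipSize helper and the fixed seed are illustrative assumptions, and the numbers you see will depend on which framework version you run it on.

```vbnet
Imports System
Imports System.IO
Imports System.IO.Compression

Module ExpansionCheck
    ' Compresses data in memory and returns the size of the gzip output in bytes.
    ' (GZipSize is a hypothetical helper name used only for this sketch.)
    Function GZipSize(ByVal data As Byte()) As Long
        Using output As New MemoryStream()
            ' leaveOpen := True keeps the MemoryStream usable after the
            ' GZipStream is disposed; disposing it flushes the final block.
            Using gz As New GZipStream(output, CompressionMode.Compress, True)
                gz.Write(data, 0, data.Length)
            End Using
            Return output.Length
        End Using
    End Function

    Sub Main()
        ' 1 MB of pseudo-random bytes stands in for an already-compressed
        ' or otherwise incompressible file.
        Dim data(1024 * 1024 - 1) As Byte
        Dim rng As New Random(42)
        rng.NextBytes(data)

        Console.WriteLine("Input: {0} bytes, gzip output: {1} bytes", _
                          data.Length, GZipSize(data))
    End Sub
End Module
```

If the output is noticeably larger than the input, you are seeing the behaviour this answer describes; a compressor that falls back to stored blocks should add only a small overhead.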

Save yourself some pain, grab a third party implementation.




Answer 2:


IO.Compression wasn't really made for us. It was created to support XPS, the XML Paper Specification. Currently you have to use a third-party library if you want decent file compression.




Answer 3:


Some additional information that may be useful. I was compressing some static files (binary) to include in a project release and had the same issue where the file size increased with IO.Compression.GZipStream.

I decided to use Ionic.Zip instead, since it lets you select the best compression level.
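As a rough sketch of what that can look like (not the exact code I used), the following compresses a file with the GZipStream from the Ionic.Zlib namespace of the third-party DotNetZip library at its BestCompression level. It assumes your project references that library; the module name, method name, parameter names and the 64 KB chunk size are placeholders.

```vbnet
Imports System.IO

Module IonicCompress
    ' Compresses sourceFile into destFile as gzip using the third-party
    ' Ionic.Zlib.GZipStream (DotNetZip). The names here are illustrative.
    Sub CompressFile(ByVal sourceFile As String, ByVal destFile As String)
        Using inFile As New FileStream(sourceFile, FileMode.Open, FileAccess.Read, FileShare.Read)
            Using outFile As FileStream = File.Create(destFile)
                ' Fully qualified so it cannot be confused with
                ' System.IO.Compression.GZipStream.
                Using gz As New Ionic.Zlib.GZipStream(outFile, _
                        Ionic.Zlib.CompressionMode.Compress, _
                        Ionic.Zlib.CompressionLevel.BestCompression)
                    ' Manual copy loop, since Stream.CopyTo is not available on Fx 2.0.
                    Dim chunk(64 * 1024 - 1) As Byte
                    Dim readBytes As Integer = inFile.Read(chunk, 0, chunk.Length)
                    While readBytes > 0
                        gz.Write(chunk, 0, readBytes)
                        readBytes = inFile.Read(chunk, 0, chunk.Length)
                    End While
                End Using
            End Using
        End Using
    End Sub
End Module
```

The loop is the same as in the question; only the stream type and the explicit compression level differ.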

One thing I noticed immediately is that, although Ionic.Zip reduced my files to 25% of their original size, compression was about 3-4 times slower (totally expected), and decompression was also about 3 times slower: 1.6 seconds compared to 0.5 seconds.

Since gzip is a standard format, a file written by one implementation can be read by the other. And although the built-in IO.Compression.GZipStream in .NET was far less space-efficient when compressing, it was far faster when decompressing.

So I use both: the Ionic.Zip library's "Ionic.Zlib.GZipStream" to compress the files, and "IO.Compression.GZipStream" to decompress them much faster in production.
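The decompression side needs nothing beyond the framework. A minimal sketch, assuming the input is an ordinary gzip file such as one produced by Ionic's GZipStream (the module name, method name, parameter names and chunk size are again placeholders):

```vbnet
Imports System.IO
Imports System.IO.Compression

Module BuiltInDecompress
    ' Decompresses a gzip file with the framework's own GZipStream.
    Sub DecompressFile(ByVal sourceFile As String, ByVal destFile As String)
        Using inFile As New FileStream(sourceFile, FileMode.Open, FileAccess.Read, FileShare.Read)
            Using gz As New GZipStream(inFile, CompressionMode.Decompress)
                Using outFile As FileStream = File.Create(destFile)
                    ' Manual copy loop, since Stream.CopyTo is not available on Fx 2.0.
                    Dim chunk(64 * 1024 - 1) As Byte
                    Dim readBytes As Integer = gz.Read(chunk, 0, chunk.Length)
                    While readBytes > 0
                        outFile.Write(chunk, 0, readBytes)
                        readBytes = gz.Read(chunk, 0, chunk.Length)
                    End While
                End Using
            End Using
        End Using
    End Sub
End Module
```

Whether you see the same speed difference I did will depend on your files and framework version, so it is worth timing both libraries on your own data.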



Source: https://stackoverflow.com/questions/4975182/is-there-a-problem-with-io-compression
