jzlib

How to compute good preset dictionary for deflate compression

时光总嘲笑我的痴心妄想, submitted on 2019-12-13 12:29:41
Question: I have the opportunity to use a preset dictionary for deflate compression. It makes sense in my case, because the data to be compressed is relatively small (1 KB to 3 KB) and I have a large sample of representative examples. The data consists of arbitrary byte sequences, so tokenization and similar approaches are not a good fit. The data also shows a lot of repetition across examples, so a good dictionary could potentially give very good results. The question is how to calculate a good dictionary. Is there an

Creating gzip file using jzlib

若如初见., submitted on 2019-12-10 00:20:21
Question: I am trying to create a gzip file using JZlib, which is open source. Java's GZIPOutputStream is a bit problematic: CPU usage climbs and never drops back. The problem with JZlib is that the resulting file cannot be opened by WinRAR; it seems to be missing a header. Any idea how to solve it?
Answer 1: JZlib has supported the gzip file format since 1.1.0. Try the com.jcraft.jzlib.GZIPOutputStream class.
Answer 2: This is a skimmed-down version of the class; hopefully it is useful. This version still retains proper flush()
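One plausible reading of the "missing header" symptom is that the file was written as a zlib (or raw deflate) stream rather than in the gzip container, which starts with the magic bytes 0x1f 0x8b. The sketch below only illustrates that difference. It is an assumption about the cause, not a diagnosis, and it uses the JDK's java.util.zip classes so that it runs without extra dependencies; the class and method names (GzipHeaderCheck, looksLikeGzip, zlibStream, gzipStream) are made up for this example, and the com.jcraft.jzlib.GZIPOutputStream class named in Answer 1 is not used here.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.GZIPOutputStream;

public class GzipHeaderCheck {

    /** True if the data starts with the gzip magic bytes 0x1f 0x8b. */
    static boolean looksLikeGzip(byte[] data) {
        return data.length >= 2
                && (data[0] & 0xff) == 0x1f
                && (data[1] & 0xff) == 0x8b;
    }

    /** Compresses into the zlib format: a 2-byte zlib header, no gzip header. */
    static byte[] zlibStream(byte[] input) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DeflaterOutputStream z = new DeflaterOutputStream(out)) {
            z.write(input);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Compresses into the gzip format: gzip header, deflate data, CRC-32 trailer. */
    static byte[] gzipStream(byte[] input) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(input);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] payload = "hello, gzip".getBytes(StandardCharsets.UTF_8);
        System.out.println("zlib stream has gzip header: " + looksLikeGzip(zlibStream(payload)));
        System.out.println("gzip stream has gzip header: " + looksLikeGzip(gzipStream(payload)));
    }
}
```

Running looksLikeGzip on the first bytes of the file that WinRAR rejects is a quick way to check whether this explanation applies.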

How to compute good preset dictionary for deflate compression

穿精又带淫゛_, submitted on 2019-12-05 19:04:01
I have the opportunity to use a preset dictionary for deflate compression. It makes sense in my case, because the data to be compressed is relatively small (1 KB to 3 KB) and I have a large sample of representative examples. The data consists of arbitrary byte sequences, so tokenization and similar approaches are not a good fit. The data also shows a lot of repetition across examples, so a good dictionary could potentially give very good results. The question is how to calculate a good dictionary. Is there an algorithm that calculates an optimal dictionary (given sample data)? I started looking at prefix trees,
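For readers unfamiliar with the mechanism the question relies on, here is a minimal sketch of compressing one small message with and without a preset dictionary. It does not answer how to compute a good dictionary; it only gives a way to measure whether a candidate dictionary helps. It uses the JDK's java.util.zip classes rather than JZlib so that it runs without extra dependencies, and the class name, helper names, sample dictionary, and sample message are all invented for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class PresetDictionaryDemo {

    /** Deflates input; a non-null dictionary primes the compressor first. */
    static byte[] compress(byte[] input, byte[] dictionary) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        try {
            if (dictionary != null) {
                deflater.setDictionary(dictionary); // set before any data is deflated
            }
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!deflater.finished()) {
                out.write(buf, 0, deflater.deflate(buf));
            }
            return out.toByteArray();
        } finally {
            deflater.end();
        }
    }

    /** Inflates data, supplying the dictionary when the stream asks for one. */
    static byte[] decompress(byte[] compressed, byte[] dictionary) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && inflater.needsDictionary()) {
                    if (dictionary == null) {
                        throw new IllegalArgumentException("stream requires a preset dictionary");
                    }
                    inflater.setDictionary(dictionary); // must match the compressor's dictionary
                } else if (n == 0 && inflater.needsInput()) {
                    throw new IllegalArgumentException("truncated stream");
                }
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("corrupt stream", e);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        // Hypothetical dictionary: substrings expected to recur across messages.
        byte[] dict = "{\"user\":\"\",\"status\":\"active\",\"region\":\"eu-west-1\",\"roles\":[\"reader\",\"writer\"]}"
                .getBytes(StandardCharsets.UTF_8);
        byte[] msg = "{\"user\":\"alice\",\"status\":\"active\",\"region\":\"eu-west-1\",\"roles\":[\"reader\"]}"
                .getBytes(StandardCharsets.UTF_8);

        byte[] plain = compress(msg, null);
        byte[] primed = compress(msg, dict);
        System.out.println("original bytes:     " + msg.length);
        System.out.println("without dictionary: " + plain.length);
        System.out.println("with dictionary:    " + primed.length);
        System.out.println("round trip ok:      " + Arrays.equals(msg, decompress(primed, dict)));
    }
}
```

Running the same comparison over the whole sample set, and summing the compressed sizes, gives a simple score for comparing candidate dictionaries. Note that deflate can only reference the most recent 32 KB of history, and the zlib manual suggests placing the most commonly used strings toward the end of the dictionary.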