In honor of the Hutter Prize, what are the top algorithms (and a quick description of each) for text compression?
Note: The intent of this question is to get a descript
If you want to use PAQ as a program, you can install the zpaq
package on debian-based systems. Usage is (see also man zpaq
)
zpaq c archivename.zpaq file1 file2 file3
Compression was to about 1/10th of a zip file's size. (1.9M vs 15M)
The boundary-pushing compressors combine algorithms for insane results. Common algorithms include:
Maximum Compression is a pretty cool text and general compression benchmark site. Matt Mahoney publishes another benchmark. Mahoney's may be of particular interest because it lists the primary algorithm used per entry.
There's always lzip.
All kidding aside:
DEFLATE
algorithm) still wins.LZMA
algorithm) compresses very well and is available for under the LGPL. Few operating systems ship with built-in support, however.