data-compression

Store SHA-1 in database in less space than the 40 hex digits

喜欢而已 submitted on 2020-02-24 11:54:48
Question: I am using a hash algorithm to create a primary key for a database table. I use the SHA-1 algorithm, which is more than fine for my purposes, and the database even ships an implementation of SHA-1. The function computing the hash returns a hex value of 40 characters, so I am storing those hex characters in a char(40) column. The table will have many rows, >= 200 million, which is why I am looking for a less data-intensive way of storing the hash. 40 characters times ~200 million rows
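
Since a SHA-1 digest is only 160 bits, the 40 hex characters can be stored as 20 raw bytes (for example in a fixed-length binary(20) column), halving the per-row storage. A minimal Python sketch of the conversion, with the key material made up purely for illustration:

    import hashlib

    hex_digest = hashlib.sha1(b"example key").hexdigest()   # 40 hex characters
    raw_digest = bytes.fromhex(hex_digest)                   # the same value as 20 raw bytes
    assert len(raw_digest) == 20 and raw_digest.hex() == hex_digest

Many databases can do the same conversion server-side with an UNHEX-style function (where available) before inserting into the binary column.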

Is Terra Compression possible? If so, please explain and provide samples

僤鯓⒐⒋嵵緔 submitted on 2020-02-07 05:46:30
Question: Can a long ASCII text be crushed and compressed into a hash-like ASCII "checksum" using a sophisticated mathematical formula/algorithm, just like air, which can be compressed? The idea is to compress megabytes of ASCII text into 128 or so bytes by shuffling and then mixing new "patterns" of single bytes, turn by turn, from the first to the last. When decompressing, the last character is extracted first, then we just go on decompressing using the formula and the sequential keys

Converting a string of 1s and 0s into a binary value, then compressing afterwards, PHP

|▌冷眼眸甩不掉的悲伤 submitted on 2020-01-15 10:54:06
Question: I have a string in PHP, for example "10001000101010001". I am compressing it with gzcompress, but that compresses the ASCII representation; I would like to compress the string as if it were binary data, not its ASCII binary equivalent. Basically I have two problems: how to convert a list of 1s and 0s into binary, and how to compress the resulting binary with gzcompress. Thanks in advance. Answer 1: Take a look at the bindec() function. Basically you'll want something like (dry-coded, please test it yourself before blindly
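
The answer is cut off above, but the general idea carries over: pack the bit string into raw bytes, then run the byte stream through a DEFLATE-style compressor. A minimal Python sketch of that idea (the question itself is about PHP's bindec() and gzcompress(); the bit string below is the one from the question):

    import zlib

    bit_string = "10001000101010001"
    # Pad with zeros up to a whole number of bytes, then pack 8 bits per byte.
    padded = bit_string.ljust(-(-len(bit_string) // 8) * 8, "0")
    raw = bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))
    compressed = zlib.compress(raw, 9)   # zlib uses the same DEFLATE format as gzcompress

    # The original bit length (or pad count) must be stored separately,
    # otherwise trailing padding cannot be told apart from real data.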

Redshift copy creates different compression encodings from analyze

不羁的心 submitted on 2020-01-15 06:29:28
Question: I've noticed that AWS Redshift recommends different column compression encodings from the ones that it automatically creates when loading data (via COPY) into an empty table. For example, I have created a table and loaded data from S3 as follows: CREATE TABLE Client (Id varchar(511) , ClientId integer , CreatedOn timestamp, UpdatedOn timestamp , DeletedOn timestamp , LockVersion integer , RegionId varchar(511) , OfficeId varchar(511) , CountryId varchar(511) , FirstContactDate timestamp ,

How to read data from a zip file without having to unzip the entire file

只谈情不闲聊 submitted on 2019-12-27 16:42:09
Question: Is there any way in .NET (C#) to extract data from a zip file without decompressing the complete file? Simply put, I may want to extract data (a file) from the start of a zip file; obviously this depends on whether the compression algorithm compresses the files in a deterministic order. Answer 1: DotNetZip is your friend here. As easy as: using (ZipFile zip = ZipFile.Read(ExistingZipFile)) { ZipEntry e = zip["MyReport.doc"]; e.Extract(OutputStream); } (you can also extract to a file or other destinations).
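
Zip archives compress each entry independently and index them in a central directory, which is why libraries like DotNetZip can pull out a single member without decompressing the rest. For comparison, the same operation in Python's standard library (the archive and member names are just illustrative):

    import zipfile

    # Only the requested entry is located via the central directory and
    # decompressed; the rest of the archive is never read.
    with zipfile.ZipFile("archive.zip") as zf:
        data = zf.read("MyReport.doc")

This works because zip stores per-entry offsets; a solid format such as .tar.gz, by contrast, must be decompressed from the start to reach a member.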

How do I compress an image with Run-Length Encoding using C#?

断了今生、忘了曾经 submitted on 2019-12-24 18:17:46
Question: How do I compress an image with Run-Length Encoding using C#? Are there any available libraries that support this? Does Run-Length Encoding only work on bitmapped images? If so, how do I convert other image types to bitmaps using C#? I'd also like to ask what the resulting file type is after this: will the images retain their original file type or get a new one? Answer 1: I know this is an old question, but it is one of the few things that comes up for RLE compression in C# in a Google search. For someone
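
The answer is truncated above. As for the technique itself, run-length encoding simply replaces runs of identical values with (count, value) pairs, which is why it only pays off on images with large flat areas (such as simple bitmaps) and not on already-compressed formats like JPEG. A small Python sketch of the idea over raw bytes (the question asks about C#; this only illustrates the encoding, not any library API):

    def rle_encode(data: bytes) -> bytes:
        # Emit (count, value) pairs; counts are capped at 255 to fit in one byte.
        out = bytearray()
        i = 0
        while i < len(data):
            run = 1
            while i + run < len(data) and data[i + run] == data[i] and run < 255:
                run += 1
            out += bytes((run, data[i]))
            i += run
        return bytes(out)

    def rle_decode(encoded: bytes) -> bytes:
        out = bytearray()
        for i in range(0, len(encoded), 2):
            out += encoded[i + 1:i + 2] * encoded[i]
        return bytes(out)

    sample = b"\x00" * 10 + b"\xff" * 3
    assert rle_decode(rle_encode(sample)) == sample

Note the usual caveat: on data with no runs, this encoding doubles the size instead of shrinking it.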

Big file compression with python

醉酒当歌 submitted on 2019-12-18 11:03:23
Question: I want to compress big text files with Python (I am talking about files larger than 20 GB). I am not in any way an expert, so I tried to gather the information I found, and the following seems to work:

    import bz2

    with open('bigInputfile.txt', 'rb') as input:
        with bz2.BZ2File('bigInputfile.txt.bz2', 'wb', compresslevel = 9) as output:
            while True:
                block = input.read(900000)
                if not block:
                    break
                output.write(block)
    input.close()
    output.close()

I am wondering if this syntax is correct and if there is a way to optimize
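
The answers are cut off above, but the loop itself is a standard chunked copy. As a point of comparison, a more compact way to express the same streaming copy with the standard library (a sketch, not necessarily what the eventual answer recommended):

    import bz2
    import shutil

    # Stream the file through bz2 in chunks so a >20 GB file never has to fit in memory.
    with open('bigInputfile.txt', 'rb') as src, \
         bz2.open('bigInputfile.txt.bz2', 'wb', compresslevel=9) as dst:
        shutil.copyfileobj(src, dst, length=900000)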

Unable to compress file during Huffman Encoding in Java

耗尽温柔 submitted on 2019-12-13 04:04:56
Question: I have implemented the Huffman encoding algorithm in Java using priority queues, where I traverse the tree from root to leaf and get encodings such as #=000011 based on the number of times a symbol appears in the input. Everything is fine, the tree is built fine, and the encoding is just as expected. But the output file I am getting is bigger than the original file. I am currently appending '0' and '1' to a String when traversing the left and right nodes of the tree. Probably what I end up
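
The excerpt stops short of the likely cause, but writing the literal characters '0' and '1' means every Huffman code bit occupies a full byte in the output, which by itself makes the file larger than the input. The usual fix is to buffer bits and write packed bytes. A small Python sketch of that bit-packing idea (the question is about Java; the class and padding scheme here are made up for illustration):

    import io

    class BitWriter:
        # Buffers individual code bits and emits whole bytes, instead of
        # writing one '0'/'1' character (a full byte) per bit.
        def __init__(self, stream):
            self.stream = stream
            self.buffer = 0
            self.count = 0

        def write_code(self, code):
            for bit in code:                 # e.g. code = "000011"
                self.buffer = (self.buffer << 1) | (bit == "1")
                self.count += 1
                if self.count == 8:
                    self.stream.write(bytes([self.buffer]))
                    self.buffer = self.count = 0

        def flush(self):
            if self.count:                   # pad the final partial byte with zeros
                self.stream.write(bytes([self.buffer << (8 - self.count)]))
                self.buffer = self.count = 0

    out = io.BytesIO()
    writer = BitWriter(out)
    writer.write_code("000011")              # the example code from the question
    writer.flush()
    assert out.getvalue() == bytes([0b00001100])

A decoder also needs to know how many padding bits were added to the final byte (or the total symbol count), otherwise the trailing zeros decode as extra symbols.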

Write a program that takes text as input and produces a program that reproduces that text

余生颓废 submitted on 2019-12-12 07:09:16
Question: Recently I came across a nice problem that turned out to be as simple to understand as it is hard to solve. The problem is: write a program that reads a text from input and prints some other program on output. If we compile and run the printed program, it must output the original text. The input text is supposed to be rather large (more than 10000 characters). The only (and very strong) requirement is that the size of the archive (i.e. the program printed) must be strictly less than
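
The size bound is cut off above, but the usual shape of a solution is a generator that compresses the input and emits a tiny program embedding the compressed payload plus the code to decompress and print it. A minimal Python sketch of that shape (the original problem does not fix a language, and nothing here guarantees the truncated size requirement is met):

    import base64
    import sys
    import zlib

    def make_reproducing_program(text: str) -> str:
        # Compress the text, embed it as a string literal, and wrap it in a decompressor.
        payload = base64.b85encode(zlib.compress(text.encode("utf-8"), 9)).decode("ascii")
        return (
            "import base64, zlib\n"
            f"print(zlib.decompress(base64.b85decode({payload!r})).decode('utf-8'), end='')\n"
        )

    if __name__ == "__main__":
        # Read the text on stdin, print the generated program on stdout.
        sys.stdout.write(make_reproducing_program(sys.stdin.read()))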