I need to be able to search for text in a large number of files (.txt) that are zipped. Compression may be changed to something else or even became proprietary. I want to avoid
Searching for text in compressed files can be faster than searching for the same thing in uncompressed text files.
One compression technique I've seen that sacrifices some space in order to do fast searches:
In particular, searching for a single word usually reduces to comparing the 16-bit index in the compressed text, which is faster than searching for that word in the original text, because
Some kinds of regular expressions can be translated to another regular expression that directly finds items in the compressed file (and also perhaps also finds a few false positives). Such a search also does fewer comparisons than using the original regular expression on the original text file, because the compressed file is shorter, but typically each regular expression comparison requires more work, so it may or may not be faster than the original regex operating on the original text.
(In principle you could replace the fixed-length 16-bit codes with variable-length Huffman prefix codes, as rwong mentioned -- the resulting compressed file would be smaller, but the software to deal with those files would be a little slower and more complicated).
For more sophisticated techniques, you might look at