Organizing files in tar bz2 file with python

[亡魂溺海] 提交于 2019-12-05 21:31:50

Do you have to use bzip2? Reading it's documentation, it's quite clear it's not designed to support random access. Perhaps you should use a compression format that more closely matches your requirements. The good old Zip format supports random access, but might compress worse, of course.

Bzip2 compresses in large blocks (900 KiB by default, I believe). One method that would speed up the scanning of the tar file dramatically, but would reduce compression performance, would be to compress each file individually and then tar the results together. This is essentially what Zip-format files are (though using zlib compression rather than bzip2). But you could then easily grab the tar index and only have to decompress the specific file(s) you are looking for.

I don't think most tar programs offer much ability to organize files in any meaningful way, though you could write a program to do this for your special case (I know Python has tar-writing libraries though I've only used them once or twice). However, you'd still have the problem of having to decompress most of the data before you found what you were looking for.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!