Why do the md5 hashes of two tarballs of the same file differ?

隐瞒了意图╮ 2020-12-30 08:00

I can run:

echo "asdf" > testfile
tar czf a.tar.gz testfile
tar czf b.tar.gz testfile
md5sum *.tar.gz

and it turns out that a.ta

2 Answers
  • 2020-12-30 08:34

    For macOS:

    The --options section of man tar documents a !timestamp option, which leaves the timestamp out of the gzip header. Usage:

    tar --options '!timestamp' -cvzf archive.tgz filename
    

    It produces the same MD5 sum for archives of the same files with the same names.

  • 2020-12-30 08:45

    tar czf outfile infiles is equivalent to

    tar cf - infiles | gzip > outfile
    

    The files differ because gzip stores its input file's name and modification time in the compressed file. When the input is a pipe, it uses an empty string as the filename and the current time as the modification time.

    But it also has a --no-name option, which tells it not to put the name and timestamp into the file. So if you write the expanded command explicitly, instead of using the -z option to tar, you can make use of this option.

    tar cf - testfile | gzip --no-name > a.tar.gz
    tar cf - testfile | gzip --no-name > b.tar.gz
    

    I tested this on OS X 10.6.8 and it works.
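
To check the explanation above on your own machine, something like the following sketch may help. It assumes a POSIX shell with tar, gzip, cmp and sleep available, and it writes into the current directory. The archive names (default_a.tar.gz and so on) are made up for this example. It uses cmp rather than md5sum because md5sum is not present on every system.

```shell
#!/bin/sh
# Sketch: compare default gzip output with gzip --no-name output.
# The archive names below are illustrative, not from the original answer.
set -eu

echo "asdf" > testfile

# Default: gzip reads from a pipe and writes a timestamp into its header.
tar cf - testfile | gzip > default_a.tar.gz
sleep 1   # let the clock move on by at least one second
tar cf - testfile | gzip > default_b.tar.gz

# --no-name: the name and timestamp are left out of the gzip header.
tar cf - testfile | gzip --no-name > noname_a.tar.gz
sleep 1
tar cf - testfile | gzip --no-name > noname_b.tar.gz

# cmp -s is silent and only sets the exit status (0 means identical).
if cmp -s default_a.tar.gz default_b.tar.gz; then
    echo "default gzip:   identical"
else
    echo "default gzip:   different"
fi

if cmp -s noname_a.tar.gz noname_b.tar.gz; then
    echo "gzip --no-name: identical"
else
    echo "gzip --no-name: different"
fi
```

The --no-name pair should compare as identical. Whether the default pair differs depends on the gzip implementation stamping the current time for piped input, as described above.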
