Summary
How can you make ant repeatedly generate byte-identical jar files from the same .class files?
Background
Ou
Had this same problem, landed on this page. The answer above by Jiri Patera was very helpful in understanding why I could not get the md5sums of what I expected to be two identical files to be the same after unsigning and resigning the jar files.
This is the solution I used instead:
jar -tvf "$JARFILE" | grep -v META-INF | perl -p -e's/^\s*(\d+).*\s+(\w+)/$1 $2/g' | md5sum
It doesn't give 100% certainty that the jars are equivalent (two files with the same name and size but different contents would go unnoticed), but it gives a fairly reliable indication.
It takes a listing of all the files in the jar file minus the META-INF files, parses out the file size and file name of each, and then runs the resulting text of file sizes plus file names through md5sum.
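For anyone who would rather not depend on perl and md5sum being installed, here is a rough Java sketch of the same idea. The class name JarListingFingerprint is made up for illustration, and it is not a drop-in replacement for the pipeline: it skips only entries whose name starts with META-INF/ (grep -v drops any line containing that text), and the hash it prints will not match the pipeline's output.

```java
import java.io.IOException;
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper: hashes "size name" lines instead of the raw jar bytes.
final class JarListingFingerprint {

    /** Returns the hex MD5 of one "size name" line per entry outside META-INF/. */
    static String fingerprint(Path jar) throws IOException, NoSuchAlgorithmException {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        try (ZipFile zip = new ZipFile(jar.toFile())) {
            // Entries are visited in the order the archive reports them (no sorting).
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (entry.getName().startsWith("META-INF/")) {
                    continue; // manifest and signature files change on every signing
                }
                String line = entry.getSize() + " " + entry.getName() + "\n";
                md5.update(line.getBytes(StandardCharsets.UTF_8));
            }
        }
        return String.format("%032x", new BigInteger(1, md5.digest()));
    }

    public static void main(String[] args) throws Exception {
        System.out.println(fingerprint(Paths.get(args[0])));
    }
}
```

Like the one-liner, this only compares names and uncompressed sizes, so it carries the same caveat about not being a full equivalence check.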
I have been facing a similar, yet slightly different, problem. I decided to share it here as it relates to the topic of the question. In order to produce two byte-identical digitally signed JAR files at different times, one has to take the following points into consideration:
Timestamps: All **/*.class files have to have the same timestamp (java.util.zip.ZipEntry.setTime(long)). In addition, the META-INF/MANIFEST.MF file and the certificate files (*.RSA, *.DSA, and *.SF) are added to the JAR file with a "now" timestamp. So even if you decide not to compile the classes and use the ones already compiled (i.e. the ones with the original JAR's timestamp), your resulting JAR will be binary different.
MANIFEST.MF Entries Ordering: Note that the key-value pairs in the MANIFEST.MF file are represented as a java.util.HashMap, which "does not guarantee that the order will remain constant over time." So you may run into another binary difference when signing the JAR files using the JDK v5 and JDK v6 jarsigner tools, as the order of the MANIFEST.MF entries may change (http://stackoverflow.com/questions/1879897/order-of-items-in-a-hashmap-differ-when-the-same-program-is-run-in-jvm5-vs-jvm6).
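To illustrate the ordering point, here is a minimal sketch of writing a manifest with its sections and attributes sorted, instead of relying on whatever order the map happens to iterate in. SortedManifestWriter is a hypothetical name, the choice of case-insensitive alphabetical order is my assumption, and the line wrapping is a simplified take on the 72-byte limit, so treat it as a starting point rather than a faithful copy of what jarsigner writes.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Hypothetical helper: renders a Manifest with a fixed, sorted layout.
final class SortedManifestWriter {

    static byte[] render(Manifest manifest) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeSection(out, null, manifest.getMainAttributes());
        // Per-entry sections sorted by entry name.
        for (Map.Entry<String, Attributes> section
                : new TreeMap<>(manifest.getEntries()).entrySet()) {
            writeSection(out, section.getKey(), section.getValue());
        }
        return out.toByteArray();
    }

    private static void writeSection(ByteArrayOutputStream out, String name, Attributes attrs) {
        // Attribute names are case-insensitive, so sort them that way too.
        TreeMap<String, String> sorted = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        for (Map.Entry<Object, Object> attr : attrs.entrySet()) {
            sorted.put(attr.getKey().toString(), attr.getValue().toString());
        }
        if (name == null) {
            // Main section: the version attribute has to come first.
            String version = sorted.remove("Manifest-Version");
            if (version != null) {
                writeLine(out, "Manifest-Version: " + version);
            }
        } else {
            writeLine(out, "Name: " + name);
        }
        for (Map.Entry<String, String> attr : sorted.entrySet()) {
            writeLine(out, attr.getKey() + ": " + attr.getValue());
        }
        writeLine(out, ""); // a blank line ends the section
    }

    // Writes one logical line, folding it so no physical line exceeds
    // 70 bytes plus CRLF; continuation lines start with a single space.
    private static void writeLine(ByteArrayOutputStream out, String line) {
        byte[] bytes = line.getBytes(StandardCharsets.UTF_8);
        int pos = 0;
        int limit = 70;
        while (bytes.length - pos > limit) {
            out.write(bytes, pos, limit);
            out.write('\r');
            out.write('\n');
            out.write(' ');
            pos += limit;
            limit = 69; // the leading space counts towards the line length
        }
        out.write(bytes, pos, bytes.length - pos);
        out.write('\r');
        out.write('\n');
    }
}
```

Two Manifest objects holding the same attributes then render to the same bytes no matter in which order the attributes were put in, which is the property the HashMap alone does not give you.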
So basically there are two levels to the problem. Firstly, the JAR/ZIP tool packages the files with their file-system timestamps and thus creates binary-different JAR files for the same set of Java classes that are binary equal but were compiled at different times. Secondly, the JAR signer tool modifies the META-INF/MANIFEST.MF file and appends more files to the JAR archive (certificates and class file checksums).
The solution may be a custom JAR signer that sets the timestamps of all the JAR file items to a constant time and orders the MANIFEST.MF file entries (e.g. alphabetically). So far, to my knowledge, this is the only way to produce two byte-identical digitally signed JAR files at different points in time.
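A minimal sketch of the packaging half of such a tool might look like the following. It does no signing at all; it only shows the constant-timestamp and fixed-ordering part. The class name DeterministicJarWriter, the 2000-01-01 timestamp, and the rule of putting META-INF/MANIFEST.MF first and sorting everything else by name are all my own assumptions for illustration. It writes the manifest as an ordinary ZIP entry through ZipOutputStream, so whatever bytes you pass in are stored untouched.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: packs entries with a constant timestamp in a fixed order.
final class DeterministicJarWriter {

    private static final String MANIFEST = "META-INF/MANIFEST.MF";

    // Arbitrary constant wall-clock time stamped on every entry (an assumption).
    private static final LocalDateTime FIXED_TIME = LocalDateTime.of(2000, 1, 1, 0, 0, 0);

    // Manifest first (as jar tools conventionally expect), then plain name order.
    private static final Comparator<String> JAR_ORDER = (a, b) -> {
        boolean aIsManifest = a.equals(MANIFEST);
        boolean bIsManifest = b.equals(MANIFEST);
        if (aIsManifest != bIsManifest) {
            return aIsManifest ? -1 : 1;
        }
        return a.compareTo(b);
    };

    /** Builds the archive in memory from entry name to entry content. */
    static byte[] write(Map<String, byte[]> files) throws IOException {
        // ZIP entries store local wall-clock time, so convert the fixed local time
        // through the default zone; the stored fields then do not depend on the zone.
        long fixedMillis = FIXED_TIME.atZone(ZoneId.systemDefault()).toInstant().toEpochMilli();

        TreeMap<String, byte[]> ordered = new TreeMap<>(JAR_ORDER);
        ordered.putAll(files);

        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            for (Map.Entry<String, byte[]> file : ordered.entrySet()) {
                ZipEntry entry = new ZipEntry(file.getKey());
                entry.setTime(fixedMillis); // same timestamp for every entry
                zip.putNextEntry(entry);
                zip.write(file.getValue());
                zip.closeEntry();
            }
        }
        return buffer.toByteArray();
    }
}
```

The compressed bytes still come from the JRE's deflate implementation, so this should be repeatable on one setup but is not guaranteed to match across different JRE versions.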
Since a jar is just a zip file incognito, you could try using the zip task to add the manifest file under META-INF/ by hand. Hopefully that circumvents any internal magic associated with handling the manifest by the jar task.
Just a side note: since it sounds like having equal MD5s is critical, I would recommend adding a sanity test as part of the build, such as compiling some special "dummy" code that never changes into a jar and checking that the jar's MD5 equals the expected one. This will safeguard the build against unexpected changes (e.g. after an upgrade to Ant, the JRE, or the OS, a timezone change, etc.).
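The check itself could be as small as the sketch below. JarChecksumGuard is a hypothetical name, and the expected value has to be one you recorded yourself from a known-good build of the dummy jar; nothing here knows what it should be.

```java
import java.math.BigInteger;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical build guard: fails when the reference jar's MD5 drifts.
final class JarChecksumGuard {

    /** Returns the MD5 of the given bytes as 32 lowercase hex digits. */
    static String md5Hex(byte[] data) throws NoSuchAlgorithmException {
        byte[] digest = MessageDigest.getInstance("MD5").digest(data);
        return String.format("%032x", new BigInteger(1, digest));
    }

    /** Throws if the jar bytes do not hash to the recorded value. */
    static void requireMd5(byte[] jarBytes, String expectedHex) throws NoSuchAlgorithmException {
        String actual = md5Hex(jarBytes);
        if (!actual.equalsIgnoreCase(expectedHex)) {
            throw new IllegalStateException(
                    "Reference jar MD5 changed: expected " + expectedHex + " but got " + actual);
        }
    }

    // Usage: java JarChecksumGuard <path-to-dummy.jar> <expected-md5-hex>
    public static void main(String[] args) throws Exception {
        requireMd5(Files.readAllBytes(Paths.get(args[0])), args[1]);
        System.out.println("Reference jar MD5 matches.");
    }
}
```

Run from the build right after the dummy jar is produced, a non-zero exit tells you the toolchain changed underneath you before you start comparing real artifacts.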