When using the same JDK (i.e. the same javac executable), are the generated class files always identical? Can there be a difference depending on the operating system or hardware?
I would put it another way.
First, I don't think the question is really about determinism. Of course the compiler is deterministic: randomness is hard to come by in computing, and a compiler has no reason to introduce it here.
Second, if you rephrase the question as "how similar are the bytecode files produced from the same source file?", then no, you can't rely on them being the same.
A good way to see this is to leave the .class files (or .pyc, in my case) tracked in git. You'll find that across the computers in your team, git reports changes to the .pyc files even though the .py files did not change (the .pyc files were simply recompiled).
At least that's what I observed. So put *.pyc and *.class in your .gitignore!
Most probably the answer is "yes", but to get a precise answer one would need to look for any key or GUID generation during compilation.
I can't recall a situation where this occurs. For example, an ID used for serialization purposes is hardcoded, i.e. generated by the programmer or the IDE.
P.S. JNI can also matter.
P.P.S. I found that javac is itself written in Java. This means that it is identical on different platforms, so it would not generate different code without a reason. It could only do so through native calls.
Let's put it this way:
I can easily produce an entirely conforming Java compiler that never produces the same .class file twice, given the same .java file.
I could do this by tweaking all kinds of bytecode construction or by simply adding superfluous attributes to my methods (which is allowed).
Given that the specification does not require the compiler to produce byte-for-byte identical class files, I'd avoid depending on such a result.
However, the few times that I've checked, compiling the same source file with the same compiler with the same switches (and the same libraries!) did result in the same .class files.
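A check like the one described above can be sketched with the standard javax.tools API. This is only an illustration under assumptions: the class name CompileTwice, the helper names, and the sample Hello source are made up for this sketch, and it needs to run on a JDK (not a bare JRE) so that a system compiler is available. A matching result on one machine says nothing about other compilers, switches, or platforms.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

// Hypothetical helper: compiles the same source twice and compares the class files.
final class CompileTwice {

    // Compiles a source declaring class Hello in a fresh temp directory
    // and returns the bytes of the resulting Hello.class.
    static byte[] compile(String source) {
        JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
        if (javac == null) {
            throw new IllegalStateException("No system Java compiler; run this on a JDK");
        }
        try {
            Path dir = Files.createTempDirectory("compile-twice");
            Path src = dir.resolve("Hello.java");
            Files.write(src, source.getBytes(StandardCharsets.UTF_8));
            // Same switches on every run: only -d (output directory) is passed.
            int exit = javac.run(null, null, null, "-d", dir.toString(), src.toString());
            if (exit != 0) {
                throw new IllegalStateException("javac failed with exit code " + exit);
            }
            return Files.readAllBytes(dir.resolve("Hello.class"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // True if two independent compilations produced byte-for-byte identical output.
    static boolean sameBytesWhenCompiledTwice(String source) {
        return Arrays.equals(compile(source), compile(source));
    }

    public static void main(String[] args) {
        String source = "public class Hello { int answer() { return 42; } }";
        System.out.println("identical: " + sameBytesWhenCompiledTwice(source));
    }
}
```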
Update: I've recently stumbled over this interesting blog post about the implementation of switch on String in Java 7. In this blog post, there are some relevant parts, that I'll quote here (emphasis mine):
In order to make the compiler's output predictable and repeatable, the maps and sets used in these data structures are LinkedHashMaps and LinkedHashSets rather than just HashMaps and HashSets. In terms of functional correctness of code generated during a given compile, using HashMap and HashSet would be fine; the iteration order does not matter. However, we find it beneficial to have javac's output not vary based on implementation details of system classes.
This pretty clearly illustrates the issue: The compiler is not required to act in a deterministic manner, as long as it matches the spec. The compiler developers, however, realize that it's generally a good idea to try (provided it's not too expensive, probably).
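The difference the quote relies on can be seen in a few lines. This is a minimal sketch, not javac's code: the class name IterationOrderDemo and the sample labels are made up here. LinkedHashSet documents that it iterates in insertion order, while HashSet makes no promise about order at all.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo of why a tool that wants repeatable output prefers linked collections.
final class IterationOrderDemo {

    // Adds the labels to the given set and returns them in the set's iteration order.
    static List<String> iterationOrder(Set<String> set, List<String> labels) {
        set.addAll(labels);
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        List<String> labels = Arrays.asList("gamma", "alpha", "beta");

        // Insertion order is part of the LinkedHashSet contract.
        System.out.println(iterationOrder(new LinkedHashSet<>(), labels));

        // HashSet order depends on hashing details of the library in use;
        // it is unspecified and may differ between implementations.
        System.out.println(iterationOrder(new HashSet<>(), labels));
    }
}
```

Anything emitted while iterating over the first kind of collection comes out in the same order on every run; with the second kind, the order is an implementation detail of the system classes.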
There is no obligation for compilers to produce the same bytecode on each platform. You would have to check each vendor's javac utility to get a specific answer.
I will show a practical example for this with file ordering.
Let's say that we have 2 jar files: my1.jar and My2.jar. They're put in the lib directory, side by side. The compiler reads them in alphabetical order (since this is lib), but the order is my1.jar, My2.jar when the file system is case insensitive, and My2.jar, my1.jar if it is case sensitive.
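The two orderings can be reproduced with plain string comparators. This sketch only models the alphabetical orders described above; it is not a claim about how javac or any particular file system actually lists a directory, and the class name JarOrderDemo is made up for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical model of case-sensitive vs case-insensitive alphabetical ordering.
final class JarOrderDemo {

    // Returns a sorted copy of the names, leaving the input untouched.
    static List<String> sorted(List<String> names, Comparator<String> order) {
        List<String> copy = new ArrayList<>(names);
        copy.sort(order);
        return copy;
    }

    public static void main(String[] args) {
        List<String> jars = Arrays.asList("my1.jar", "My2.jar");

        // Case-sensitive: uppercase 'M' sorts before lowercase 'm'.
        System.out.println(sorted(jars, Comparator.<String>naturalOrder()));
        // prints [My2.jar, my1.jar]

        // Case-insensitive: the names tie on "my", so '1' before '2' decides.
        System.out.println(sorted(jars, String.CASE_INSENSITIVE_ORDER));
        // prints [my1.jar, My2.jar]
    }
}
```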
The my1.jar file has a class A.class with a method:
public class A {
    public static void a(String s) {}
}
The My2.jar file has the same A.class, but with a different method signature (it accepts Object):
public class A {
    public static void a(Object o) {}
}
It is clear that if you have a call
String s = "x";
A.a(s);
it will compile to a method call with a different signature in each case. So, depending on your file system's case sensitivity, you will get a different class file as a result.
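To make "different signature" concrete: a compiled call site names the method it was resolved against together with a descriptor string for its parameter and return types. The sketch below prints the descriptor that corresponds to each version of a(...). It uses the standard java.lang.invoke.MethodType class purely to format descriptors; the class name DescriptorDemo is made up, and the code does not inspect any real class file.

```java
import java.lang.invoke.MethodType;

// Hypothetical demo: the descriptor recorded for a call to a(...) depends on
// which version of class A the compiler saw.
final class DescriptorDemo {

    // Descriptor of a method returning void and taking one parameter of the given type.
    static String descriptorFor(Class<?> parameterType) {
        return MethodType.methodType(void.class, parameterType).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // Compiled against the A with a(String s):
        System.out.println(descriptorFor(String.class)); // prints (Ljava/lang/String;)V

        // Compiled against the A with a(Object o):
        System.out.println(descriptorFor(Object.class)); // prints (Ljava/lang/Object;)V
    }
}
```

Since the two descriptors differ, the class files containing the call differ as well, even though the calling source is the same.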
I believe that, if you use the same JDK, the generated bytecode will always be the same, regardless of the hardware and OS used. The bytecode is produced by the Java compiler, which uses a deterministic algorithm to "transform" the source code into bytecode. So, the output will always be the same. Under these conditions, only a change to the source code will affect the output.
There are two questions.
Can there be a difference depending on the operating system or hardware?
This is a theoretical question, and the answer is clearly, yes, there can be. As others have said, the specification does not require the compiler to produce byte-for-byte identical class files.
Even if every compiler currently in existence produced the same byte code in all circumstances (different hardware, etc.), the answer tomorrow might be different. If you never plan to update javac or your operating system, you could test that version's behavior in your particular circumstances, but the results might be different if you go from, for example, Java 7 Update 11 to Java 7 Update 15.
What are the circumstances where the same javac executable, when run on a different platform, will produce different bytecode?
That's unknowable.
I don't know if configuration management is your reason for asking the question, but it's an understandable reason to care. Comparing bytecode is a legitimate IT control, but only to determine whether the class files changed, not to determine whether the source files did.
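As a rough sketch of such a control, class files can be compared by digest. The class name ClassFileDigest and its helpers are made up for this illustration, and the choice of SHA-256 is an assumption, not something the answer prescribes. A changed digest only shows that the class file bytes changed; it does not show why.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for detecting whether a class file changed between two builds.
final class ClassFileDigest {

    // Lowercase hex SHA-256 digest of the given bytes.
    static String sha256Hex(byte[] bytes) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(bytes)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be supported by every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    // True if the two class file contents have different digests.
    static boolean changed(byte[] before, byte[] after) {
        return !sha256Hex(before).equals(sha256Hex(after));
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: java ClassFileDigest <old.class> <new.class>");
            return;
        }
        try {
            byte[] before = Files.readAllBytes(Paths.get(args[0]));
            byte[] after = Files.readAllBytes(Paths.get(args[1]));
            System.out.println(changed(before, after) ? "class file changed" : "class file unchanged");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```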