Is the creation of Java class files deterministic?

抹茶落季 2020-12-02 11:04

When using the same JDK (i.e. the same javac executable), are the generated class files always identical? Can there be a difference depending on the

11 answers
  • 2020-12-02 11:21

    I would put it another way.

    First, I think the question is not about being deterministic:

    Of course it is deterministic: randomness is hard to achieve in computer science, and there is no reason for a compiler to introduce it here.

    Second, if you rephrase it as "how similar are bytecode files for the same source file?", then no, you can't rely on them being similar.

    A good way to see this is to leave the .class files (or .pyc, in my case) in your git staging area. You'll find that across the different computers in your team, git reports changes to the .pyc files even though the .py files did not change (the .pyc files were simply recompiled).

    At least that's what I observed. So put *.pyc and *.class in your .gitignore!

  • 2020-12-02 11:25

    Most probably the answer is "yes", but for a precise answer one would need to look for any key or GUID generation during compilation.

    I can't recall a situation where this occurs. For example, the ID used for serialization is hardcoded, i.e. written by the programmer or generated by the IDE.
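    As a minimal sketch of the hardcoded serialization ID mentioned above: the value is an ordinary constant in the source, so nothing about it is generated during compilation. The names SerialVersionDemo, Explicit and uidOf are hypothetical and exist only for this illustration. When the field is omitted, the serialization runtime computes a default value at run time; javac does not write one into the class file.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

class SerialVersionDemo {
    // Hypothetical example class: the ID is a plain constant chosen by the programmer (or the IDE).
    static class Explicit implements Serializable {
        private static final long serialVersionUID = 42L;
        int value;
    }

    // Asks the serialization runtime which ID it associates with the given serializable class.
    static long uidOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println(uidOf(Explicit.class)); // prints 42
    }
}
```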

    P.S. Also JNI can matter.

    P.P.S. I found that javac is itself written in Java. This means it is identical on different platforms, so it would not generate different code without a reason. It could only do so through native calls.

  • 2020-12-02 11:26

    Let's put it this way:

    I can easily produce an entirely conforming Java compiler that never produces the same .class file twice, given the same .java file.

    I could do this by tweaking all kinds of bytecode construction or by simply adding superfluous attributes to my method (which is allowed).

    Given that the specification does not require the compiler to produce byte-for-byte identical class files, I'd avoid depending on such a result.

    However, the few times that I've checked, compiling the same source file with the same compiler with the same switches (and the same libraries!) did result in the same .class files.
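    A check like that can be sketched with the standard javax.tools API, assuming the program runs on a JDK (on a plain JRE, ToolProvider.getSystemJavaCompiler() returns null). The class name CompileTwice, its helper methods and the sample source string are hypothetical. A result of true only describes this one compiler on this one machine; it is not a guarantee from the specification.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

class CompileTwice {
    // Compiles sourceFile into a fresh temporary directory and returns the class file bytes.
    static byte[] compile(Path sourceFile, String className) throws IOException {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system Java compiler found; run this on a JDK");
        }
        Path outputDir = Files.createTempDirectory("classes");
        int status = compiler.run(null, null, null,
                "-d", outputDir.toString(), sourceFile.toString());
        if (status != 0) {
            throw new IllegalStateException("Compilation failed with status " + status);
        }
        return Files.readAllBytes(outputDir.resolve(className + ".class"));
    }

    // Writes the source once, compiles it twice with the same compiler and switches,
    // and reports whether both class files are byte-for-byte identical.
    static boolean compilesIdentically(String className, String source) {
        try {
            Path sourceDir = Files.createTempDirectory("source");
            Path sourceFile = sourceDir.resolve(className + ".java");
            Files.write(sourceFile, source.getBytes(StandardCharsets.UTF_8));
            byte[] first = compile(sourceFile, className);
            byte[] second = compile(sourceFile, className);
            return Arrays.equals(first, second);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String source = "public class Hello { int twice(int x) { return 2 * x; } }";
        System.out.println("identical: " + compilesIdentically("Hello", source));
    }
}
```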

    Update: I've recently stumbled over this interesting blog post about the implementation of switch on String in Java 7. In this blog post, there are some relevant parts, that I'll quote here (emphasis mine):

    In order to make the compiler's output predictable and repeatable, the maps and sets used in these data structures are LinkedHashMaps and LinkedHashSets rather than just HashMaps and HashSets. In terms of functional correctness of code generated during a given compile, using HashMap and HashSet would be fine; the iteration order does not matter. However, we find it beneficial to have javac's output not vary based on implementation details of system classes.

    This pretty clearly illustrates the issue: The compiler is not required to act in a deterministic manner, as long as it matches the spec. The compiler developers, however, realize that it's generally a good idea to try (provided it's not too expensive, probably).
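    The difference the quote relies on can be seen in a small sketch. This is not javac's code; the class IterationOrderDemo, its helpers and the labels are made up for illustration. A LinkedHashMap is documented to iterate in insertion order, while a HashMap makes no promise about its order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class IterationOrderDemo {
    // Fills the map with the labels in the given order, numbering them 0, 1, 2, ...
    static Map<String, Integer> fill(Map<String, Integer> map, List<String> labels) {
        for (String label : labels) {
            map.put(label, map.size());
        }
        return map;
    }

    // Returns the keys in the order in which the map iterates over them.
    static List<String> keysInIterationOrder(Map<String, Integer> map) {
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        List<String> labels = Arrays.asList("zeta", "alpha", "mid", "beta");

        // Insertion order is preserved: prints [zeta, alpha, mid, beta]
        System.out.println(keysInIterationOrder(fill(new LinkedHashMap<>(), labels)));

        // Order depends on hashing details of the runtime, so no particular output is promised.
        System.out.println(keysInIterationOrder(fill(new HashMap<>(), labels)));
    }
}
```

    Anything a compiler emits while iterating over such a collection inherits that collection's ordering guarantees, which is why the insertion-ordered variants make the output repeatable.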

  • 2020-12-02 11:33

    There is no obligation for the compilers to produce the same bytecode on each platform. You should consult the different vendors' javac utility to have a specific answer.


    I will show a practical example for this with file ordering.

    Let's say that we have 2 jar files: my1.jar and My2.jar. They're put in the lib directory, side by side. The compiler reads them in alphabetical order (since this is lib), but the order is my1.jar, My2.jar when the file system is case-insensitive, and My2.jar, my1.jar if it is case-sensitive.

    my1.jar contains a class A with a method:

    public class A {
         public static void a(String s) {}
    }
    

    My2.jar contains the same class A, but with a different method signature (it accepts Object):

    public class A {
         public static void a(Object o) {}
    }
    

    It is clear that if you have a call

    String s = "x"; 
    A.a(s); 
    

    it will compile to a method call with a different signature in each case. So, depending on your file system's case sensitivity, you will get a different class file as a result.
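    The two orderings can be reproduced with a small sketch. It only sorts the two file names from the example and does not model how javac or any file system actually lists a directory; the class JarOrderDemo and the method sorted are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class JarOrderDemo {
    // Returns a sorted copy of the names, leaving the input list untouched.
    static List<String> sorted(List<String> names, Comparator<String> order) {
        List<String> copy = new ArrayList<>(names);
        copy.sort(order);
        return copy;
    }

    public static void main(String[] args) {
        List<String> jars = Arrays.asList("my1.jar", "My2.jar");

        // Case-sensitive comparison: 'M' sorts before 'm', so this prints [My2.jar, my1.jar]
        System.out.println(sorted(jars, Comparator.naturalOrder()));

        // Case-insensitive comparison: '1' sorts before '2', so this prints [my1.jar, My2.jar]
        System.out.println(sorted(jars, String.CASE_INSENSITIVE_ORDER));
    }
}
```

    Whichever jar comes first supplies the A class that the compiler sees, and with it the signature that the call A.a(s) is compiled against.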

  • 2020-12-02 11:34

    I believe that, if you use the same JDK, the generated bytecode will always be the same, regardless of the hardware and OS used. The bytecode is produced by the Java compiler, which uses a deterministic algorithm to "transform" the source code into bytecode. So the output will always be the same. Under these conditions, only an update to the source code will affect the output.

  • 2020-12-02 11:36

    There are two questions.

    Can there be a difference depending on the operating system or hardware? 
    

    This is a theoretical question, and the answer is clearly, yes, there can be. As others have said, the specification does not require the compiler to produce byte-for-byte identical class files.

    Even if every compiler currently in existence produced the same byte code in all circumstances (different hardware, etc.), the answer tomorrow might be different. If you never plan to update javac or your operating system, you could test that version's behavior in your particular circumstances, but the results might be different if you go from, for example, Java 7 Update 11 to Java 7 Update 15.

    What are the circumstances where the same javac executable, when run on a different platform, will produce different bytecode?
    

    That's unknowable.

    I don't know if configuration management is your reason for asking the question, but it's an understandable reason to care. Comparing bytecode is a legitimate IT control, but only to determine whether the class files changed, not to determine whether the source files did.
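    One way such a control might look, as a rough sketch: hash both class files and compare the digests. The class ClassFileDigest and its methods are hypothetical names, and SHA-256 is simply an assumed choice of hash. Matching digests say the class files are the same; differing digests say nothing about whether the source changed.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class ClassFileDigest {
    // Returns the SHA-256 digest of the given bytes as a lowercase hex string.
    static String sha256Hex(byte[] content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(content)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    // True when both byte arrays hash to the same digest.
    static boolean sameContent(byte[] first, byte[] second) {
        return sha256Hex(first).equals(sha256Hex(second));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java ClassFileDigest <first.class> <second.class>");
            return;
        }
        byte[] first = Files.readAllBytes(Paths.get(args[0]));
        byte[] second = Files.readAllBytes(Paths.get(args[1]));
        System.out.println(sameContent(first, second) ? "class files match" : "class files differ");
    }
}
```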
