Fail-safe way of round-tripping JVM byte-code to text-representation and back

Submitted by 岁酱吖の on 2019-12-06 08:38:21

Question


I'm looking for a fail-safe way to round-trip between a JVM class file and a text representation and back again.

One strict requirement is that the resulting round-tripped JVM class file is exactly functionally equivalent to the original JVM class file as long as the text representation is left unchanged.

Furthermore, the text representation must be human-readable and editable. It should be possible to make small changes to the text representation (such as changing a text string or a class name) and have them reflected in the resulting class file.

The simplest solution would be to use a Java decompiler such as JAD to generate the text representation (in this case, the re-created Java source code) and then use javac to generate the byte-code again. However, given the state of the free Java decompilers, this approach does not work under all circumstances. It is rather easy to create obfuscated byte-code that does not survive a full class-file/Java-source/class-file round trip, in part because there simply isn't a 1:1 mapping between JVM byte-code and Java source code.

Is there a fail-safe way to achieve JVM class-file/text-representation/class-file round-tripping given the requirements above?

Update: Before answering - save time and effort by reading all the requirements above, and note specifically:

  • "Text-representation of JVM bytecode" does not necessarily mean "Java source-code".

Answer 1:


The BCEL project provides a JasminVisitor which will convert class files into Jasmin assembly.

This can be modified and then reassembled into class files. If no edits are made and the versions are kept compatible, the round trip should result in identical class files, except that line number mappings may be lost. If you require a bit-for-bit identical copy in the round-trip case, you will likely need to alter the tool so that it also carries over the parts of the class file that are pure metadata.

Jasmin is rather old and was not designed for writing full-blown programs in assembly, but for modifying string constant tables and other constants it should be more than adequate.
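To see which strings a class file actually carries before editing them in an assembly listing, the constant pool can be walked with nothing but the JDK. The following is a minimal sketch, not part of BCEL or Jasmin: the class name `Utf8ConstantLister` and the method `listUtf8` are made up for illustration, and it assumes a well-formed class file whose Utf8 entries `DataInputStream.readUTF` can decode (heavily obfuscated files may violate that).

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists the CONSTANT_Utf8 entries of a class file.
class Utf8ConstantLister {

    /** Returns every CONSTANT_Utf8 entry of the constant pool, in pool order. */
    static List<String> listUtf8(byte[] classBytes) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(classBytes));
        if (in.readInt() != 0xCAFEBABE) {
            throw new IOException("not a class file: bad magic number");
        }
        in.readUnsignedShort();             // minor_version (unused here)
        in.readUnsignedShort();             // major_version (unused here)
        int count = in.readUnsignedShort(); // constant_pool_count; valid indices are 1..count-1

        List<String> result = new ArrayList<>();
        for (int i = 1; i < count; i++) {
            int tag = in.readUnsignedByte();
            switch (tag) {
                case 1:  // Utf8: u2 length followed by modified UTF-8 bytes
                    result.add(in.readUTF());
                    break;
                case 3: case 4:  // Integer, Float
                    in.skipBytes(4);
                    break;
                case 5: case 6:  // Long, Double occupy two pool slots
                    in.skipBytes(8);
                    i++;
                    break;
                case 7: case 8: case 16: case 19: case 20:
                    // Class, String, MethodType, Module, Package
                    in.skipBytes(2);
                    break;
                case 9: case 10: case 11: case 12: case 17: case 18:
                    // Fieldref, Methodref, InterfaceMethodref, NameAndType,
                    // Dynamic, InvokeDynamic
                    in.skipBytes(4);
                    break;
                case 15: // MethodHandle
                    in.skipBytes(3);
                    break;
                default:
                    throw new IOException("unknown constant pool tag " + tag);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to a .class file supplied by the caller.
        for (String s : listUtf8(Files.readAllBytes(Paths.get(args[0])))) {
            System.out.println(s);
        }
    }
}
```

Note that the Utf8 entries include class names, method names and descriptors as well as string literals, which is why renaming a class by hand means touching more than one place in the listing.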




Answer 2:


Jasmin and Kimera?




Answer 3:


Looks like ASM does this. (This is the same sort of answer as ShuggyCoUk's, but with a different tool.) Jarjar says it uses ASM for exactly the sort of thing you're talking about.




Answer 4:


I've written a tool that's designed for exactly this.

The Krakatau disassembler and assembler are designed to handle any valid classfile, no matter how bizarre. Krakatau uses an assembly format based on the Jasmin format, but extended to support all the classfile features that Jasmin can't handle. It even supports some of the obscure or undocumented 'features' of Hotspot, such as pre-45.3 classfiles using smaller widths for the Code attribute fields.
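The "45.3" above is the version stored in the classfile header, which starts with a 4-byte magic number followed by a 2-byte minor version and a 2-byte major version. As a rough sketch using only the JDK (the class name `ClassFileVersion` and its methods are invented for this illustration and are not part of Krakatau), a file can be checked for being older than 45.3 like this:

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical helper: reads the version fields from a classfile header.
class ClassFileVersion {
    final int major;
    final int minor;

    ClassFileVersion(int major, int minor) {
        this.major = major;
        this.minor = minor;
    }

    /** Parses the first 8 bytes: u4 magic, u2 minor_version, u2 major_version. */
    static ClassFileVersion read(byte[] classBytes) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(classBytes));
        if (in.readInt() != 0xCAFEBABE) {
            throw new IOException("not a class file: bad magic number");
        }
        int minor = in.readUnsignedShort(); // minor comes first in the file
        int major = in.readUnsignedShort();
        return new ClassFileVersion(major, minor);
    }

    /** True for versions strictly older than 45.3. */
    boolean isBefore45_3() {
        return major < 45 || (major == 45 && minor < 3);
    }

    @Override
    public String toString() {
        return major + "." + minor;
    }
}
```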

It can roundtrip any classfile I know of. The result won't be byte-for-byte identical, but it will have the same functionality (constant pool entries may be rearranged, for instance).

Update: Krakatau now supports exact binary roundtripping of classfiles. Passing the -roundtrip flag will preserve the order of constant pool entries, etc.
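Whichever tool is used, a claim of exact binary roundtripping is easy to verify independently by comparing the original and the reassembled file byte by byte. A minimal sketch, assuming the two files are passed as command-line arguments (the class name `RoundTripCheck` and the method `firstDifference` are hypothetical and not part of any of the tools mentioned here):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper: checks whether two class files are byte-for-byte identical.
class RoundTripCheck {

    /**
     * Returns the offset of the first differing byte, or -1 if the arrays
     * are identical. If one array is a strict prefix of the other, the
     * length of the shorter array is returned.
     */
    static int firstDifference(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            if (a[i] != b[i]) {
                return i;
            }
        }
        return a.length == b.length ? -1 : n;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: original class file, args[1]: round-tripped class file
        byte[] original = Files.readAllBytes(Paths.get(args[0]));
        byte[] roundTripped = Files.readAllBytes(Paths.get(args[1]));
        int offset = firstDifference(original, roundTripped);
        if (offset < 0) {
            System.out.println("identical (" + original.length + " bytes)");
        } else {
            System.out.println("first difference at byte offset " + offset);
        }
    }
}
```

A difference reported here does not by itself mean the round trip broke anything, since reordered constant pool entries change the bytes without changing behavior. It only shows that the files are not binary-identical.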




Answer 5:


No. There exists valid byte-code without a corresponding Java program.

The Soot project has a quite sophisticated decompiler, Dava (http://www.sable.mcgill.ca/dava/), which may be useful for byte-code that was produced by a Java compiler. It is, however, not perfect.

Your best bet is still getting the source code for the class files.



Source: https://stackoverflow.com/questions/1451016/fail-safe-way-of-round-tripping-jvm-byte-code-to-text-representation-and-back
