I want to read (parse) LLVM IR code (which is saved in a text file) and add some of my own code to it. I need some example of doing this, that is, how this is done by using
As mentioned above the best way it to write a pass. But if you want to simply iterate through the instructions and do something with the LLVM provided an InstVisitor class. It is a class that implements the visitor pattern for instructions. It is very straight forward to user, so if you want to avoid learning how to implement a pass, you could resort to that.
This is usually done by implementing an LLVM pass/transform. This way you don't have to parse the IR at all because LLVM will do it for you and you will operate on a object-oriented in-memory representation of the IR.
This is the entry point for writing an LLVM pass. Then you can look at any of the already implemented standard passes that come bundled with LLVM (look into lib/Transforms).
The Opt tool takes llvm IR code, runs a pass on it, and then spits out transformed llvm IR on the other side.
The easiest to start hacking is lib\Transforms\Hello\Hello.cpp. Hack it, run through opt with your source file as input, inspect output.
Apart from that, the docs for writing passes is really quite good.
First, to fix an obvious misunderstanding: LLVM is a framework for manipulating code in IR format. There are no ASTs in sight (*) - you read IR, transform/manipulate/analyze it, and you write IR back.
Reading IR is really simple:
int main(int argc, char** argv)
{
if (argc < 2) {
errs() << "Expected an argument - IR file name\n";
exit(1);
}
LLVMContext &Context = getGlobalContext();
SMDiagnostic Err;
Module *Mod = ParseIRFile(argv[1], Err, Context);
if (!Mod) {
Err.print(argv[0], errs());
return 1;
}
[...]
}
This code accepts a file name. This should be an LLVM IR file (textual). It then goes on to parse it into a Module
, which represents a module of IR in LLVM's internal in-memory format. This can then be manipulated with the various passes LLVM has or you add on your own. Take a look at some examples in the LLVM code base (such as lib/Transforms/Hello/Hello.cpp
) and read this - http://llvm.org/docs/WritingAnLLVMPass.html.
Spitting IR back into a file is even easier. The Module
class just writes itself to a stream:
some_stream << *Mod;
That's it.
Now, if you have any specific questions about specific modifications you want to do to IR code, you should really ask something more focused. I hope this answer shows you how to parse IR and write it back.
(*) IR doesn't have an AST representation inside LLVM, because it's a simple assembly-like language. If you go one step up, to C or C++, you can use Clang to parse that into ASTs, and then do manipulations at the AST level. Clang then knows how to produce LLVM IR from its AST. However, you do have to start with C/C++ here, and not LLVM IR. If LLVM IR is all you care about, forget about ASTs.
The easiest way to do this is to look at one of the existing tools and steal code from it. In this case, you might want to look at the source for llc. It can take either a bitcode or .ll file as input. You can modify the input file any way you want and then write out the file using something similar to the code in llvm-dis if you want a text file.