I want to write an LLVM pass to instrument every memory access. Here is what I am trying to do.
Given any C/C++ program (like the one given below), I am trying to insert
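Try something like this (you still need to fill in the two calls marked INSERT THIS; the loop advances its iterator before inserting, so the newly inserted calls are never visited):
// Header paths follow the llvm/IR/ layout; adjust them to your LLVM version.
// BasicBlockPass belongs to the legacy pass manager.
#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/Instructions.h"
#include "llvm/Pass.h"

using namespace llvm;

class ThePass : public BasicBlockPass {
public:
  static char ID; // pass identification required by the legacy pass manager
  ThePass() : BasicBlockPass(ID) {}
  virtual bool runOnBasicBlock(BasicBlock &bb);
};

char ThePass::ID = 0;

bool ThePass::runOnBasicBlock(BasicBlock &bb) {
  bool retval = false;
  for (BasicBlock::iterator bbit = bb.begin(), bbie = bb.end(); bbit != bbie;) {
    Instruction *i = &*bbit;
    ++bbit; // step to the next original instruction before inserting anything
    CallInst *beforeCall = // INSERT THIS
    beforeCall->insertBefore(i);
    if (!i->isTerminator()) {
      // Nothing may follow a terminator, so only non-terminators get an "after" call.
      CallInst *afterCall = // INSERT THIS
      afterCall->insertAfter(i);
    }
    retval = true; // the block was modified
  }
  return retval;
}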
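The iteration concern in the loop above (inserting items into the very list you are walking) can be sketched in isolation. The following is only an illustration under assumptions: a `std::list` of strings stands in for the basic block, and the names `Block`, `isTerminator`, `instrument`, and `demo` are made up for this example and are not LLVM APIs. The idea is to remember the successor before inserting, so that the markers you add are never visited.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <string>

// Hypothetical stand-in for a basic block: a linked list of instruction names.
using Block = std::list<std::string>;

// Hypothetical helper: in this toy model only "ret" and "br" end a block.
bool isTerminator(const std::string &inst) {
  return inst == "ret" || inst == "br";
}

// Surrounds every original item with "before"/"after" markers.
// Returns true if the block was modified.
bool instrument(Block &block) {
  bool changed = false;
  for (auto it = block.begin(), end = block.end(); it != end;) {
    auto next = std::next(it);     // successor among the ORIGINAL items
    block.insert(it, "before");    // inserted in front of *it; it stays valid
    if (!isTerminator(*it))
      block.insert(next, "after"); // inserted right behind *it
    changed = true;
    it = next;                     // jump past the markers just inserted
  }
  return changed;
}

// Usage check: three original items, the last one a terminator.
void demo() {
  Block block = {"load", "store", "ret"};
  bool changed = instrument(block);
  assert(changed);
  assert(block == Block({"before", "load", "after",
                         "before", "store", "after",
                         "before", "ret"}));
}
```

`std::list` keeps existing iterators valid across insertions, which is why `it`, `next`, and `end` can be reused here. Whether the same holds for the container you actually iterate over is something to check for your LLVM version.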
Hope this helps!
I'm not very familiar with LLVM, but I am a bit more familiar with GCC (and its plugin machinery), since I am the main author of GCC MELT (a high-level domain-specific language to extend GCC, which, by the way, you could use for your problem). So I will try to answer in general terms.
You should first know why you want to adapt a compiler (or a static analyzer). It is a worthwhile goal, but it does have drawbacks (in particular, w.r.t. redefining some operators or other constructs in your C++ program).
The main point when extending a compiler (be it GCC or LLVM or something else) is that you very probably should handle all of its internal representations (and you probably cannot skip parts of them, unless you have a very narrowly defined problem). For GCC it means handling the more than 100 kinds of Tree-s and nearly 20 kinds of Gimple-s: in the GCC middle end, the tree-s represent the operands and declarations, and the gimple-s represent the instructions. The advantage of this approach is that once you've done that, your extension should be able to handle any software acceptable to the compiler. The drawback is the complexity of compilers' internal representations (which is explained by the complexity of the definitions of the C & C++ source languages accepted by the compilers, by the complexity of the target machine code they generate, and by the increasing distance between source & target languages).
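As a rough illustration of what handling every kind of node means in practice, here is a minimal sketch on a made-up toy instruction set. The `Kind` enum, the `Inst` struct, and both functions are hypothetical and are not GCC or LLVM APIs; a real representation has far more kinds. The switch deliberately has no `default:` label, so a compiler that warns about unhandled enumerators (for example `-Wswitch` in GCC and Clang) points out any kind that is added later and not yet handled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy instruction kinds.
enum class Kind { Load, Store, Add, Call, Return };

struct Inst {
  Kind kind;
};

// Decides for EVERY kind whether it may touch memory.
bool accessesMemory(const Inst &inst) {
  switch (inst.kind) {
  case Kind::Load:
  case Kind::Store:
    return true;
  case Kind::Call: // assumption of this toy model: a callee may touch memory
    return true;
  case Kind::Add:
  case Kind::Return:
    return false;
  }
  assert(false && "unreachable: every Kind is handled above");
  return false;
}

// Counts the instructions that an instrumentation pass would have to wrap.
std::size_t countMemoryAccesses(const std::vector<Inst> &code) {
  std::size_t count = 0;
  for (const Inst &inst : code)
    if (accessesMemory(inst))
      ++count;
  return count;
}
```

With five kinds this is trivial; with a hundred or more, each with its own operands and corner cases, it becomes the bulk of the work.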
So hacking a general compiler (be it GCC or LLVM), or a static analyzer (like Frama-C), is quite a big task (more than a month of work, not a few days). To deal only with tiny C++ programs like the one you are showing, it is not worth it. But it is definitely worth the effort if you plan to deal with large source software bases.
Regards