Question
My goal is to take source code in different languages (mostly C, C++, Obj-C and Haskell) and report all kinds of statistics about it (e.g. number of variables, functions, memory allocations, complexity, etc.).
LLVM seemed to be the perfect tool for this, because I can generate bitcode for these languages, and with LLVM's customizable passes I can do almost anything. For the C family it works fine. Take a C program (test.c) for example:
#include <stdio.h>

int main()
{
    int num1, num2, sum;
    printf("Enter two integers: ");
    scanf("%d %d", &num1, &num2);
    sum = num1 + num2;
    printf("Sum: %d", sum);
    return 0;
}
Then I run:
clang -emit-llvm test.c -c -o test.bc
opt -load [MY AWESOME PASS] [ARGS]
Voila, I have almost everything I need:
1 instcount - Number of Add insts
4 instcount - Number of Alloca insts
3 instcount - Number of Call insts
3 instcount - Number of Load insts
1 instcount - Number of Ret insts
2 instcount - Number of Store insts
1 instcount - Number of basic blocks
14 instcount - Number of instructions (of all types)
12 instcount - Number of memory instructions
1 instcount - Number of non-external functions
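To compare runs across languages, the lines above can be read into a small record. This is a minimal sketch in C, assuming every line has the layout `<count> instcount - <description>` shown above; the names `stat_line` and `parse_stat_line` are hypothetical and not part of LLVM.

```c
#include <stdio.h>

/* One parsed line of the statistics output,
   e.g. "4 instcount - Number of Alloca insts". */
struct stat_line {
    long count;             /* leading number */
    char description[128];  /* text after "instcount - " */
};

/* Parse a single line. Returns 1 on success, 0 if the line does not
   match the assumed "<count> instcount - <description>" layout. */
int parse_stat_line(const char *line, struct stat_line *out)
{
    if (sscanf(line, "%ld instcount - %127[^\n]",
               &out->count, out->description) != 2)
        return 0;
    return 1;
}
```

For example, passing the line `4 instcount - Number of Alloca insts` yields a count of 4 and the description `Number of Alloca insts`.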
I would like to achieve the same with Haskell programs. Take test.hs:
module Test where
quicksort [] = []
quicksort (p:xs) = (quicksort lesser) ++ [p] ++ (quicksort greater)
  where
    lesser  = filter (< p) xs
    greater = filter (>= p) xs
However, when I run:
ghc -fllvm -keep-llvm-files -fforce-recomp test.hs
opt -load [MY AWESOME PASS] [ARGS]
I get the results below, which seem completely useless for my purposes (described at the beginning of this post), because they clearly do not reflect these few lines of code. I guess it has something to do with GHC, because the generated .ll file is 52 KB, while the .ll file for the C program is only 2 KB.
31 instcount - Number of Add insts
92 instcount - Number of Alloca insts
2 instcount - Number of And insts
30 instcount - Number of BitCast insts
24 instcount - Number of Br insts
22 instcount - Number of Call insts
109 instcount - Number of GetElementPtr insts
17 instcount - Number of ICmp insts
54 instcount - Number of IntToPtr insts
326 instcount - Number of Load insts
65 instcount - Number of PtrToInt insts
22 instcount - Number of Ret insts
206 instcount - Number of Store insts
8 instcount - Number of Sub insts
46 instcount - Number of basic blocks
1008 instcount - Number of instructions (of all types)
755 instcount - Number of memory instructions
10 instcount - Number of non-external functions
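The absolute counts of the two runs are hard to compare directly (14 instructions versus 1008). One possible way to put them on the same scale is to express each instruction kind as a percentage of the total for its run. The sketch below only illustrates that arithmetic; `share_percent` and `print_load_shares` are hypothetical helpers, and whether such relative numbers are meaningful for GHC-generated IR is an open assumption.

```c
#include <stdio.h>

/* Share of one instruction kind in the whole run, in percent.
   Returns 0.0 for an empty run to avoid dividing by zero. */
double share_percent(long count, long total)
{
    if (total <= 0)
        return 0.0;
    return 100.0 * (double)count / (double)total;
}

/* Print the share of Load instructions for both runs, using the
   counts reported above (3 of 14 for C, 326 of 1008 for Haskell). */
void print_load_shares(void)
{
    printf("C loads:       %.1f%%\n", share_percent(3, 14));
    printf("Haskell loads: %.1f%%\n", share_percent(326, 1008));
}
```

With the numbers above, loads make up roughly 21% of the C run and roughly 32% of the Haskell run.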
My question is: how should I proceed so that I can compare Haskell code with the other languages without getting these huge numbers? Is it even possible? Should I continue using GHC to generate LLVM IR? What other tools should I use?
Source: https://stackoverflow.com/questions/33380547/generate-llvm-ir-from-haskell-code