I compared gcc and llvm-gcc with the -O3 option on hmmer and mcf from the SPEC CPU2006 benchmark suite. Surprisingly, I found that gcc beat llvm-gcc in both cases. Is it because the -O3 has differe
You seem surprised that gcc beat llvm on your benchmark. Phoronix hosts a bunch of interesting benchmarks in this area. For instance, have a look at:
(Lots of luvverly colours.)
As far as "How should I establish the experiments to get a fair comparison?" goes, presumably you should first pick the metric you care about (fastest runtime, fastest compile time, lowest memory footprint, most operations per Watt, or scalability over number of CPUs: you pay your money and take your choice), and then compare the fastest configuration of each compiler against the fastest configuration of the other(s).
First off, you need to at least establish the variability of each program: how repeatable the measurements are from run to run of a single program on your platform. (Yes, believable benchmarking requires thoroughness on your part.)
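As a rough illustration of establishing that variability, here is a minimal Python sketch that runs one command several times and reports the spread of its wall-clock times. The helper names `time_runs` and `summarize` are made up for this example, and the command in the `__main__` block is only a placeholder; you would substitute the hmmer or mcf binary built by each compiler.

```python
import statistics
import subprocess
import sys
import time


def time_runs(cmd, repeats=10):
    """Run cmd `repeats` times; return each run's wall-clock time in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        # Discard the program's output so terminal I/O does not skew the timing.
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples):
    """Return min, mean, sample standard deviation and coefficient of variation."""
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples) if len(samples) > 1 else 0.0
    cv = stdev / mean if mean else 0.0
    return {"min": min(samples), "mean": mean, "stdev": stdev, "cv": cv}


if __name__ == "__main__":
    # Placeholder workload (an assumption): replace with the benchmark binary.
    cmd = [sys.executable, "-c", "sum(range(100000))"]
    stats = summarize(time_runs(cmd, repeats=5))
    print("min={min:.4f}s mean={mean:.4f}s stdev={stdev:.4f}s cv={cv:.1%}".format(**stats))
```

If the gap between the two compilers' timings is no larger than the run-to-run spread reported here, the measurement does not yet tell you which one is faster.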