Inspired by this question (now visible only to users with > 10k rep), I came up with the following code:
$cat loop.c
int main(
Optimization: at the very least you are missing the -O2 flag on the gcc command line.
"What I'm missing here?" Optimization flags.
The Java JIT compiler is smart enough to optimize the loop away, while your C compiler seems to have most of the optimizations turned off.
So you are really comparing the time to start up the Java machine with the time it takes unoptimized C code to count to 2 billion.
I don't think this question really has an answer; it depends on the optimizations both compilers perform. In this case I expect that either one, given a high enough optimization level, would eliminate the loop entirely, since i is never used.
I expect javac is defaulting to some higher level of optimization than your C compiler. When I compile with -O3 here, the C is way faster:
C with -O3:
real 0m0.003s
user 0m0.000s
sys 0m0.002s
Your Java program:
real 0m0.294s
user 0m0.269s
sys 0m0.051s
Some more details; without optimization, the C compiles to:
0000000100000f18 pushq %rbp
0000000100000f19 movq %rsp,%rbp
0000000100000f1c movl %edi,0xec(%rbp)
0000000100000f1f movq %rsi,0xe0(%rbp)
0000000100000f23 movl $0x00000000,0xfc(%rbp)
0000000100000f2a incl 0xfc(%rbp)
0000000100000f2d movl $0x80000000,%eax
0000000100000f32 cmpl %eax,0xfc(%rbp)
0000000100000f35 jne 0x00000f2a
0000000100000f37 movl $0x00000000,%eax
0000000100000f3c leave
0000000100000f3d ret
With optimization (-O3), it looks like this:
0000000100000f30 pushq %rbp
0000000100000f31 movq %rsp,%rbp
0000000100000f34 xorl %eax,%eax
0000000100000f36 leave
0000000100000f37 ret
As you can see, the entire loop has been removed.
javap -c Loop gave me this output for the Java bytecode:
public static void main(java.lang.String[]);
Code:
0: iconst_0
1: istore_1
2: iload_1
3: iinc 1, 1
6: ldc #2; //int 2147483647
8: if_icmpge 14
11: goto 2
14: return
}
It appears the loop is still present in the bytecode, so I guess something happens at runtime to speed it up. (As others have mentioned, the JIT compiler squashes out the loop.)
My guess is that the JIT is optimizing away the empty loop.
Update: The Java Performance Tuning article Followup to Empty Loop Benchmark seems to support that, along with the other answers here that point out that the C code needs to also be optimized in order to make a meaningful comparison. Key quote:
Had I chosen to use the client mode 1.4.1 JVM (client is the default mode), the loops would not be optimized away. Had I chosen to use Microsoft's C++ compiler, the C version would take no time. Clearly, the choice of compiler is critical.