I would like to know why the .o file that we get from compiling a .c file that prints "Hello, World!" is larger than a Java .class file that also prints "Hello, World!"?
C programs, even though they're compiled to native machine code that runs on your processor (dispatched through the OS, of course), tend to need a lot of setup and teardown for the operating system, such as loading dynamically linked libraries like the C library.
Java, on the other hand, compiles to bytecode for a virtual platform (basically a simulated computer-within-a-computer), which is specifically designed alongside Java itself, so a lot of this overhead (if it would even be necessary, since both the code and the VM interface are well-defined) can be moved into the VM itself, leaving the program code lean.
It varies from compiler to compiler, though, and there are several options to reduce the size or build the code differently, which will have different effects.
All this said, it's not really that important.
Java uses bytecode to be platform independent and "precompiled", but bytecode is meant to be consumed by an interpreter and is designed to be compact, so it is not the same as the machine code you see in a compiled C program. Just take a look at the full process of Java compilation:
Java program
-> Bytecode
-> High-level Intermediate Representation (HIR)
-> Middle-level Intermediate Representation (MIR)
-> Low-level Intermediate Representation (LIR)
-> Register allocation
-> EMIT (Machine Code)
This is the chain that transforms a Java program into machine code. As you can see, bytecode is a long way from machine code. I couldn't find good material on the Internet that shows this path for a real program (an example); all I found is this presentation, where you can see how each step changes the code representation. I hope this answers how and why a compiled C program and Java bytecode are different.
UPDATE: All steps after "Bytecode" are done by the JVM at run time, depending on whether it decides to compile that code (that's another story... the JVM balances between interpreting bytecode and compiling it to native, platform-dependent code).
Finally found a good example, taken from Linear Scan Register Allocation for the Java HotSpot™ Client Compiler (by the way, good reading for understanding what goes on inside the JVM). Imagine that we have this Java program:
public static void fibonacci() {
    int lo = 0;
    int hi = 1;
    while (hi < 10000) {
        hi = hi + lo;
        lo = hi - lo;
        print(lo);
    }
}
then its bytecode is:
0: iconst_0
1: istore_0 // lo = 0
2: iconst_1
3: istore_1 // hi = 1
4: iload_1
5: sipush 10000
8: if_icmpge 26 // while (hi < 10000)
11: iload_1
12: iload_0
13: iadd
14: istore_1 // hi = hi + lo
15: iload_1
16: iload_0
17: isub
18: istore_0 // lo = hi - lo
19: iload_0
20: invokestatic #12 // print(lo)
23: goto 4 // end of while-loop
26: return
Each instruction takes 1 byte for the opcode (the JVM has room for 256 opcodes, but actually defines fewer than that) plus its arguments. Together this comes to 27 bytes. I'm omitting all the intermediate stages; here is the ready-to-execute machine code:
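To make the 27-byte figure easy to check, here is a rough sketch that adds up the size of every instruction in the listing above. The class and method names are made up for illustration, and the operand-size table covers only the opcodes that appear in this listing; it is not a general bytecode sizer.

```java
public class BytecodeSize {
    // The 19 instructions of the fibonacci() listing, in order.
    public static final String[] FIBONACCI = {
        "iconst_0", "istore_0", "iconst_1", "istore_1",
        "iload_1", "sipush", "if_icmpge",
        "iload_1", "iload_0", "iadd", "istore_1",
        "iload_1", "iload_0", "isub", "istore_0",
        "iload_0", "invokestatic", "goto", "return"
    };

    // Number of operand bytes that follow the one-byte opcode.
    // Assumption: only opcodes from the listing above are passed in.
    public static int operandBytes(String mnemonic) {
        switch (mnemonic) {
            case "sipush":       // 16-bit immediate value
            case "if_icmpge":    // 16-bit branch offset
            case "goto":         // 16-bit branch offset
            case "invokestatic": // 16-bit constant pool index
                return 2;
            default:             // iconst_0, iload_1, iadd, return, ...
                return 0;
        }
    }

    // One opcode byte per instruction plus its operand bytes.
    public static int totalBytes(String[] instructions) {
        int total = 0;
        for (String mnemonic : instructions) {
            total += 1 + operandBytes(mnemonic);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(totalBytes(FIBONACCI) + " bytes");
    }
}
```

Fifteen of the instructions are a bare opcode, and the remaining four carry two operand bytes each, which is where the total comes from.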
00000000: mov dword ptr [esp-3000h], eax
00000007: push ebp
00000008: mov ebp, esp
0000000a: sub esp, 18h
0000000d: mov esi, 1h
00000012: mov edi, 0h
00000017: nop
00000018: cmp esi, 2710h
0000001e: jge 00000049
00000024: add esi, edi
00000026: mov ebx, esi
00000028: sub ebx, edi
0000002a: mov dword ptr [esp], ebx
0000002d: mov dword ptr [ebp-8h], ebx
00000030: mov dword ptr [ebp-4h], esi
00000033: call 00a50d40
00000038: mov esi, dword ptr [ebp-4h]
0000003b: mov edi, dword ptr [ebp-8h]
0000003e: test dword ptr [370000h], eax
00000044: jmp 00000018
00000049: mov esp, ebp
0000004b: pop ebp
0000004c: test dword ptr [370000h], eax
00000052: ret
In total it takes 83 bytes (the last instruction sits at offset 52 hex, i.e. 82, plus 1 byte for the ret itself).
PS. I'm not taking linking into account (others have mentioned it), nor the headers of the compiled C and bytecode files (they probably differ too; I don't know how it works in C, but in a bytecode file all strings are moved into a special constant pool in the header, and the program refers to them by their "position" in that pool, etc.).
UPDATE2: It's probably worth mentioning that Java bytecode works with a stack (the istore/iload instructions), whereas machine code for x86 and most other platforms works with registers. As you can see, the machine code is "full" of registers, and that adds extra size to the compiled program compared with the simpler stack-based bytecode.
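The idea of keeping strings in a pool and referring to them by position can be sketched with a toy model. This is only an illustration: the class name and methods are hypothetical, and the real class file pool holds several kinds of typed entries, not just strings. The one detail borrowed from the real format is that indices start at 1.

```java
import java.util.ArrayList;
import java.util.List;

public class ToyConstantPool {
    private final List<String> entries = new ArrayList<>();

    // Returns the 1-based index of the value, adding it on first use.
    public int indexOf(String value) {
        int i = entries.indexOf(value);
        if (i < 0) {
            entries.add(value);
            i = entries.size() - 1;
        }
        return i + 1;
    }

    // Looks a value up again from its 1-based index.
    public String get(int index) {
        return entries.get(index - 1);
    }

    public int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        ToyConstantPool pool = new ToyConstantPool();
        int first = pool.indexOf("Hello, World!");
        int second = pool.indexOf("Hello, World!"); // reuses the same entry
        System.out.println(first + " " + second + " " + pool.size());
    }
}
```

The code itself then only needs a small index (two bytes in the invokestatic #12 instruction above) no matter how long the referenced name or string is.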
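To show what "works with a stack" means in practice, here is a minimal sketch of a stack machine that executes the loop body from the bytecode listing. It is a toy, not how the JVM is implemented: the class name is invented, and for simplicity it writes the local variable slot as a separate token ("iload 1") instead of the one-byte iload_1 form.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class TinyStackMachine {
    // Loop body of fibonacci(): hi = hi + lo; lo = hi - lo;
    // Slot 0 holds lo, slot 1 holds hi, as in the listing.
    public static final String[] LOOP_BODY = {
        "iload 1", "iload 0", "iadd", "istore 1",
        "iload 1", "iload 0", "isub", "istore 0"
    };

    // Runs the program against the given local variable slots and returns them.
    public static int[] run(String[] program, int[] locals) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (String insn : program) {
            String[] parts = insn.split(" ");
            switch (parts[0]) {
                case "iload":  // push a local variable onto the stack
                    stack.push(locals[Integer.parseInt(parts[1])]);
                    break;
                case "istore": // pop the top of the stack into a local variable
                    locals[Integer.parseInt(parts[1])] = stack.pop();
                    break;
                case "iadd": {
                    int b = stack.pop(), a = stack.pop();
                    stack.push(a + b);
                    break;
                }
                case "isub": {
                    int b = stack.pop(), a = stack.pop();
                    stack.push(a - b);
                    break;
                }
                default:
                    throw new IllegalArgumentException("unsupported: " + insn);
            }
        }
        return locals;
    }

    public static void main(String[] args) {
        int[] locals = {0, 1}; // lo = 0, hi = 1
        for (int i = 0; i < 3; i++) {
            run(LOOP_BODY, locals);
        }
        System.out.println("lo=" + locals[0] + " hi=" + locals[1]);
    }
}
```

Note that iadd and isub name no operands at all: they always take the top two stack entries. A register instruction such as add esi, edi has to encode which registers it uses, which is part of why each native instruction tends to be longer.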
In short: Java programs are compiled to Java byte code, which requires a separate interpreter (Java Virtual Machine) to be executed.
There is no 100% guarantee that the .o file produced by the C compiler is smaller than the .class file produced by the Java compiler. It all depends on the implementation of the compiler.