The main culprit in the "long startup time" is dynamic linking. A Java application consists of compiled classes. Each class references other classes (for argument types, method invocations, and so on) by name. The JVM must resolve and match those names upon startup. It does so incrementally, processing only the parts it needs at any given time, but that is still work to do.
In a C application, that linking phase occurs at the end of compilation. It is slow, especially for big applications, but only the developer sees it. Linking yields an executable file which the OS simply loads into RAM "as is".
In Java, linking occurs every single time that the application is run. Hence the long startup time.
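The lazy part of this is easy to observe: a class is loaded, linked, and initialized at its first active use, not when the program starts. A minimal sketch (class names are mine, chosen for illustration):

```java
// Demonstrates lazy linking/initialization: the JVM does not touch
// the nested class Heavy until the first time it is actually used.
public class LazyLinkDemo {
    static class Heavy {
        // Static initializer runs only when Heavy is initialized,
        // i.e. at its first active use.
        static { System.out.println("Heavy linked and initialized"); }
        static int answer() { return 42; }
    }

    public static void main(String[] args) {
        System.out.println("main started");    // Heavy not yet initialized here
        System.out.println(Heavy.answer());    // triggers loading/linking/init now
    }
}
```

Running it prints "main started" before "Heavy linked and initialized", showing that the name-resolution work is deferred until needed; it is still paid at run time, just spread out.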
Various optimizations have been applied, including caching techniques, and computers keep getting faster (indeed, they get faster more quickly than applications get bigger), so the problem has become much less significant lately; but the old prejudice remains.
As for performance afterwards, my own benchmarks on compact computations with array accesses (mostly hash functions and other cryptographic algorithms) usually show that optimized C code is about 3x faster than Java code; sometimes C is only 30% faster than Java, sometimes 4x faster, depending on the implemented algorithm. I saw a 10x factor when the "C" code was actually assembly for big-integer arithmetic, due to the 64x64->128 multiplication opcodes that the processor offers but Java cannot use, because its longest integer type is the 64-bit `long`. This is an edge case. Under practical conditions, I/O and memory bandwidth considerations prevent C code from being really three times faster than Java.
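To make the big-integer case concrete, here is a sketch of what pure Java has to do to get the full 128-bit product of two 64-bit values: split each operand into 32-bit halves and recombine, where the hardware does it in a single `MUL` instruction. (Newer JDKs expose `Math.multiplyHigh` for the high word; the class and method names below are otherwise mine.)

```java
// Full 64x64->128-bit unsigned multiply in pure Java, using 32-bit limbs.
// One x86-64 MUL instruction computes this directly; the emulation below
// needs four multiplies plus carry handling, which is where much of the
// ~10x gap on big-integer code comes from.
public final class Mul128 {
    /** Returns {high, low} of a * b, both treated as unsigned 64-bit. */
    public static long[] mulu64(long a, long b) {
        long aLo = a & 0xFFFFFFFFL, aHi = a >>> 32;
        long bLo = b & 0xFFFFFFFFL, bHi = b >>> 32;
        long lo   = aLo * bLo;
        long mid1 = aHi * bLo + (lo >>> 32);            // cannot overflow
        long mid2 = aLo * bHi + (mid1 & 0xFFFFFFFFL);   // cannot overflow
        long hi   = aHi * bHi + (mid1 >>> 32) + (mid2 >>> 32);
        long low  = (mid2 << 32) | (lo & 0xFFFFFFFFL);
        return new long[] { hi, low };
    }

    public static void main(String[] args) {
        // (2^64 - 1)^2 = 0xFFFFFFFFFFFFFFFE_0000000000000001
        long[] r = mulu64(0xFFFFFFFFFFFFFFFFL, 0xFFFFFFFFFFFFFFFFL);
        System.out.println(Long.toHexString(r[0]) + " " + Long.toHexString(r[1]));
    }
}
```

A big-integer multiplication runs this inner step millions of times, so the per-step overhead compounds; that is the kind of workload where assembly (or C with compiler support for 128-bit products) pulls far ahead.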