I wrote the following programs to compare the speed of python with c/fortran. To get the time used by the programs, I used the "time" command. All the programs compute the sq…
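A minimal sketch of what the Python side of such a benchmark might look like (the computation above is cut off, so the sum-of-square-roots body here is purely an assumed stand-in):

    # benchmark.py -- assumed stand-in workload; the original
    # computation is truncated in the question above.
    import math

    def main(n=10_000_000):
        total = 0.0
        for i in range(n):
            total += math.sqrt(i)   # simple floating point work per iteration
        print(total)

    if __name__ == "__main__":
        main()

Run it as "time python benchmark.py" and compare against "time ./a.out" for the compiled versions; keep in mind that "time" measures the whole process, interpreter startup included.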
I have recently done a similar test with a more realistic real-world algorithm. It involves numpy, Matlab, FORTRAN and C# (via ILNumerics). Without specific optimizations, numpy appears to generate much less efficient code than the others. Of course, as always, this can only suggest a general trend: you will be able to write FORTRAN code which in the end runs slower than a corresponding numpy implementation. But most of the time, numpy will be much slower. My (averaged) test results consistently showed numpy trailing the compiled alternatives.
In order to time such simple floating point operations as in your example, it all comes down to the compiler's ability to generate 'optimal' machine instructions. Here, it is not so important how many compilation steps are involved. .NET and numpy use more than one step, first compiling to byte code which then executes in a virtual machine, but the options to optimize the result equally exist - in theory. In practice, modern FORTRAN and C compilers are better at optimizing for execution speed. As one example, they use floating point extensions (SSE, AVX) and do better loop unrolling. numpy (or rather CPython, on which numpy mostly runs) seems to perform worse at this point. If you want to verify which framework is best for your task, you can attach a debugger and inspect the final machine instructions of the executable.
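To make the interpreted-vs-compiled point concrete, here is a small sketch (my own, not part of the test above) that times a pure Python loop against a vectorized numpy call. The vectorized version dispatches once into numpy's compiled C loops, which is where vector extensions like SSE/AVX can actually be used:

    import timeit
    import numpy as np

    n = 1_000_000
    a = np.random.rand(n)

    # Interpreted loop: every iteration runs through the CPython bytecode VM.
    def python_loop():
        s = 0.0
        for x in a:
            s += x * x
        return s

    # Vectorized call: a single dispatch, then the inner loop runs in
    # compiled C, where the compiler may have emitted SSE/AVX instructions.
    def numpy_vectorized():
        return np.dot(a, a)

    print(timeit.timeit(python_loop, number=3))
    print(timeit.timeit(numpy_vectorized, number=3))

On a typical machine the vectorized call is orders of magnitude faster, which illustrates how much of numpy's speed depends on staying out of the interpreter.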
However, keep in mind that in a more realistic scenario, floating point performance is only important at the very end of a long optimization chain. The difference is often masked by a much stronger effect: memory bandwidth. As soon as you start handling arrays (which is common in most scientific applications), you will have to take the cost of memory management into account. Frameworks differ in how well they support the algorithm author in writing memory efficient algorithms. In my opinion, numpy makes it harder to write memory efficient algorithms than FORTRAN or C. But it is not easy in any of those languages. (ILNumerics improves this considerably.)
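As one concrete illustration of the memory issue in numpy (a sketch of mine, not from the test above): every binary operation in an expression allocates a full intermediate array, so the data is streamed through memory several times, while the out= argument lets you reuse a buffer:

    import numpy as np

    a = np.random.rand(10_000_000)
    b = np.random.rand(10_000_000)
    c = np.random.rand(10_000_000)

    # Naive: each binary op allocates a full temporary array.
    r = a + b + c            # temp = a + b; r = temp + c

    # More memory-friendly: reuse one output buffer and avoid the
    # intermediate allocation.
    r = np.add(a, b)
    np.add(r, c, out=r)

For large arrays the second variant touches less memory, but as you can see it also makes the code noticeably less readable, which is exactly the trade-off described above.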
Another important point is parallelization. Does the framework support you in executing your computations in parallel? And how efficiently is it done? Again, my personal opinion: neither C nor FORTRAN nor numpy makes it easy to parallelize your algorithms. But FORTRAN and C at least give you the chance to do so, even if it sometimes requires special compilers. Other frameworks (ILNumerics, Matlab) parallelize automatically.
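On the numpy side, one manual route is the multiprocessing module; the following is a hypothetical sketch (chunk count and array size are arbitrary), computing a vector norm across four worker processes:

    import math
    import numpy as np
    from multiprocessing import Pool

    def partial_sumsq(chunk):
        # CPU-bound kernel for one slice; each worker is a separate
        # process, so the GIL is not a bottleneck here.
        return float((chunk * chunk).sum())

    if __name__ == "__main__":
        data = np.random.rand(8_000_000)
        chunks = np.array_split(data, 4)      # one piece per worker
        with Pool(processes=4) as pool:
            total = math.sqrt(sum(pool.map(partial_sumsq, chunks)))
        print(total)

Note that each chunk gets pickled and copied into the worker processes, so for small arrays the overhead can easily eat the speedup; this underlines the point that none of these frameworks makes parallelization free.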
If you are in need of 'peak performance' for very small but costly algorithms, you will mostly be better off using FORTRAN or C, simply because in the end they generate better machine code (on a uniprocessor system). However, writing larger algorithms in C or FORTRAN while taking memory efficiency and parallelism into account often gets cumbersome. Here, higher level frameworks (like numpy, ILNumerics or Matlab) outdo lower level languages. And if done right, the difference in execution speed is often negligible. Unfortunately, this is often not true in the case of numpy.