I'm finding massive performance differences between similar code in C and C#.
The C code is:
#include
#include
#inclu
My first guess is a compiler optimization, because you never use root. You just assign it, then overwrite it again and again.
Edit: damn, beat by 9 seconds!
Since you never use 'root', the compiler may have removed the call while optimizing your method.
You could try to accumulate the square root values into an accumulator, print it out at the end of the method, and see what's going on.
Edit: see Jalf's answer below.
Whatever the time difference may be, that "elapsed time" is invalid. It is only valid if you can guarantee that both programs run under exactly the same conditions.
Maybe you should try a Windows equivalent of $ /usr/bin/time my_cprog; /usr/bin/time my_csprog
Actually guys, the loop is NOT being optimized away. I compiled John's code and examined the resulting .exe. The guts of the loop are as follows:
IL_0005: stloc.0
IL_0006: ldc.i4.0
IL_0007: stloc.1
IL_0008: br.s IL_0016
IL_000a: ldloc.1
IL_000b: conv.r8
IL_000c: call float64 [mscorlib]System.Math::Sqrt(float64)
IL_0011: pop
IL_0012: ldloc.1
IL_0013: ldc.i4.1
IL_0014: add
IL_0015: stloc.1
IL_0016: ldloc.1
IL_0017: ldc.i4 0x5f5e100
IL_001c: ble.s IL_000a
Unless the runtime is smart enough to realize the loop does nothing and skips it?
Edit: Changing the C# to be:
static void Main(string[] args)
{
    DateTime startTime = DateTime.Now;
    double root = 0.0;
    for (int i = 0; i <= 100000000; i++)
    {
        root += Math.Sqrt(i);
    }
    System.Console.WriteLine(root);
    TimeSpan runTime = DateTime.Now - startTime;
    Console.WriteLine("Time elapsed: " +
        Convert.ToString(runTime.TotalMilliseconds / 1000));
}
This results in the time elapsed (on my machine) going from 0.047 to 2.17 seconds. But is that just the overhead of adding 100 million addition operations?
I'm a C++ and a C# developer. I've developed C# applications since the first beta of the .NET framework and I've had more than 20 years of experience in developing C++ applications. Firstly, C# code will NEVER be faster than a C++ application, but I won't go through a lengthy discussion about managed code, how it works, the inter-op layer, memory management internals, the dynamic type system and the garbage collector. Nevertheless, let me continue by saying that the benchmarks listed here all produce INCORRECT results.
Let me explain. The first thing we need to consider is the JIT compiler for C# (.NET Framework 4). The JIT produces native code for the CPU using various optimization algorithms (which tend to be more aggressive than the default C++ optimizer that comes with Visual Studio), and the instruction set used by the .NET JIT compiler is a closer reflection of the actual CPU on the machine. Certain substitutions in the machine code can therefore be made to reduce clock cycles, improve the hit rate in the CPU pipeline cache, and produce further hyper-threading optimizations such as instruction reordering and improvements relating to branch prediction.
What this means is that unless you compile your C++ application using the correct parameters for the RELEASE build (not the DEBUG build), your C++ application may perform more slowly than the corresponding C# or .NET based application. When specifying the project properties on your C++ application, make sure you enable "full optimization" and "favour fast code". If you have a 64 bit machine, you MUST specify x64 as the target platform, otherwise your code will be executed through a conversion sub-layer (WOW64) which will substantially reduce performance.
Once the correct compiler optimizations are enabled, I get 0.72 seconds for the C++ application and 1.16 seconds for the C# application (both in release build). Since the C# application is very basic and allocates the memory used in the loop on the stack and not on the heap, it actually performs a lot better than a real application involving objects, heavy computations and larger data sets would. So the figures provided are optimistic figures biased towards C# and the .NET framework. Even with this bias, the C++ application completes in just over half the time of the equivalent C# application. Keep in mind that the Microsoft C++ compiler I used did not apply the right pipeline and hyperthreading optimizations (I used WinDBG to view the assembly instructions).
Now if we use the Intel compiler (which by the way is an industry secret for generating high performance applications on AMD/Intel processors), the same code executes in 0.54 seconds for the C++ executable vs the 0.72 seconds using Microsoft Visual Studio 2010. So in the end, the final results are 0.54 seconds for C++ and 1.16 seconds for C#. The code produced by the .NET JIT compiler therefore takes about 2.15 times as long as the C++ executable (1.16 / 0.54). Most of the time spent in the 0.54 seconds was in getting the time from the system and not within the loop itself!
Also missing from the statistics are the startup and cleanup times, which are not included in the timings. C# applications tend to spend a lot more time on start-up and on termination than C++ applications. The reason behind this is complicated and has to do with the .NET runtime code validation routines and the memory management subsystem, which performs a lot of work at the beginning (and consequently, the end) of the program to optimize the memory allocations and the garbage collector.
When measuring the performance of C++ and .NET IL, it is important to look at the assembly code to make sure that ALL the calculations are there. What I found is that without putting some additional code in C#, most of the code in the examples above was actually removed from the binary. This was also the case with C++ when you used a more aggressive optimizer such as the one that comes with the Intel C++ compiler. The results I provided above are 100% correct and validated at the assembly level.
The main problem with a lot of forums on the internet is that a lot of newbies listen to Microsoft marketing propaganda without understanding the technology and make false claims that C# is faster than C++. The claim is that in theory, C# is faster than C++ because the JIT compiler can optimize the code for the CPU. The problem with this theory is that there is a lot of plumbing in the .NET framework that slows the performance, plumbing which does not exist in a C++ application. Furthermore, an experienced developer will know the right compiler to use for the given platform and use the appropriate flags when compiling the application. On Linux or open source platforms, this is not a problem because you can distribute your source and create installation scripts that compile the code using the appropriate optimization. On Windows or closed source platforms, you will have to distribute multiple executables, each with specific optimizations. The Windows binaries that will be deployed are based on the CPU detected by the msi installer (using custom actions).
I put together (based on your code) two more comparable tests in C and C#. These two write a smaller array using the modulus operator for indexing (it adds a little overhead, but hey, we're trying to compare performance [at a crude level]).
C code:
#include <stdlib.h>
#include <stdio.h>
#include <time.h>
#include <math.h>
int main(void)
{
    int count = (int)1e8;
    int subcount = 1000;
    double* roots = (double*)malloc(sizeof(double) * subcount);
    clock_t start = clock();
    for (int i = 0; i < count; i++)
    {
        roots[i % subcount] = sqrt((double)i);
    }
    clock_t end = clock();
    double length = ((double)end - start) / CLOCKS_PER_SEC;
    printf("Time elapsed: %f\n", length);
    free(roots);
    return 0;
}
In C#:
using System;
namespace CsPerfTest
{
    class Program
    {
        static void Main(string[] args)
        {
            int count = (int)1e8;
            int subcount = 1000;
            double[] roots = new double[subcount];
            DateTime startTime = DateTime.Now;
            for (int i = 0; i < count; i++)
            {
                roots[i % subcount] = Math.Sqrt(i);
            }
            TimeSpan runTime = DateTime.Now - startTime;
            Console.WriteLine("Time elapsed: " + Convert.ToString(runTime.TotalMilliseconds / 1000));
        }
    }
}
These tests write data to an array (so the .NET runtime shouldn't be allowed to cull the sqrt op), although the array is significantly smaller (I didn't want to use excessive memory). I compiled these in release config and ran them from inside a console window (instead of starting them through VS).
On my computer the C# program varies between 6.2 and 6.9 seconds, while the C version varies between 6.9 and 7.1.