Is there any performance difference between ++i and i++ in C#?

Anonymous (unverified), submitted 2019-12-03 08:33:39

Question:

Is there any performance difference between using something like

for(int i = 0; i 

and

for(int i = 0; i 

or is the compiler able to optimize in such a way that they are equally fast in the case where they are functionally equivalent?

Edit: This was asked because I had a discussion with a co-worker about it, not because I think it's a useful optimization in any practical sense. It is largely academic.

Answer 1:

There is no difference in the generated intermediate code for ++i and i++ in this case. Given this program:

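    using System;

    class Program
    {
        const int counter = 1024 * 1024;

        static void Main(string[] args)
        {
            for (int i = 0; i < counter; ++i)
            {
                Console.WriteLine(i);
            }

            for (int i = 0; i < counter; i++)
            {
                Console.WriteLine(i);
            }
        }
    }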

The generated IL code is the same for both loops:

  IL_0000:  ldc.i4.0
  IL_0001:  stloc.0
  // Start of first loop
  IL_0002:  ldc.i4.0
  IL_0003:  stloc.0
  IL_0004:  br.s       IL_0010
  IL_0006:  ldloc.0
  IL_0007:  call       void [mscorlib]System.Console::WriteLine(int32)
  IL_000c:  ldloc.0
  IL_000d:  ldc.i4.1
  IL_000e:  add
  IL_000f:  stloc.0
  IL_0010:  ldloc.0
  IL_0011:  ldc.i4     0x100000
  IL_0016:  blt.s      IL_0006
  // Start of second loop
  IL_0018:  ldc.i4.0
  IL_0019:  stloc.0
  IL_001a:  br.s       IL_0026
  IL_001c:  ldloc.0
  IL_001d:  call       void [mscorlib]System.Console::WriteLine(int32)
  IL_0022:  ldloc.0
  IL_0023:  ldc.i4.1
  IL_0024:  add
  IL_0025:  stloc.0
  IL_0026:  ldloc.0
  IL_0027:  ldc.i4     0x100000
  IL_002c:  blt.s      IL_001c
  IL_002e:  ret

That said, it's possible (although highly unlikely) that the JIT compiler can do some optimizations in certain contexts that will favor one version over the other. If there is such an optimization, though, it would likely only affect the final (or perhaps the first) iteration of a loop.

In short, there will be no difference in the runtime of simple pre-increment or post-increment of the control variable in the looping construct that you've described.
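To make "functionally equivalent" concrete, here is a minimal sketch (the class and method names are made up for illustration). The two forms only behave differently when the value of the increment expression is actually used; as a standalone statement, which is what the loop header has, both simply add one to the variable.

```csharp
using System;

static class IncrementDemo
{
    // Standalone statements: the value of the expression is discarded,
    // so both forms just add one to the variable.
    public static int StandalonePre(int i) { ++i; return i; }
    public static int StandalonePost(int i) { i++; return i; }

    // Value used: ++i yields the new value, i++ yields the old one.
    public static int UsedPre(int i) { int r = ++i; return r; }
    public static int UsedPost(int i) { int r = i++; return r; }

    static void Main()
    {
        Console.WriteLine(StandalonePre(5));  // 6
        Console.WriteLine(StandalonePost(5)); // 6
        Console.WriteLine(UsedPre(5));        // 6
        Console.WriteLine(UsedPost(5));       // 5
    }
}
```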



Answer 2:

Ah... Open again. OK. Here's the deal.

ILDASM is a start, but not an end. The key is: What will the JIT generate for assembly code?

Here's what you want to do.

Take a couple samples of what you are trying to look at. Obviously you can wall-clock time them if you want - but I assume you want to know more than that.

Here's what's not obvious. The C# compiler generates some MSIL sequences that are non-optimal in a lot of situations. The JIT is tuned to deal with these and with quirks from other languages. The problem: only the 'quirks' someone has noticed have been tuned.

You really want to make a sample that has your implementations to try, returns back up to main (or wherever), Sleep()s, or something where you can attach a debugger, then run the routines again.

You DO NOT want to start the code under the debugger or the JIT will generate non-optimized code - and it sounds like you want to know how it will behave in a real environment. The JIT does this to maximize debug info and minimize the current source location from 'jumping around'. Never start a perf evaluation under the debugger.

OK. So once the code has run once (i.e., the JIT has generated code for it), attach the debugger during the sleep (or whatever). Then look at the x86/x64 that was generated for the two routines.

My gut tells me that if you are using ++i/i++ as you described (i.e., in a standalone expression where the rvalue result is not re-used), there won't be a difference. But won't it be fun to go find out and see all the neat stuff! :)
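A rough sketch of the kind of harness described above, under a few assumptions of mine: the routine names are invented, Console.ReadLine() stands in for the Sleep(), and NoInlining is there only so each routine stays a separate method that is easy to find in the disassembly. Build it in Release and start it without the debugger.

```csharp
using System;
using System.Runtime.CompilerServices;

static class JitInspection
{
    // NoInlining keeps each routine as its own method, so its machine
    // code is easy to locate once the debugger is attached.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static long SumPre(int n)
    {
        long sum = 0;
        for (int i = 0; i < n; ++i) sum += i;
        return sum;
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    public static long SumPost(int n)
    {
        long sum = 0;
        for (int i = 0; i < n; i++) sum += i;
        return sum;
    }

    static void Main()
    {
        // First calls force the JIT to compile both routines.
        Console.WriteLine(SumPre(1000));
        Console.WriteLine(SumPost(1000));

        // Pause so a debugger can be attached to the running process.
        Console.WriteLine("Attach a debugger now, then press Enter.");
        Console.ReadLine();

        // Second calls run the already-compiled code; set breakpoints
        // here and compare the disassembly of the two routines.
        Console.WriteLine(SumPre(1000));
        Console.WriteLine(SumPost(1000));
    }
}
```

One caveat: on newer runtimes with tiered compilation, the code produced on the first call may not be the fully optimized version, so it can be worth calling the routines many times before attaching.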



Answer 3:

Guys, guys, the "answers" are for C and C++.

C# is a different animal.

Use ILDASM to look at the compiled output to verify if there is an MSIL difference.



Answer 4:

Have a concrete piece of code and CLR release in mind? If so, benchmark it. If not, forget about it. Micro-optimization, and all that... Besides, you can't even be sure different CLR releases will produce the same result.



Answer 5:

As Jim Mischel has shown, the compiler will generate identical MSIL for the two ways of writing the for-loop.

That settles it: there is no reason to speculate about the JIT or to run speed measurements. If the two lines of code generate identical MSIL, they will not only perform identically, they are effectively identical.

No possible JIT would be able to distinguish between the loops, so the generated machine code must necessarily be identical, too.



Answer 6:

In addition to the other answers, there can be a difference if your i is not an int. In C++, if it is an object of a class that overloads operator++() and operator++(int), then it can make a difference, and possibly have a side effect. Performance of ++i should be better in this case (dependent on the implementation).
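For comparison, C# does not have separate prefix and postfix overloads: a type declares a single ++ operator and the compiler uses it for both forms, keeping the old value around for the postfix case. A small sketch with a hypothetical Counter struct (the type and the call counter are mine, purely for illustration):

```csharp
using System;

// Hypothetical counter type, only used for illustration.
struct Counter
{
    public int Value;
    public static int OperatorCalls; // counts invocations of operator ++

    public Counter(int value) { Value = value; }

    // C# declares one ++ operator; the compiler applies it for both
    // the prefix and the postfix form.
    public static Counter operator ++(Counter c)
    {
        OperatorCalls++;
        return new Counter(c.Value + 1);
    }
}

static class OverloadDemo
{
    static void Main()
    {
        var a = new Counter(0);
        var b = new Counter(0);

        Counter pre = ++a;   // pre holds the incremented value
        Counter post = b++;  // post holds the value before the increment

        Console.WriteLine(pre.Value);             // 1
        Console.WriteLine(post.Value);            // 0
        Console.WriteLine(a.Value);               // 1
        Console.WriteLine(b.Value);               // 1
        Console.WriteLine(Counter.OperatorCalls); // 2
    }
}
```

Both forms invoke the operator exactly once; whether the extra copy kept for the postfix form survives JIT optimization would have to be checked in the generated machine code.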



Answer 7:

If you're asking this question, you're trying to solve the wrong problem.

The first question to ask is "how do I improve customer satisfaction with my software by making it run faster?" and the answer is almost never "use ++i instead of i++" or vice versa.

From Coding Horror's post "Hardware is Cheap, Programmers are Expensive":

Rules of Optimization:
Rule 1: Don't do it.
Rule 2 (for experts only): Don't do it yet.
-- M.A. Jackson

I read rule 2 to mean "first write clean, clear code that meets your customer's needs, then speed it up where it's too slow". It's highly unlikely that ++i vs. i++ is going to be the solution.



Answer 8:

According to this answer, i++ uses one CPU instruction more than ++i. But whether this results in a performance difference, I don't know.

Since either loop can easily be rewritten to use either a post-increment or a pre-increment, I guess that the compiler will always use the more efficient version.



Answer 9:

    static void Main(string[] args) {
        var sw = new Stopwatch();
        sw.Start();
        for (int i = 0; i 

Average from 3 runs:
for with i++: 1307
for with ++i: 1314

while with i++: 1261
while with ++i: 1276

That's a Celeron D at 2.53 GHz. Each iteration took about 1.6 CPU cycles. That means either that the CPU was executing more than one instruction per cycle or that the JIT compiler unrolled the loops. The difference between i++ and ++i was only 0.01 CPU cycles per iteration, probably caused by OS services running in the background.
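The cycles-per-iteration figure is just elapsed time multiplied by clock frequency, divided by the iteration count. A minimal sketch of that arithmetic wrapped around a Stopwatch measurement; the iteration count below is my own pick (it is not the one used for the numbers above), the helper name is invented, and the 2.53 GHz clock only applies to the machine described, so substitute your own.

```csharp
using System;
using System.Diagnostics;

static class CycleEstimate
{
    // Converts a wall-clock time into approximate CPU cycles per iteration.
    public static double CyclesPerIteration(double elapsedMs, double clockHz, long iterations)
    {
        return elapsedMs / 1000.0 * clockHz / iterations;
    }

    static void Main()
    {
        const int iterations = 100000000; // assumed count, pick any
        const double clockHz = 2.53e9;    // clock of the machine above; use your own

        // The sum keeps the loop body from being trivially empty.
        long sum = 0;
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) sum += i;
        sw.Stop();
        double post = CyclesPerIteration(sw.Elapsed.TotalMilliseconds, clockHz, iterations);

        sum = 0;
        sw.Restart();
        for (int i = 0; i < iterations; ++i) sum += i;
        sw.Stop();
        double pre = CyclesPerIteration(sw.Elapsed.TotalMilliseconds, clockHz, iterations);

        Console.WriteLine("i++: {0:F3} cycles/iteration", post);
        Console.WriteLine("++i: {0:F3} cycles/iteration", pre);
        Console.WriteLine("checksum: {0}", sum);
    }
}
```

Run it several times in a Release build without the debugger attached; differences in the second or third decimal place are well within noise.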


