Is there a performance difference between i++ and ++i in C?

遥遥无期 2020-11-22 10:08

Is there a performance difference between i++ and ++i if the resulting value is not used?

14 answers
  • 2020-11-22 10:37

    Executive summary: No.

    i++ could potentially be slower than ++i, since the old value of i might need to be saved for later use, but in practice all modern compilers will optimize this away.

    We can demonstrate this by looking at the code for this function, both with ++i and i++.

    $ cat i++.c
    extern void g(int i);
    void f()
    {
        int i;
    
        for (i = 0; i < 100; i++)
            g(i);
    
    }
    

    The files are the same, except for ++i and i++:

    $ diff i++.c ++i.c
    6c6
    <     for (i = 0; i < 100; i++)
    ---
    >     for (i = 0; i < 100; ++i)
    

    We'll compile them, and also get the generated assembler:

    $ gcc -c i++.c ++i.c
    $ gcc -S i++.c ++i.c
    

    And we can see that both the generated object and assembler files are the same.

    $ md5 i++.s ++i.s
    MD5 (i++.s) = 90f620dda862cd0205cd5db1f2c8c06e
    MD5 (++i.s) = 90f620dda862cd0205cd5db1f2c8c06e
    
    $ md5 *.o
    MD5 (++i.o) = dd3ef1408d3a9e4287facccec53f7d22
    MD5 (i++.o) = dd3ef1408d3a9e4287facccec53f7d22
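
    The same comparison can be tried within a single file. This is a minimal sketch: count_postfix and count_prefix are hypothetical names, not part of the test above, and the two functions differ only in the form of the increment. Compiling it with gcc -S and comparing the two function bodies should show matching instructions on a modern compiler, but that is something to check on your own toolchain rather than take from this sketch.

    ```c
    /* Hypothetical helpers for illustration; they differ only in i++ vs ++i. */
    int count_postfix(int n)
    {
        int calls = 0;
        int i;

        for (i = 0; i < n; i++)   /* value of i++ is discarded */
            calls++;
        return calls;
    }

    int count_prefix(int n)
    {
        int calls = 0;
        int i;

        for (i = 0; i < n; ++i)   /* value of ++i is discarded */
            calls++;
        return calls;
    }
    ```

    Both functions run the loop body the same number of times; any difference could only show up in the generated code, which is what the md5 comparison checks.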
    
  • 2020-11-22 10:40

    Here's an additional observation if you're worried about micro-optimisation. Decrementing loops can possibly be more efficient than incrementing loops (depending on the instruction set architecture, e.g. ARM). Consider:

    for (i = 0; i < 100; i++)
    

    On each iteration you will have one instruction each for:

    1. Adding 1 to i.
    2. Comparing whether i is less than 100.
    3. A conditional branch if i is less than 100.

    Whereas a decrementing loop:

    for (i = 100; i != 0; i--)
    

    The loop will have an instruction for each of:

    1. Decrement i, setting the CPU register status flag.
    2. A conditional branch depending on CPU register status (Z==0).

    Of course this works only when decrementing to zero!
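
    As a rough illustration of the two loop shapes, here is a minimal sketch that sums an array both ways. The names sum_up and sum_down are hypothetical, and the use of a[i - 1] in the decrementing version is an assumption made here so that both loops touch the same elements. Whether the decrementing form actually saves an instruction depends on the target and the compiler.

    ```c
    #include <stddef.h>

    /* Hypothetical helper: incrementing loop, compares i against n each time. */
    int sum_up(const int *a, size_t n)
    {
        int total = 0;
        size_t i;

        for (i = 0; i < n; i++)
            total += a[i];
        return total;
    }

    /* Hypothetical helper: decrementing loop, terminates when i reaches zero. */
    int sum_down(const int *a, size_t n)
    {
        int total = 0;
        size_t i;

        for (i = n; i != 0; i--)
            total += a[i - 1];   /* i runs n..1, so index with i - 1 */
        return total;
    }
    ```

    Note that the decrementing loop visits the elements in reverse order, so it is only a drop-in replacement when the order of iteration does not matter.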

    Remembered from the ARM System Developer's Guide.

  • 2020-11-22 10:40

    @Mark Just because the compiler is allowed to optimize away the (stack-based) temporary copy of the variable, and gcc (in recent versions) does so, doesn't mean all compilers will always do so.

    I just tested it with the compilers we use in our current project and 3 out of 4 do not optimize it.

    Never assume the compiler gets it right, especially when the possibly faster, but never slower, code is just as easy to read.

    If you don't have a really stupid implementation of one of the operators in your code:

    Always prefer ++i over i++.

  • 2020-11-22 10:43

    Short answer:

    There is never any difference between i++ and ++i in terms of speed. A good compiler should not generate different code in the two cases.

    Long answer:

    What every other answer fails to mention is that the difference between ++i and i++ only makes sense within the expression in which it is found.

    In the case of for(i=0; i<n; i++), the i++ is alone in its own expression: there is a sequence point before the i++ and there is one after it. Thus the only machine code generated is "increase i by 1", and it is well-defined how this is sequenced in relation to the rest of the program. So if you changed it to prefix ++, it wouldn't matter in the slightest; you would still just get the machine code "increase i by 1".

    The difference between ++i and i++ only matters in expressions such as array[i++] = x; versus array[++i] = x;. Some may argue that the postfix form will be slower in such operations because the register where i resides has to be reloaded later. But note that the compiler is free to order your instructions in any way it pleases, as long as it doesn't "break the behavior of the abstract machine", as the C standard calls it.

    So while you may assume that array[i++] = x; gets translated to machine code as:

    • Store value of i in register A.
    • Store address of array in register B.
    • Add A and B, store results in A.
    • At this new address represented by A, store the value of x.
    • Store value of i in register A // inefficient because extra instruction here, we already did this once.
    • Increment register A.
    • Store register A in i.

    the compiler might as well produce the code more efficiently, such as:

    • Store value of i in register A.
    • Store address of array in register B.
    • Add A and B, store results in B.
    • Increment register A.
    • Store register A in i.
    • ... // rest of the code.

    Just because you as a C programmer are trained to think that the postfix ++ happens at the end, the machine code doesn't have to be ordered in that way.
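
    To make the semantic difference between array[i++] = x; and array[++i] = x; concrete, here is a minimal sketch. The helper names store_postfix and store_prefix are hypothetical and are not part of the answer; each returns the final value of i so the effect on the index is visible. It says nothing about how the machine code is ordered, only what the abstract machine guarantees.

    ```c
    /* Hypothetical helper: the store uses the old value of i; i ends up one larger. */
    int store_postfix(int *array, int i, int x)
    {
        array[i++] = x;
        return i;
    }

    /* Hypothetical helper: the store uses the incremented value of i. */
    int store_prefix(int *array, int i, int x)
    {
        array[++i] = x;
        return i;
    }
    ```

    Starting from i = 0, the postfix version writes to array[0] and the prefix version writes to array[1]; in both cases i is 1 afterwards.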

    So there is no difference between prefix and postfix ++ in C. What you as a C programmer should be wary of is people who inconsistently use prefix in some cases and postfix in others, without any rationale. This suggests that they are uncertain about how C works or that they have incorrect knowledge of the language. This is always a bad sign: it in turn suggests that they are making other questionable decisions in their program, based on superstition or "religious dogmas".

    "Prefix ++ is always faster" is indeed one such false dogma that is common among would-be C programmers.

  • 2020-11-22 10:43

    I can think of a situation where postfix is slower than prefix increment:

    Imagine a processor where register A is used as the accumulator and is the only register usable in many instructions (some small microcontrollers are actually like this).

    Now imagine the following programs and their translation into a hypothetical assembly:

    Prefix increment:

    a = ++b + c;
    
    ; increment b
    LD    A, [&b]
    INC   A
    ST    A, [&b]
    
    ; add with c
    ADD   A, [&c]
    
    ; store in a
    ST    A, [&a]
    

    Postfix increment:

    a = b++ + c;
    
    ; load b
    LD    A, [&b]
    
    ; add with c
    ADD   A, [&c]
    
    ; store in a
    ST    A, [&a]
    
    ; increment b
    LD    A, [&b]
    INC   A
    ST    A, [&b]
    

    Note how the value of b was forced to be reloaded. With prefix increment, the compiler can just increment the value and go ahead with using it, possibly avoiding a reload, since the desired value is already in the register after the increment. With postfix increment, however, the compiler has to deal with two values, the old one and the incremented one, which, as shown above, results in one more memory access.

    Of course, if the value of the increment is not used, such as a single i++; statement, the compiler can (and does) simply generate an increment instruction regardless of postfix or prefix usage.


    As a side note, I'd like to mention that an expression containing b++ cannot simply be converted to one with ++b without additional effort (for example, by adding a - 1). So comparing the two when they are part of some expression is not really valid. Often, where you use b++ inside an expression you cannot use ++b, so even if ++b were potentially more efficient, it would simply be wrong. The exception, of course, is when the expression is begging for it (for example a = b++ + 1; which can be changed to a = ++b;).
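
    A minimal sketch of that side note, using hypothetical helpers postfix_plus and prefix_plus (names invented here) that take b by pointer so that the increment is visible to the caller:

    ```c
    /* Hypothetical helper: computes b++ + c, i.e. uses the old value of *b. */
    int postfix_plus(int *b, int c)
    {
        return (*b)++ + c;
    }

    /* Hypothetical helper: computes ++b + c, i.e. uses the new value of *b. */
    int prefix_plus(int *b, int c)
    {
        return ++(*b) + c;
    }
    ```

    With b starting at 5 and c equal to 1, the postfix form yields 6 and the prefix form yields 7, so the two are not interchangeable. Only when the expression is b++ + 1 does plain ++b (that is, prefix_plus with c equal to 0) give the same result.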

  • 2020-11-22 10:43

    I have been reading through most of the answers here and many of the comments, and I didn't see any reference to the one instance that I could think of where i++ is more efficient than ++i (and perhaps surprisingly --i was more efficient than i--). That is for C compilers for the DEC PDP-11!

    The PDP-11 had addressing modes for pre-decrement and post-increment of a register, but not the other way around. These modes allowed any "general-purpose" register to be used as a stack pointer. So if you used something like *(i++) it could be compiled into a single assembly instruction, while *(++i) could not.

    This is obviously a very esoteric example, but it does provide the exception where post-increment is more efficient (or I should say was, since there isn't much demand for PDP-11 C code these days).
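
    The stack-pointer usage mentioned above can be sketched in C as follows. The helpers push and pop are hypothetical names, and the sketch assumes a stack that grows downward, as the PDP-11 hardware stack does. It only shows the pre-decrement and post-increment forms in C; whether a particular compiler turns each into a single instruction is not something the sketch demonstrates.

    ```c
    /* Hypothetical helper: pre-decrement the stack pointer, then store. */
    void push(int **sp, int value)
    {
        *--(*sp) = value;
    }

    /* Hypothetical helper: load from the stack pointer, then post-increment it. */
    int pop(int **sp)
    {
        return *(*sp)++;
    }
    ```

    Here sp would start one past the end of the storage array, so the first push writes to the last element and a matching number of pops returns sp to its starting position.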
