Should we still be optimizing “in the small”?

旧时难觅i 2020-12-05 11:32

I was changing my for loop to increment using ++i instead of i++ and got to thinking, is this really necessary anymore? Surely today's compilers do this sort of optimization on their own.

22 answers
  • 2020-12-05 11:50

    Last time I tested ++it and it++ on the Microsoft C++ compiler for STL iterators, ++it emitted less code, so if you're in a massive loop you may get a small performance gain using ++it.

    For integers etc the compiler will emit identical code.
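
    To make the comparison concrete, here is a minimal sketch of the kind of loop being discussed (the container choice and function name are mine, not the answer's); with a non-trivial iterator such as std::list's, the pre-increment form avoids constructing a temporary copy of the iterator on every pass:

    #include <list>

    // Pre-increment advances the iterator in place; it++ would have to
    // construct and return a copy holding the old position.
    long sum(const std::list<int>& xs)
    {
        long total = 0;
        for (std::list<int>::const_iterator it = xs.begin(); it != xs.end(); ++it)
            total += *it;
        return total;
    }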

  • 2020-12-05 11:51

    Sure, if and only if it results in an actual improvement for that particular program which is significant enough to be worth the coding time, any readability decrease, etc. I don't think you can make a rule for this across all programs, or really for any optimization. It's entirely dependent on what actually matters in the particular case.

    With something like ++i, the cost in time and readability is so minor that it may well be worth making a habit of, if it actually results in an improvement.

  • 2020-12-05 11:52

    There are three quotes that I believe every developer should know with regard to optimization - I first read them in Josh Bloch's "Effective Java" book:

    More computing sins are committed in the name of efficiency (without necessarily achieving it) than for any other single reason - including blind stupidity.

    (William A. Wulf)

    We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.

    (Donald E. Knuth)

    We follow two rules in the matter of optimization:

    Rule 1: Don't do it.

    Rule 2: (for experts only). Don't do it yet - that is, not until you have a perfectly clear and unoptimized solution.

    (M. A. Jackson)

    All these quotes are (AFAIK) at least 20-30 years old, from a time when CPU and memory mattered far more than they do today. I believe the right way to develop software is first to have a working solution, and then to use a profiler to find where the performance bottlenecks are. A friend once told me about an application that was written in C++ and Delphi and had performance issues. Using a profiler, they found out that the application spent a considerable amount of time converting strings from Delphi's representation to the C++ one and vice versa - no micro-optimization could have caught that...

    To conclude, don't think that you know where the performance issues will be. Use a profiler for this.

  • 2020-12-05 11:54

    First of all - always run profiling to check.

    Firstly, are you optimizing the right part of the code? If that code accounts for 1% of the total run time - forget it. Even if you speed it up by 50%, you gain a whole 0.5% of overall speedup. Unless you are doing something strange, the gain will be much smaller than that (especially if you used a good optimizing compiler). Secondly, are you optimizing it the right way? Which code would run faster on x86?

    inc eax
    

    or

    add eax, 1
    

    Well. As far as I know, on earlier processors the first one, but on the P4 the second one (it is irrelevant here whether those specific instructions run faster or slower; the point is that it changes all the time). The compiler may be up to date with such changes - you will not be.

    In my opinion the primary target is the optimization that the compiler cannot perform - as mentioned earlier, data size (you may think it doesn't matter on today's 2 GiB machines, but if your data is bigger than the processor cache, it will run much slower).
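
    A rough illustration of that data-size point (the struct names are mine, and the exact sizes depend on the ABI): C++ compilers are not allowed to reorder members for you, so shrinking a structure is exactly the kind of change only the programmer can make.

    #include <cstdio>

    struct Padded {
        char   flag;    // 1 byte, followed by 7 bytes of padding
        double value;   // 8 bytes
        char   tag;     // 1 byte, followed by 7 bytes of tail padding
    };                  // typically sizeof(Padded) == 24 on a 64-bit ABI

    struct Compact {
        double value;   // 8 bytes
        char   flag;    // 1 byte
        char   tag;     // 1 byte, followed by 6 bytes of tail padding
    };                  // typically sizeof(Compact) == 16

    int main()
    {
        std::printf("%zu vs %zu bytes per element\n", sizeof(Padded), sizeof(Compact));
    }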

    In general - do it only if you must and/or you know what you are doing. It requires an amount of knowledge about the code, the compiler and the low-level computer architecture that is not mentioned in the question (and which, to be honest, I do not possess). And it will likely gain nothing. If you want to optimize - do it at a higher level.

  • 2020-12-05 11:59

    All of the optimizations you listed are practically irrelevant these days for C programmers -- the compiler is much, much better at performing things like inlining, loop unrolling, loop jamming, loop inversion, and strength reduction.
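
    As a small illustration of one of those transformations (the function below is my own example, not from the answer): strength reduction lets the compiler turn the per-iteration multiply into a shift or a running addition, so hand-coding that buys nothing.

    // The compiler can strength-reduce i * 8 to a shift, or to an
    // accumulator that simply grows by 8 on each iteration.
    void fill_scaled(int* out, int n)
    {
        for (int i = 0; i < n; ++i)
            out[i] = i * 8;
    }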

    Regarding ++i versus i++: for integers, they generate identical machine code, so which one you use is a matter of style/preference. In C++, objects can overload those pre- and postincrement operators, in which case it's usually preferable to use a preincrement, because a postincrement necessitates an extra object copy.
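
    A minimal sketch of where that extra copy comes from (the class is mine, but the two operator signatures follow the conventional pattern):

    struct Counter {
        int value;

        Counter& operator++()       // pre-increment: modify in place, return *this
        {
            ++value;
            return *this;
        }

        Counter operator++(int)     // post-increment: must return the old state
        {
            Counter old = *this;    // the extra object copy mentioned above
            ++value;
            return old;
        }
    };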

    As for using shifts instead of multiplications by powers of 2, again, the compiler already does that for you. Depending on the architecture, it can do even more clever things, such as turning a multiplication by 5 into a single lea instruction on x86. However, with divisions and moduli by powers of 2, you might need to pay a little more attention to get the optimal code. Suppose you write:

    x = y / 2;
    

    If x and y are signed integers, the compiler can't turn that into a right shift because it would yield an erroneous result for negative numbers. So it emits a right shift plus some bit-twiddling instructions to make sure the result is correct for both positive and negative numbers. If you know x and y are always positive, then you should help the compiler out and make them unsigned integers instead. Then the compiler can optimize it into a single right-shift instruction.

    The modulus operator % works similarly -- if you're modding by a power of 2, with signed integers the compiler has to emit an and instruction plus a little more bit twiddling to make the result correct for positive and negative numbers, but it can emit a single and instruction if dealing with unsigned numbers.
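
    A small sketch of the unsigned versions being recommended (the function and variable names are mine); with unsigned operands the compiler is free to use a plain shift and mask, with no fix-up code for negative values:

    unsigned int half(unsigned int y) { return y / 2; }   // typically a single shift
    unsigned int low3(unsigned int y) { return y % 8; }   // typically a single and/mask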

  • 2020-12-05 11:59

    One also needs to be careful that switching between pre- and post-increment/decrement operators doesn't introduce an undesirable side effect. For example, if you're iterating over a loop 5 times simply to run a set of code multiple times, without any interest in the loop index value, you're probably okay (YMMV). On the other hand, if you do access the loop index value, then the result may not be what you expect:

    #include <iostream>
    
    int main()
    {
      for (unsigned int i = 5; i != 0; i--)
        std::cout << i << std::endl;
    
      for (unsigned int i = 5; i != 0; --i)
        std::cout << "\t" << i << std::endl;
    
      for (unsigned int i = 5; i-- != 0; )
        std::cout << i << std::endl;
    
      for (unsigned int i = 5; --i != 0; )
        std::cout << "\t" << i << std::endl;
    }
    

    results in the following:

    5
    4
    3
    2
    1
            5
            4
            3
            2
            1
    4
    3
    2
    1
    0
            4
            3
            2
            1
    

    The first two cases show no difference, but notice that attempting to "optimize" the third case by switching to a pre-decrement operator (giving the fourth case) would result in an iteration being completely lost. Admittedly this is a bit of a contrived case, but I have seen this sort of loop (the third case) when going through an array in reverse order, i.e. from end to start.
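
    For reference, a sketch of that reverse-order traversal with an unsigned index (the function is mine): the post-decrement in the condition tests the old value and then decrements, so the body sees indices n-1 down to 0 and the loop still terminates even though the counter is unsigned.

    #include <cstddef>

    // Clears a[n-1], a[n-2], ..., a[0].
    void clear_in_reverse(int* a, std::size_t n)
    {
        for (std::size_t i = n; i-- != 0; )
            a[i] = 0;
    }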
