How to write more efficient code

雨燕双飞 提交于 2019-12-03 17:19:45

The first question is: Do you want to optimize it? Most probably, you don't want to. At least not if you "always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live." Readability, clarity of intent and maintainability always come first.

The second question is: Is it worth optimizing? In 97% it isn't, according to Donald Knuth - and you don't question Knuth, do you? Another common rule-of-thumb is the 80/20 rule, i.e. 80% of execution time is spent in 20% of the code. If you optimize at all, first profile to know where to optimize. If you guess, you're wrong. Period.

The third question is: CAN you optimize it? No you can't, at least not this easily. You think you're smarter than the hundreds of programmers who wrote your compiler over many decades? You aren't. If the actual implementation of your algorithm and data structures can be optimized, you can assume your compiler can do that on its own. The compiler can do loop unrolling, instruction reordering, combining variables with non-overlapping lifetime, struct layout optimization, and much much more - and in this era, it's even better than most assembly programmers in most cases. And even if there's a bit of potential, you better focus on implementing a better algorithm. No compiler can turn O(n^2) into O(n log n), but maybe a smart computer scientist did it and you can implement his algorithm to get much better performance than any microoptimization can yield.

Is efficiency more important that maintainability and readability? The answer is no. Even if you have a time-critical application, you will spend 90% of the time in under 10% of the code, and so only that 10% needs to be coded as efficiently as possible.

If you haven't measured and found which 10% is the culprit, you almost certainly will be optimising code which doesn't take much runtime in the first place. This is a waste of time.

You would have to measure for each platform you want to do this one.

However I don't think this would make any noticeable difference at all. (Maybe except for some embedded platforms. That's an area I don't know much about.) So first write the code in the way that's easiest to read and understand. Then measure whether your code is to slow, and use a profiler to find the exact spots where the program spends to much time. Then try to improve those, measuring after each change to see which effect it had.

Improving the speed of an easy to understand codebase is much easier than understanding a codebase that's riddled with premature and unneeded "optimization".

Measurable run-time improvements usually come from algorithmic changes, not from micro-optimizations like this one. Spend your time trying to find better algorithms.

If you really want to micro-optimize this, use the SIMD instruction capabilities of your CPU. If you are using an x86 platform, you can use MMX or SSE instructions to do vector arithmetic instead of adding each part of the coordinate individually (your compiler may not generate these without special command-line switches or inline assembly). This will probably amount to a larger speedup than switching between individual variables and an array. I say "probably" because there is no way to tell for sure without trying it both ways and measuring the execution time.

Use an array, compile with -funroll-loops. You get the benefits of both.

Compilers can "unroll" loops if they think it will help. So the compiler might quietly replace the following code:

for (i = 0; i < 3; ++i) {
    c[i] = a[i] - b[i];
}

with:

c[0] = a[0] - b[0];
c[1] = a[1] - b[1];
c[2] = a[2] - b[2];

The compiler will take its best guess as to whether it's worth doing this, taking into account the costs of the operations involved and the optimisation flags you provide.

There is no simple answer to "which one will be faster?", but if there was you can be sure that with maximum optimisation, your compiler would use it.

When in doubt, code up prototypes for each and profile them. For stuff at this level, I will predict that any differences in performance will be down in the noise. Use what makes sense and most clearly conveys the intent of the design.

In descending order of importance, code must be

  1. Correct - it doesn't matter how fast your code is if it gives you the wrong answer or does the wrong thing;
  2. Maintainable - it doesn't matter how fast your code is if you can't fix it or modify it to accomodate new requirements;
  3. Robust - it doesn't matter how fast your code is if it core dumps on the first hint of dodgy input;

Somewhere after this you can start worrying about performance.

The answer of the century is

Don't put the cart before the horse.

In other words, profile first.

Everybody "knows" this, but a large category of questions on SO are of the form "Which is faster, X or Y?"

This elicits guesses, and when you're concerned about performance, guesses aren't worth much, because if you have a performance problem, it is probably somewhere else entirely.

I usually don't worry about efficiency...

One place where it speeds things up is if I do a search for a numeric value Say I want to find an account number "188335344" it will happen far faster than searching for alpha characters. The search must switch each line of text to upper case as it searches for non numeric values. Not so for numbers.

Quite a bit faster actually.

Anything that requires user input can be extremely inefficient and it won't matter an iota.

I do display the elapsed time at the end of each search. So older code can be compared against more recent changes.

As always, you'll need to profile your code to be sure.

Having said that, I'd suggest going for an array and loops - you code should be more concise/maintainable and the compiler should be able to do a good job at optimising out / unrolling all the little constant-sized loops which is effectively what you would be doing by hand if you used x,y,z co-ordinates for every vector.

On a completely unrelated note I see you car has altitude. Is it a flying car? If so then definitely +1 for the cool application.

Go for the more correct way - which would be using loops and arrays - neither of which result in more memory usage (less usage, as the memory required by all those car1, car2, car3... instructions will be more) - and CPU usage-wise you're look at the tiniest of differences.

Please profile your code and find out what's the main problem for the inefficiency. Efficiency can be measured by runtime execution of code.

Some tools to do so are available as opensource like gprof.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!