Coding Practices which enable the compiler/optimizer to make a faster program

一个人的身影 2020-12-02 03:24

Many years ago, C compilers were not particularly smart. As a workaround, K&R invented the register keyword to hint to the compiler that maybe it woul

30 answers
  • 2020-12-02 03:58

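    Use const correctness as much as possible in your code. It can help the compiler optimize, most reliably for objects that are themselves declared const; a const pointer or reference alone promises little, because the object may still be modified through another path.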

    This document has loads of other optimization tips: CPP optimizations (a somewhat old document, though).

    Highlights:

    • use constructor initialization lists
    • use prefix operators
    • use explicit constructors
    • inline functions
    • avoid temporary objects
    • be aware of the cost of virtual functions
    • return objects via reference parameters
    • consider per class allocation
    • consider stl container allocators
    • the 'empty member' optimization
    • etc
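
    The first highlight, constructor initialization lists, can be sketched roughly as follows. The types Tracked, WithInitList and WithAssignment and the global counters are made up for illustration; they only count how the member gets built in each style.

```cpp
#include <string>

// Hypothetical counters, used only to make the extra work visible.
int g_default_ctors = 0;
int g_value_ctors = 0;
int g_assignments = 0;

// Hypothetical member type that records how it is constructed and assigned.
struct Tracked {
    std::string text;

    Tracked() { ++g_default_ctors; }
    explicit Tracked(const std::string& s) : text(s) { ++g_value_ctors; }
    Tracked& operator=(const Tracked& other) {
        text = other.text;
        ++g_assignments;
        return *this;
    }
};

// Initialization list: the member is constructed once, directly from s.
struct WithInitList {
    Tracked member;
    explicit WithInitList(const std::string& s) : member(s) {}
};

// Assignment in the body: the member is default-constructed first,
// then a temporary is built and assigned over it.
struct WithAssignment {
    Tracked member;
    explicit WithAssignment(const std::string& s) { member = Tracked(s); }
};
```

    For a trivial member type an optimizer may well erase the difference; the gap matters when default construction or assignment does real work, such as allocating.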
  • 2020-12-02 03:59

    Don't do the same work over and over again!

    A common antipattern that I see goes along these lines:

    void Function()
    {
       MySingleton::GetInstance()->GetAggregatedObject()->DoSomething();
       MySingleton::GetInstance()->GetAggregatedObject()->DoSomethingElse();
       MySingleton::GetInstance()->GetAggregatedObject()->DoSomethingCool();
       MySingleton::GetInstance()->GetAggregatedObject()->DoSomethingReallyNeat();
       MySingleton::GetInstance()->GetAggregatedObject()->DoSomethingYetAgain();
    }
    

    Unless the compiler can see into the getters and prove they have no side effects, it actually has to make every one of those calls every time. Assuming you, the programmer, know that the aggregated object isn't changing over the course of these calls, for the love of all that is holy...

    void Function()
    {
       MySingleton* s = MySingleton::GetInstance();
       AggregatedObject* ao = s->GetAggregatedObject();
       ao->DoSomething();
       ao->DoSomethingElse();
       ao->DoSomethingCool();
       ao->DoSomethingReallyNeat();
       ao->DoSomethingYetAgain();
    }
    

    In the case of the singleton getter the calls may not be too costly, but there is certainly a cost (typically, "check to see if the object has been created; if it hasn't, create it; then return it"). The more complicated this chain of getters becomes, the more time we waste.

  • 2020-12-02 04:01

    On most modern processors, the biggest bottleneck is memory.

    Aliasing: Load-Hit-Store can be devastating in a tight loop. If you're reading one memory location and writing to another and know that they are disjoint, carefully putting a no-alias qualifier (restrict in C99, __restrict or __restrict__ as an extension in most C++ compilers) on the function parameters can really help the compiler generate faster code. However, if the memory regions do overlap and you used the qualifier anyway, you're in for a good debugging session of undefined behavior!
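
    A minimal sketch of the aliasing hint, assuming a compiler that accepts the non-standard __restrict spelling (GCC, Clang and MSVC do). The function name add_scaled is made up for illustration.

```cpp
#include <cstddef>

// The caller promises that dst and src never overlap. With that promise the
// compiler does not have to reload src[i] after each store to dst[i], and it
// is free to vectorize the loop.
// Assumption: __restrict is a compiler extension, not standard C++.
void add_scaled(float* __restrict dst, const float* __restrict src,
                float factor, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        dst[i] += factor * src[i];
    }
}
```

    Calling add_scaled(buffer, buffer + 1, ...) would break the promise and is undefined behavior, which is exactly the debugging session warned about above.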

    Cache misses: I'm not really sure how you can help the compiler here, since it's mostly algorithmic, but there are intrinsics to prefetch memory.
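
    As one possible illustration of a prefetch intrinsic, here is a sketch using __builtin_prefetch, which is specific to GCC and Clang. The function sum_indirect and the lookahead distance of 8 are assumptions chosen for the example, not tuned values; whether prefetching helps at all has to be measured.

```cpp
#include <cstddef>

// Sums table[indices[i]]. Because the access pattern is indirect, the hardware
// prefetcher has little to go on, so we request the element that will be
// needed a few iterations from now.
// Assumptions: GCC or Clang; kLookahead is an arbitrary illustrative distance.
long sum_indirect(const long* table, const std::size_t* indices,
                  std::size_t count) {
    constexpr std::size_t kLookahead = 8;
    long total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        if (i + kLookahead < count) {
            __builtin_prefetch(&table[indices[i + kLookahead]]);
        }
        total += table[indices[i]];
    }
    return total;
}
```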

    Also, avoid converting floating-point values to int and back too often. On some architectures the two types live in different register sets, so a conversion means executing the actual conversion instruction, writing the value to memory, and reading it back into the proper register set.

  • 2020-12-02 04:03

    One thing I've done is try to keep expensive actions to places where the user might expect the program to delay a bit. Overall performance is related to responsiveness, but isn't quite the same, and for many things responsiveness is the more important part of performance.

    The last time I really had to do improvements in overall performance, I kept an eye out for suboptimal algorithms, and looked for places that were likely to have cache problems. I profiled and measured performance first, and again after each change. Then the company collapsed, but it was interesting and instructive work anyway.

  • 2020-12-02 04:04

    The optimizer isn't really in control of the performance of your program; you are. Use appropriate algorithms and structures and profile, profile, profile.

    That said, avoid calling a small function from an inner loop when it is defined in a different source file; unless link-time optimization is enabled, the compiler can't see the body and so can't inline it.
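
    One common way around this is to put the small function's definition in a header, roughly as sketched below. The file names and the functions squared and sum_of_squares are hypothetical; both "files" are shown in one block for brevity.

```cpp
// vec_math.h (hypothetical header): the definition is visible in every file
// that includes it, so the compiler is able to inline the call.
inline float squared(float x) { return x * x; }

// user.cpp (hypothetical caller): the inner loop calls the small function,
// and because its body is visible here the call can be inlined away.
float sum_of_squares(const float* values, int count) {
    float total = 0.0f;
    for (int i = 0; i < count; ++i) {
        total += squared(values[i]);
    }
    return total;
}
```

    The alternative is to keep the definition in its own source file and build with link-time optimization (for example -flto on GCC and Clang).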

    Avoid taking the address of a variable if possible. Asking for a pointer isn't "free", as it means the variable needs to be kept in memory. Even an array can be kept in registers if you avoid pointers; this is essential for vectorizing.
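
    A rough sketch of the difference, using two hypothetical functions. In the first, the running total lives behind a pointer that might alias the input, so the compiler generally has to keep it in memory; in the second, it is a plain local that can stay in a register.

```cpp
// Accumulates through a pointer. Since out might point into data, the
// compiler generally has to store to *out on every iteration.
void sum_via_pointer(const int* data, int count, int* out) {
    *out = 0;
    for (int i = 0; i < count; ++i) {
        *out += data[i];
    }
}

// Accumulates in a local whose address is never taken, so it can live in a
// register for the whole loop and is returned once at the end.
int sum_via_local(const int* data, int count) {
    int total = 0;
    for (int i = 0; i < count; ++i) {
        total += data[i];
    }
    return total;
}
```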

    Which leads to the next point: read the ^#$@ manual! GCC can vectorize plain C code if you sprinkle a __restrict__ here and an __attribute__((__aligned__(N))) there. If you want something very specific from the optimizer, you might have to be specific.
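
    What that sprinkling might look like is sketched below, assuming GCC or Clang. The buffer names, the size of 1024 and the 32-byte alignment are illustrative choices, and whether the loop is actually vectorized depends on the compiler version and flags (for example -O3).

```cpp
#include <cstddef>

constexpr std::size_t kCount = 1024;  // hypothetical buffer size

// Hypothetical buffers, aligned to 32 bytes with the GCC/Clang attribute.
float input_a[kCount] __attribute__((__aligned__(32)));
float input_b[kCount] __attribute__((__aligned__(32)));
float output[kCount] __attribute__((__aligned__(32)));

// __restrict__ promises the three arrays do not overlap, and
// __builtin_assume_aligned passes the alignment promise into the function.
// Both are GCC/Clang extensions; the caller must honor both promises.
void multiply(float* __restrict__ out, const float* __restrict__ a,
              const float* __restrict__ b, std::size_t count) {
    float* o = static_cast<float*>(__builtin_assume_aligned(out, 32));
    const float* x = static_cast<const float*>(__builtin_assume_aligned(a, 32));
    const float* y = static_cast<const float*>(__builtin_assume_aligned(b, 32));
    for (std::size_t i = 0; i < count; ++i) {
        o[i] = x[i] * y[i];
    }
}
```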

  • 2020-12-02 04:04

    When DEC came out with its Alpha processors, there was a recommendation to keep the number of arguments to a function under 7, as the compiler would always try to pass up to 6 arguments in registers.
