How to allocate arrays on the stack for performance gains?

南笙 2021-01-29 15:06

Some of the fastest implementations of functions such as popcount and count consecutive zeros use table lookups to produce the final answer.

In C and C+

1 answer
  •  囚心锁ツ
    2021-01-29 15:38

    I have a small lookup table that I'd like to be able to access as quickly as possible and thus would prefer to allocate it on the stack rather than heap.

    That statement is confusing. Putting something on the stack means it has to be reinitialized every time you enter the function in which it's declared. The usual "optimization" is to instead store such data in a persistent location, such as a static variable.

    For example, here's a sample popcount() implementation from the Hamming weight Wikipedia article:

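    #include <stdint.h>

    /* Static storage: the table is set up once and persists for the whole program. */
    static uint8_t wordbits[65536] = { /* bitcounts of integers 0 through 65535, inclusive */ };
    /* Add the bit counts of the low and high 16-bit halves of i. */
    static int popcount(uint32_t i)
    {
        return wordbits[i & 0xFFFF] + wordbits[i >> 16];
    }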
    

    Note that the wordbits array is declared outside of any function, as a static variable.

    A similar declaration in C# would be something like this:

    static readonly byte[] wordbits = { /* bitcounts of integers 0 through 65535, inclusive */  };
    static int popcount(uint i)
    {
        return (wordbits[i & 0xFFFF] + wordbits[i >> 16]);
    }
    

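    Note the use of C#'s readonly keyword: the wordbits field is assigned once, when it is initialized, and cannot be reassigned afterward. (The elements of the array remain writable; readonly protects only the reference.)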

    (In real code, the comment inside each array initializer is replaced by the actual values. Alternatively, the values can be computed once at runtime and stored in the array.)
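
    As a rough illustration of the compute-once-at-runtime option, here is a minimal C sketch. The names init_wordbits and wordbits_ready are made up for this example, and the recurrence used to fill the table is just one common choice, not necessarily what the Wikipedia article does.

```c
#include <assert.h>
#include <stdint.h>

/* Static storage: zero-initialized at program start, filled once by init_wordbits(). */
static uint8_t wordbits[65536];
static int wordbits_ready = 0; /* hypothetical flag, for illustration only */

/* Fill the table using bitcount(i) = bitcount(i >> 1) + (lowest bit of i). */
static void init_wordbits(void)
{
    for (uint32_t i = 1; i < 65536; i++)
        wordbits[i] = (uint8_t)(wordbits[i >> 1] + (i & 1));
    wordbits_ready = 1;
}

/* Add the bit counts of the low and high 16-bit halves of i. */
static int popcount(uint32_t i)
{
    assert(wordbits_ready); /* init_wordbits() must have been called first */
    return wordbits[i & 0xFFFF] + wordbits[i >> 16];
}
```

    The caller runs init_wordbits() once (for example at the start of main) before the first call to popcount(). The table still lives in static storage, so the cost of filling it is paid a single time.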

    From your question, it seems you may be conflating the stack, the heap, and the data segment (i.e. a special range of memory loaded straight from the executable image). For performance, stack allocation is useful when you're dealing with fixed-size objects that are allocated frequently and you want to avoid the cost of allocating them through the memory manager.
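
    To make the case where stack allocation does help more concrete, here is a small C sketch with a fixed-size scratch array. Both functions and their names are hypothetical; they only contrast a stack buffer with one obtained from the allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stack scratch: the 256-byte array is part of the function's frame,
   so no call into the memory manager is needed. */
static int distinct_bytes_stack(const unsigned char *data, size_t len)
{
    unsigned char seen[256] = {0};
    int count = 0;
    assert(data != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        if (!seen[data[i]]) {
            seen[data[i]] = 1;
            count++;
        }
    }
    return count;
}

/* Heap scratch: same logic, but every call goes through calloc() and free(). */
static int distinct_bytes_heap(const unsigned char *data, size_t len)
{
    unsigned char *seen;
    int count = 0;
    assert(data != NULL || len == 0);
    seen = calloc(256, 1);
    if (seen == NULL)
        return -1; /* allocation failed */
    for (size_t i = 0; i < len; i++) {
        if (!seen[data[i]]) {
            seen[data[i]] = 1;
            count++;
        }
    }
    free(seen);
    return count;
}
```

    Both versions still have to zero the scratch array on every call. The stack version only avoids the allocator round trip, which is the saving described above.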

    But allocating on the stack doesn't offer any performance benefit in terms of actually accessing the data, and definitely also does not offer any performance benefit in terms of initializing the data. Indeed, on the latter count it would cost you more because you'd have to initialize it each time you enter the function.
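
    The difference in initialization can be sketched in C with a tiny 16-entry table. The function names below are invented for this example. Note that an optimizing compiler may turn the stack version into a read-only static table on its own, so any measured difference depends on the compiler and its settings.

```c
#include <assert.h>
#include <stdint.h>

/* Automatic (stack) array: as written, its 16 entries are set up on every call. */
static int popcount8_stack(uint8_t x)
{
    uint8_t nibble_bits[16] = {0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4};
    return nibble_bits[x & 0x0F] + nibble_bits[x >> 4];
}

/* Static array: initialized once, before the function is ever called. */
static int popcount8_static(uint8_t x)
{
    static const uint8_t nibble_bits[16] = {0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4};
    return nibble_bits[x & 0x0F] + nibble_bits[x >> 4];
}

/* Sanity check: both versions agree on every 8-bit input. */
static void check_popcount8(void)
{
    for (int x = 0; x < 256; x++)
        assert(popcount8_stack((uint8_t)x) == popcount8_static((uint8_t)x));
}
```

    Accessing an element costs the same in both functions; only the setup work differs.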

    I believe the above adequately addresses your concern. If not, please review what you're actually trying to do, and edit your question to make it clearer. See the "How do I ask a good question" help page for advice on presenting your question in a clear, answerable way.
