Quickly find whether a value is present in a C array?

灰色年华 2021-01-29 17:30

I have an embedded application with a time-critical ISR that needs to iterate through an array of size 256 (preferably 1024, but 256 is the minimum) and check if a value matches

15 Answers
  •  梦毁少年i
    2021-01-29 18:13

    Keep the table in sorted order, and use Bentley's unrolled binary search:

    i = 0;
    if (key >= a[i+512]) i += 512;
    if (key >= a[i+256]) i += 256;
    if (key >= a[i+128]) i += 128;
    if (key >= a[i+ 64]) i +=  64;
    if (key >= a[i+ 32]) i +=  32;
    if (key >= a[i+ 16]) i +=  16;
    if (key >= a[i+  8]) i +=   8;
    if (key >= a[i+  4]) i +=   4;
    if (key >= a[i+  2]) i +=   2;
    if (key >= a[i+  1]) i +=   1;
    return (key == a[i]);
    

    The point is,

    • if you know how big the table is, then you know how many iterations there will be, so you can fully unroll it.
    • Then, there's no point testing for the == case on each iteration because, except on the last iteration, the probability of that case is too low to justify spending time testing for it.**
    • Finally, by padding the table up to the next power of 2, you add at most one comparison and at most a factor of two in storage.
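
    To make the three points above concrete, here is a minimal sketch of the same search wrapped in a function, with a helper that pads a shorter sorted list up to 1024 entries. The names table_fill, table_contains and TABLE_SIZE are made up for illustration, and uint32_t is an assumed element type. Padding with copies of the largest value is one possible choice: it keeps the table sorted and cannot introduce a value that was not already present.

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    #define TABLE_SIZE 1024u /* assumed capacity; must match the unrolled steps below */

    /* Hypothetical helper: copy n sorted values into the table and pad the tail
       with copies of the largest value, so the table stays sorted and the
       padding never adds a value that was not already there. */
    void table_fill(uint32_t table[TABLE_SIZE], const uint32_t *sorted, size_t n)
    {
        assert(n >= 1 && n <= TABLE_SIZE);
        for (size_t k = 0; k < TABLE_SIZE; ++k)
            table[k] = sorted[k < n ? k : n - 1];
    }

    /* Unrolled binary search over exactly TABLE_SIZE sorted entries.
       The ten >= tests move i to the last index whose entry is <= key
       (or leave it at 0), and a single == test at the end decides the result. */
    int table_contains(const uint32_t a[TABLE_SIZE], uint32_t key)
    {
        size_t i = 0;
        if (key >= a[i + 512]) i += 512;
        if (key >= a[i + 256]) i += 256;
        if (key >= a[i + 128]) i += 128;
        if (key >= a[i +  64]) i +=  64;
        if (key >= a[i +  32]) i +=  32;
        if (key >= a[i +  16]) i +=  16;
        if (key >= a[i +   8]) i +=   8;
        if (key >= a[i +   4]) i +=   4;
        if (key >= a[i +   2]) i +=   2;
        if (key >= a[i +   1]) i +=   1;
        return key == a[i];
    }
    ```

    In an ISR you would presumably fill the table once outside the interrupt path and only call table_contains(table, key) inside it. For a 256-entry table the same pattern should apply with the first step at 128 and eight >= tests instead of ten.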

    ** If you're not used to thinking in terms of probabilities, every decision point has an entropy, which is the average information you learn by executing it. For the >= tests, the probability of each branch is about 0.5, and -log2(0.5) is 1, so that means if you take one branch you learn 1 bit, and if you take the other branch you learn one bit, and the average is just the sum of what you learn on each branch times the probability of that branch. So 1*0.5 + 1*0.5 = 1, so the entropy of the >= test is 1. Since you have 10 bits to learn, it takes 10 branches. That's why it's fast!

    On the other hand, what if your first test is if (key == a[i+512])? The probability of being true is 1/1024, while the probability of false is 1023/1024. So if it's true you learn all 10 bits! But if it's false you learn -log2(1023/1024) = .00141 bits, practically nothing! So the average amount you learn from that test is 10/1024 + .00141*1023/1024 = .0098 + .00141 = .0112 bits. About one hundredth of a bit. That test is not carrying its weight!
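
    For reference, both numbers are instances of the binary entropy of a test that comes out true with probability p. The symbol H is introduced here only for illustration, and the second line assumes, as above, that the key is equally likely to match any of the 1024 entries:

    ```latex
    H(p) = -p \log_2 p - (1 - p) \log_2 (1 - p)

    H\!\left(\tfrac{1}{2}\right) = \tfrac{1}{2} \cdot 1 + \tfrac{1}{2} \cdot 1 = 1 \text{ bit}

    H\!\left(\tfrac{1}{1024}\right)
      = \tfrac{1}{1024} \cdot 10 + \tfrac{1023}{1024} \log_2 \tfrac{1024}{1023}
      \approx 0.0098 + 0.0014
      \approx 0.0112 \text{ bits}
    ```

    H(p) peaks at p = 1/2, which is why a test that splits the remaining candidates evenly is the most informative single comparison you can make.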
