Understanding restrict qualifier by examples

北恋 2021-01-01 17:16

The restrict keyword's behavior is defined in C99 by 6.7.3.1:

Let D be a declaration of an ordinary identifier that provides a means o

1 answer
  • 2021-01-01 17:26

    Below, I will refer to the use cases from the Sun paper linked to in the question.

    The (relatively) obvious case is mem_copy(), which falls under the 2nd use case category in the Sun paper (the f1() function). Let's say we have the following two declarations:

    #include <stddef.h>  // size_t

    void mem_copy_1(void * restrict s1, const void * restrict s2, size_t n);
    void mem_copy_2(void *          s1, const void *          s2, size_t n);
    

    Because restrict promises that the two arrays pointed to by s1 and s2 do not overlap, the code for the 1st function is straightforward:

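    void mem_copy_1(void * restrict s1, const void * restrict s2, size_t n)
    {
         // void pointers cannot be indexed, so copy through byte pointers
         unsigned char * restrict dst = s1;
         const unsigned char * restrict src = s2;
         // naively copy array s2 to array s1
         for (size_t i = 0; i < n; i++)
             dst[i] = src[i];
    }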
    

    s2 = '....................1234567890abcde' <- s2 before the naive copy
    s1 = '1234567890abcde....................' <- s1 after the naive copy
    s2 = '....................1234567890abcde' <- s2 after the naive copy

    OTOH, in the 2nd function, there may be an overlap. In this case, we need to check whether the source array is located before the destination or vice versa, and choose the loop direction accordingly.

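    For example, say s1 points to address 100 and s2 to address 105. Then, if n=15, the destination occupies 100..114 and the source 105..119, so the copied s1 array overwrites the first 10 bytes of the source s2 array. We need to make sure we copy the lower bytes first, so that every source byte is read before it is overwritten.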

    s2 = '.....1234567890abcde' <- s2 before the naive copy
    s1 = '1234567890abcde.....' <- s1 after the naive copy
    s2 = '.....67890abcdeabcde' <- s2 after the naive copy

    However, if s1 = 105 and s2 = 100, then writing the lower bytes first overwrites the last 10 bytes of the source s2 before they have been read, and we end up with an erroneous copy.

    s2 = '1234567890abcde.....' <- s2 before the naive copy
    s1 = '.....123451234512345' <- s1 after the naive copy - not what we wanted
    s2 = '123451234512345.....' <- s2 after the naive copy
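
    A minimal sketch that reproduces the two diagrams above. forward_copy and overlap_demo are hypothetical helper names, not taken from the Sun paper. The pointers are deliberately not restrict-qualified, so running the loop on overlapping memory is well defined and simply shows the byte-by-byte result.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies lower bytes first; no restrict, so overlap is allowed. */
static void forward_copy(char *s1, const char *s2, size_t n)
{
    for (size_t i = 0; i < n; i++)
        s1[i] = s2[i];
}

/* Runs the 20-byte layouts from the diagrams; out must hold at least 21 bytes.
   dest_is_lower != 0: s1 = buf,     s2 = buf + 5  (the "s1 = 100, s2 = 105" case)
   dest_is_lower == 0: s1 = buf + 5, s2 = buf      (the "s1 = 105, s2 = 100" case) */
static void overlap_demo(int dest_is_lower, char *out)
{
    char buf[21];
    assert(out != NULL);
    if (dest_is_lower) {
        strcpy(buf, ".....1234567890abcde");
        forward_copy(buf, buf + 5, 15);
    } else {
        strcpy(buf, "1234567890abcde.....");
        forward_copy(buf + 5, buf, 15);
    }
    strcpy(out, buf);
}
```

    With dest_is_lower set, out ends up as 1234567890abcdeabcde (the copy is intact); otherwise it ends up as 12345123451234512345, the corrupted result from the diagram.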

    In this case, we need to copy the last bytes of the array first, possibly stepping backwards. The code will look something like:

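    #include <stddef.h>  // size_t
    #include <stdint.h>  // uintptr_t

    void mem_copy_2(void *s1, const void *s2, size_t n)
    {
        unsigned char *dst = s1;
        const unsigned char *src = s2;
        // compare the addresses as integers; unsigned may be too narrow for a pointer
        if ((uintptr_t) dst < (uintptr_t) src)
            for (size_t i = 0; i < n; i++)   // destination is lower: copy lower bytes first
                dst[i] = src[i];
        else
            for (size_t i = n; i-- > 0; )    // destination is higher: copy upper bytes first
                dst[i] = src[i];
    }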
    

    It is easy to see how the restrict qualifier gives a chance for better speed optimization: the extra code and the if-else decision disappear.
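
    The standard library draws the same line: since C99, memcpy() is declared with restrict-qualified pointers, while memmove() is not and has to cope with overlap. Below is a small sketch of the overlapping case using memmove(); shift_right_5 is a hypothetical name chosen for illustration.

```c
#include <assert.h>
#include <string.h>

/* C99 prototypes, for comparison:
     void *memcpy (void * restrict s1, const void * restrict s2, size_t n);
     void *memmove(void *s1, const void *s2, size_t n);                     */

/* Hypothetical helper: moves the first n bytes of buf five bytes to the right.
   Source and destination overlap whenever n > 5, so memmove() is the one to call. */
static void shift_right_5(char *buf, size_t n)
{
    assert(buf != NULL);
    memmove(buf + 5, buf, n);
}
```

    Applied to the buffer 1234567890abcde..... with n = 15, this leaves 123451234567890abcde, i.e. the intact copy that the naive forward loop fails to produce.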

    At the same time, this situation is hazardous to the incautious programmer who passes overlapping arrays to the restrict-ed function. No guards are there to ensure the array is copied properly, and the behavior is undefined: what actually happens depends on the optimization path chosen by the compiler.
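
    One way to catch such calls during development is an explicit overlap check in front of the restrict-ed function. The sketch below is only an illustration: ranges_overlap and checked_mem_copy are hypothetical names, and comparing addresses through uintptr_t assumes a flat address space, which the C standard does not guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if [a, a+n) and [b, b+n) share at least one byte.
   Assumes uintptr_t values are ordered the same way as the addresses (flat memory). */
static int ranges_overlap(const void *a, const void *b, size_t n)
{
    uintptr_t pa = (uintptr_t) a;
    uintptr_t pb = (uintptr_t) b;
    return (pa < pb) ? (pb - pa < n) : (pa - pb < n);
}

/* Hypothetical wrapper: rejects overlapping arguments in debug builds,
   then hands off to memcpy(), whose parameters are restrict-qualified. */
static void checked_mem_copy(void * restrict s1, const void * restrict s2, size_t n)
{
    assert(!ranges_overlap(s1, s2, n));
    memcpy(s1, s2, n);
}
```

    The check costs a comparison per call and disappears when NDEBUG is defined, so the release build keeps the benefit of the restrict contract.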


    The 1st usecase (the init() function) can be seen as a variation on the 2nd one, described above. Here, two arrays are created with a single dynamic memory allocation call.

    Designating the two pointers as restrict-ed enables optimizations that would otherwise be ruled out because the order of the instructions matters. For example, if we have the code:

    a1[5] = 4;
    a2[3] = 8;
    

    then the optimizer can reorder these statements if it finds it useful.

    OTOH, if the pointers are not restrict-ed, then it is important that the 1st assignment is performed before the second one. This is because there is a possibility that a1[5] and a2[3] are actually the same memory location! It is easy to see that when this is the case, the end value there should be 8. If we reorder the instructions, then the end value will be 4!
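
    To make the aliasing concrete, here is a minimal sketch without restrict. store_two and aliased_result are hypothetical names; passing arr and arr + 2 makes a1[5] and a2[3] the same element, so the compiler has to keep the stores in program order.

```c
#include <assert.h>

/* Hypothetical helper: the two stores from the text, through unqualified pointers. */
static void store_two(int *a1, int *a2)
{
    assert(a1 && a2);
    a1[5] = 4;
    a2[3] = 8;   /* must stay second: a2[3] may be the same object as a1[5] */
}

/* a1 = arr and a2 = arr + 2 alias: a1[5] and a2[3] are both arr[5]. */
static int aliased_result(void)
{
    int arr[8] = {0};
    store_two(arr, arr + 2);
    return arr[5];
}
```

    aliased_result() returns 8, the value of the later store. Had both parameters been restrict-qualified, this call would break the contract.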

    Again, if pointers that are not disjoint are passed to code that assumes restrict, the behavior is undefined.
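
    A sketch of the disjoint case: one allocation split into two halves, each handed to a restrict-qualified parameter. This is an illustration in the spirit of the init() use case, not the Sun paper's code; fill_pair and make_pair are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: a1 and a2 are promised not to overlap,
   so the compiler may reorder or vectorize the two stores freely. */
static void fill_pair(float * restrict a1, float * restrict a2, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        a1[i] = 4.0f;
        a2[i] = 8.0f;
    }
}

/* Allocates 2*n floats with one malloc() and fills the halves [0, n) and [n, 2n).
   Returns NULL on allocation failure; the caller frees the block. */
static float *make_pair(size_t n)
{
    assert(n > 0);
    float *block = malloc(2 * n * sizeof *block);
    if (block != NULL)
        fill_pair(block, block + n, n);
    return block;
}
```

    The two halves come from the same malloc() call but never share an element, so the restrict contract holds inside fill_pair().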
