malloc-free-malloc and strict-aliasing

渐次进展 2021-01-13 03:06

I've been trying to understand a particular aspect of strict aliasing recently, and I think I have made the smallest possible interesting piece of code. (Interesting for me

4 Answers
  • 2021-01-13 03:48

    Your code is correct C and does not invoke undefined behaviour (apart from not checking the return value of malloc), because:

    • you allocate a block of memory, use it and free it;
    • you allocate another block of memory, use it and free it.
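
    The two steps above can be sketched as follows, this time with the malloc results checked. This is only an illustration: the function name alloc_twice is hypothetical and is not taken from the question.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate, use and free a block twice, with two
   different types. Returns 0 on success, -1 if an allocation fails. */
int alloc_twice(void)
{
    uint32_t *p32 = malloc(sizeof *p32);
    if (p32 == NULL)
        return -1;
    *p32 = 0;                 /* first object, used as a uint32_t */
    free(p32);                /* its lifetime ends here */

    uint16_t *p16 = malloc(2 * sizeof *p16);
    if (p16 == NULL)
        return -1;
    p16[0] = 7;               /* a new object, whatever its address is */
    p16[1] = 7;
    int ok = (p16[0] == 7 && p16[1] == 7);
    free(p16);
    return ok ? 0 : -1;
}
```

    Nothing in this function ever touches p32 after free(p32), which is what keeps it well defined.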

    What is unspecified is whether p16 will receive the same value that p32 had earlier.

    What would be undefined behaviour, even if the value were the same, is accessing the memory through p32 after it has been freed. For example:

    #include <stdint.h>
    #include <stdlib.h>

    int main(void) {
        uint32_t *p32 = malloc(4);
        *p32 = 0;
        free(p32);

        uint16_t *p16 = malloc(4);
        p16[0] = 7;
        p16[1] = 7;
        if (p16 == p32) {         // p16 and p32 may or may not compare equal
            uint32_t x = *p32;    // accessing *p32 is explicitly UB
        }
        free(p16);
    }
    

    It is UB because you access a memory block after it has been freed. And even if p32 happens to point to a live memory block, that block now holds an array of uint16_t, so reading it through a pointer to another type is formally undefined behaviour.


    Custom allocation (assuming a C99-conformant compiler):

    So you have a big chunk of memory and want to write custom free and malloc functions without UB. It is possible. I will not go too far into the hard part, the management of allocated and free blocks, and will just give hints.

    1. You need to know the strictest alignment for the implementation. The stdlib malloc knows it, because 7.20.3 §1 of the C99 language specification (draft n1256) says: The pointer returned if the allocation succeeds is suitably aligned so that it may be assigned to a pointer to any type of object. It is often 4 or 8 on 32-bit systems and 8 or 16 on 64-bit systems, but it may be larger or smaller ...
    2. Your memory pool must be a char array, because 6.3.2.3 §7 says: A pointer to an object or incomplete type may be converted to a pointer to a different object or incomplete type. If the resulting pointer is not correctly aligned for the pointed-to type, the behavior is undefined. Otherwise, when converted back again, the result shall compare equal to the original pointer. When a pointer to an object is converted to a pointer to a character type, the result points to the lowest addressed byte of the object. Successive increments of the result, up to the size of the object, yield pointers to the remaining bytes of the object. This means that, provided you deal with the alignment, a pointer into a character array of sufficient size can be converted to a pointer to an arbitrary type (this is the basis of malloc implementations).
    3. You must make your memory pool start at an address compatible with the system alignment:

      uintptr_t orig_addr = (uintptr_t)chunk;   /* uintptr_t comes from <stdint.h> */
      size_t delta = orig_addr % alignment;
      char *pool = chunk + (alignment - delta) % alignment; /* pool is now aligned */
      

    You now only have to hand out blocks from your own pool at addresses of the form pool + n * alignment, converted to void *. 6.3.2.3 §1 says: A pointer to void may be converted to or from a pointer to any incomplete or object type. A pointer to any incomplete or object type may be converted to a pointer to void and back again; the result shall compare equal to the original pointer.
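
    Put together, the hints above could look like the sketch below: a bump allocator that only hands out blocks and never frees them. The names pool_alloc, POOL_SIZE and POOL_ALIGN, the pool size and the alignment value of 16 are all assumptions made for the illustration, and the conversion through uintptr_t relies on the usual flat mapping between pointers and integers.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed values for this sketch: a 1 KiB pool and an alignment of 16,
   which must be at least the strictest alignment of the implementation. */
#define POOL_SIZE 1024
#define POOL_ALIGN 16

static char chunk[POOL_SIZE + POOL_ALIGN]; /* raw storage, with room for aligning */
static char *pool_next = NULL;             /* next free byte, NULL until first use */

/* Hypothetical bump allocator: returns aligned blocks and never frees them. */
void *pool_alloc(size_t size)
{
    if (pool_next == NULL) {
        /* Align the start of the pool (step 3 above). */
        uintptr_t addr = (uintptr_t)chunk;
        pool_next = chunk + (POOL_ALIGN - addr % POOL_ALIGN) % POOL_ALIGN;
    }
    /* Round the request up to a multiple of the alignment. */
    size_t rounded = (size + POOL_ALIGN - 1) / POOL_ALIGN * POOL_ALIGN;
    size_t used = (size_t)(pool_next - chunk);
    if (size == 0 || rounded < size || rounded > sizeof chunk - used)
        return NULL;           /* empty request, overflow, or pool exhausted */
    void *block = pool_next;
    pool_next += rounded;
    return block;
}
```

    Every returned address has the form pool + n * POOL_ALIGN, so it is suitably aligned for any type whose alignment requirement does not exceed POOL_ALIGN. A real allocator would also need the free-list management that this sketch leaves out.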

    It would be cleaner with C11, because C11 explicitly added the _Alignas and _Alignof keywords (spelled alignas and alignof via <stdalign.h>) to deal with alignment, which would be better than the current hack. But the approach above should work nonetheless.
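
    As a rough idea of what the C11 variant might look like, the pool can be declared with the strictest fundamental alignment directly, so no address arithmetic is needed to align its start. The names aligned_pool, aligned_block and POOL_BYTES are hypothetical, and the pool size is an arbitrary choice.

```c
#include <stdalign.h>
#include <stddef.h>

#define POOL_BYTES 256   /* assumed pool size for this sketch */

/* The pool is declared with the strictest fundamental alignment. */
static alignas(max_align_t) unsigned char aligned_pool[POOL_BYTES];

/* Hypothetical helper: start of the n-th block of alignof(max_align_t)
   bytes, or NULL if that block would fall outside the pool. */
void *aligned_block(size_t n)
{
    size_t step = alignof(max_align_t);
    if (n >= POOL_BYTES / step)
        return NULL;
    return aligned_pool + n * step;
}
```

    Here max_align_t (from <stddef.h>) plays the role of the "strictest alignment" that had to be guessed in the C99 version.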

    Limits:

    I must admit that my interpretation of 6.3.2.3 §7, namely that a pointer to a correctly aligned char array can be converted to a pointer to another type, is not entirely clear-cut. Some may argue that the paragraph only says that a pointer which originally pointed to the other type can be used as a char pointer, and that starting from a char pointer is not explicitly allowed. That's true, but it is the best that can be done: it is not explicitly marked as undefined behaviour ... and it is what malloc does under the hood.

    As alignment is explicitly implementation-dependent, you cannot write a general library that is usable on every implementation.

  • 2021-01-13 03:48

    The actual rules regarding aliasing are laid out in standard section 6.5, paragraph 7. Note the wording:

    An object shall have its stored value accessed only by an lvalue expression that has one of the following types:

    (emphasis mine)

    Aliasing is defined in terms of objects, not just general memory. For malloc to return the same address on a second call, the original object must have been deallocated. Even if the new object has the same address, it is not considered the same object. Any attempt to access the first object through dangling pointers left over after free is UB for completely different reasons, so there is no aliasing, because any continued use of the first pointer p32 is invalid anyway.

  • 2021-01-13 03:54

    Note: This only answers the initial question, not the part about custom allocators.


    No, it's not UB, because p16 now points to a different object, and the former object is gone once you have called free(p32).

    Note that malloc() returns a pointer that is suitably aligned for any object type, so reusing the storage for a different type does not cause problems in practical terms. From C11 (N1570) 7.22.3/p1 Memory management functions (emphasis mine):

    The pointer returned if the allocation succeeds is suitably aligned so that it may be assigned to a pointer to any type of object with a fundamental alignment requirement and then used to access such an object or an array of such objects in the space allocated (until the space is explicitly deallocated). The lifetime of an allocated object extends from the allocation until the deallocation.
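
    The alignment guarantee quoted above can be observed with a small C11 check. This is a sketch under assumptions: the function name malloc_is_max_aligned is hypothetical, and the test relies on the usual flat conversion from pointers to uintptr_t.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical check: returns 1 if a pointer obtained from malloc(size) is
   aligned for max_align_t, 0 if it is not, and -1 if the allocation fails. */
int malloc_is_max_aligned(size_t size)
{
    void *p = malloc(size);
    if (p == NULL)
        return -1;
    int aligned = ((uintptr_t)p % alignof(max_align_t)) == 0;
    free(p);
    return aligned;
}
```

    Calling it with a size of at least sizeof(max_align_t) is the safest way to try it, since that request is large enough to hold an object with the strictest fundamental alignment.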

  • 2021-01-13 04:00

    Here are the C99 strict aliasing rules in (what I hope is) their entirety:

    6.5
    (6) The effective type of an object for an access to its stored value is the declared type of the object, if any. If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value. If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one. For all other accesses to an object having no declared type, the effective type of the object is simply the type of the lvalue used for the access.

    (7) An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
    — a type compatible with the effective type of the object,
    — a qualified version of a type compatible with the effective type of the object,
    — a type that is the signed or unsigned type corresponding to the effective type of the object,
    — a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
    — an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
    — a character type.

    These two clauses together prohibit one specific case, storing a value via an lvalue of type X and then subsequently retrieving a value via an lvalue of type Y incompatible with X.
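
    For contrast with the prohibited case, here is a sketch of two ways to look at the bytes of a uint32_t that these clauses do allow: copying them out with memcpy, and reading them through a character type. The function names first_half and byte_at are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read the lowest-addressed 16 bits of a uint32_t
   object without applying a uint16_t lvalue to it. memcpy copies the bytes
   into a separate uint16_t object, which is then read with its own type. */
uint16_t first_half(const uint32_t *p)
{
    uint16_t half;
    memcpy(&half, p, sizeof half);
    return half;
}

/* Character-type access is always allowed by 6.5 (7): inspect byte i
   (i must be less than sizeof(uint32_t)). */
unsigned char byte_at(const uint32_t *p, size_t i)
{
    return ((const unsigned char *)p)[i];
}
```

    Which half of the value first_half returns depends on the byte order of the platform; only the access itself is portable.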

    So, as I read the standard, even this usage is perfectly OK (assuming 4 bytes are enough to store either a uint32_t or two uint16_t).

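    #include <stdint.h>
    #include <stdlib.h>

    int main(void) {
        uint32_t *p32 = malloc(4);
        if (p32 == NULL)
            return 1;
        *p32 = 0;
        /* do not do this: free(p32); */

        /* do not do this: uint16_t *p16 = malloc(4); */
        /* do this instead: */
        uint16_t *p16 = (uint16_t *)p32;

        p16[0] = 7;   /* stores only, the old uint32_t value is never read */
        p16[1] = 7;
        free(p16);
    }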
    

    There's no rule that prohibits storing a uint32_t and then subsequently storing a uint16_t at the same address, so we're perfectly OK.

    Thus there's nothing that would prohibit writing a fully compliant pool allocator.
