Is it technically impossible to implement memcpy from scratch in Standard C?

霸气de小男生 submitted on 2021-02-07 05:11:26

Question


Howard Chu writes:

In the latest C spec it is impossible to write a "legal" implementation of malloc or memcpy.

Is this right? My impression is that in the past, the intent (at least) of the standard was that something like this would work:

void * memcpy(void * restrict destination, const void * restrict source, size_t nbytes)
{
    size_t i;
    unsigned char *dst = (unsigned char *) destination;
    const unsigned char *src = (const unsigned char *) source;

    for (i = 0; i < nbytes; i++)
        dst[i] = src[i];
    return destination;
}

What rules in the latest C standard are violated here? Or, what part of the specification of memcpy is not correctly implemented by this code?


Answer 1:


For the malloc function, section 6.5, paragraph 6 of the standard makes it clear that it is not possible to write a conforming and portable implementation in C:

The effective type of an object for an access to its stored value is the declared type of the object, if any(87)...

The (non-normative) note 87 says:

Allocated objects have no declared type.

The only way to obtain an object with no declared type is... through the allocation function, which is required to return such an object! So inside the allocation function, you must do something that the standard cannot allow in order to set up a memory zone with no declared type.

In common implementations, the standard library malloc and free are indeed implemented in C, but the system knows about it and assumes that the character array which has been provided inside malloc just has no declared type. Full stop.
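To make the point concrete, here is a minimal sketch of an allocator carved out of a static character array. The names pool, POOL_SIZE and bump_alloc are hypothetical, it never frees anything, and it is not how any particular C library does it. It only shows where the "declared type" problem comes from: the backing array is declared as unsigned char, so treating the returned storage as having no declared type relies on the implementation's cooperation rather than on the standard.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration only. */
#define POOL_SIZE 1024u

/* The backing store HAS a declared type (array of unsigned char).
   That is exactly what the answer says a portable malloc cannot avoid. */
static _Alignas(max_align_t) unsigned char pool[POOL_SIZE];
static size_t pool_used = 0;

/* Hand out the next suitably aligned chunk of the pool, or NULL if the
   request is empty or does not fit. Nothing is ever released. */
void *bump_alloc(size_t nbytes)
{
    const size_t align = _Alignof(max_align_t);

    if (nbytes == 0 || nbytes > POOL_SIZE)
        return NULL;

    /* Round the request up so the next chunk stays aligned. */
    size_t rounded = (nbytes + align - 1) / align * align;
    if (rounded > POOL_SIZE - pool_used)
        return NULL;

    void *p = &pool[pool_used];
    pool_used += rounded;
    return p;
}
```

Everything in this sketch is ordinary C. The questionable step only happens later, when a caller stores, say, an int into the returned storage through an int lvalue.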

But the remaining part of the same paragraph explains that there is no real problem in writing a memcpy implementation (emphasis mine):

... If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value. If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one. For all other accesses to an object having no declared type, the effective type of the object is simply the type of the lvalue used for the access.

Provided you copy the object as an array of character type, which is a special access allowed per the strict aliasing rule, there is no problem in implementing memcpy, and your code is a possible and valid implementation.
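As a small illustration of two cases in the quoted paragraph, the sketch below stores into allocated storage through an int lvalue, which gives the storage the effective type int, and separately inspects an object through unsigned char, which is always permitted. The helper names store_then_load and byte_sum are made up for this example.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: returns 0 on success, -1 if allocation fails. */
int store_then_load(int value, int *out)
{
    int *p = malloc(sizeof *p);  /* allocated storage: no declared type */
    if (p == NULL)
        return -1;

    *p = value;   /* store through an int lvalue: effective type becomes int */
    *out = *p;    /* read through an int lvalue: matches the effective type */

    free(p);
    return 0;
}

/* Character-type access is allowed for any object, whatever its type. */
unsigned long byte_sum(const void *obj, size_t nbytes)
{
    const unsigned char *bytes = obj;
    unsigned long sum = 0;

    for (size_t i = 0; i < nbytes; i++)
        sum += bytes[i];
    return sum;
}
```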

IMHO Howard Chu's rant is about this good old memcpy usage, which is no longer valid (assuming sizeof(float) == sizeof(int)):

float f = 1.0;
int i;
memcpy(&i, &f, sizeof(int));         // valid: copy at byte level, but the value of i is undefined
printf("Repr of %f is %x\n", i, i);  // UB: i cannot be accessed as a float



Answer 2:


TL;DR
It should be fine, as long as the memcpy is based on a naive character-by-character copy and is not optimized to move chunks of the size of the largest aligned type that can be copied in a single instruction. The latter is how standard library implementations do it.


What's concerning is something like this scenario:

void* my_int = malloc(sizeof(int));
int another_int = 1;

my_memcpy(my_int, &another_int, sizeof(int));

printf("%d", *(int*)my_int); // well-defined or strict aliasing violation?

Explanation:

  • The data pointed at by my_int has no effective type.
  • When we copy the data into the my_int location, one might be concerned that we force the effective type to become unsigned char, since that's what my_memcpy uses.
  • And then, when we read that memory location through int*, would we violate strict aliasing?

However, the key here is a special exception in the rule for effective type, specified in C17 6.5/6, emphasis mine:

If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one.

Since we do copy the value as an array of character type, the effective type of what my_int points at will become that of the object another_int from which the value was copied.

So everything should be fine.
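For reference, here is the scenario written out as a self-contained sketch. my_memcpy is the byte-wise copy from the question, and copy_int_via_bytes is a hypothetical wrapper added only so that the example can be called and checked.

```c
#include <stddef.h>
#include <stdlib.h>

/* Byte-wise copy, as in the question. */
void *my_memcpy(void *restrict destination, const void *restrict source,
                size_t nbytes)
{
    unsigned char *dst = destination;
    const unsigned char *src = source;

    for (size_t i = 0; i < nbytes; i++)
        dst[i] = src[i];
    return destination;
}

/* Hypothetical wrapper: returns 0 on success, -1 if allocation fails. */
int copy_int_via_bytes(int value, int *out)
{
    void *storage = malloc(sizeof(int));  /* no declared type */
    if (storage == NULL)
        return -1;

    /* Copied as an array of character type, so under 6.5/6 the storage
       takes the effective type of the source object, which is int. */
    my_memcpy(storage, &value, sizeof value);

    *out = *(int *)storage;  /* read as int: matches the effective type */

    free(storage);
    return 0;
}
```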

In addition, you restrict-qualified the parameters, so there should be no fuss about whether the two pointers might alias each other, just like with the real memcpy.

Notably, this rule has remained the same through C99, C11 and C17. One might argue that it is a very bad rule abused by compiler vendors, but that's another story.



Source: https://stackoverflow.com/questions/54336811/is-it-technically-impossible-to-implement-memcpy-from-scratch-in-standard-c
