Question
Howard Chu writes:
In the latest C spec it is impossible to write a "legal" implementation of malloc or memcpy.
Is this right? My impression is that in the past, the intent (at least) of the standard was that something like this would work:
void * memcpy(void * restrict destination, const void * restrict source, size_t nbytes)
{
    size_t i;
    unsigned char *dst = (unsigned char *) destination;
    const unsigned char *src = (const unsigned char *) source;
    for (i = 0; i < nbytes; i++)
        dst[i] = src[i];
    return destination;
}
What rules in the latest C standard are violated here? Or, what part of the specification of memcpy is not correctly implemented by this code?
Answer 1:
For the malloc function, paragraph 6.5 §6 makes it clear that it is not possible to write a conforming and portable implementation in C:
The effective type of an object for an access to its stored value is the declared type of the object, if any(87)...
The (non-normative) note 87 says:
Allocated objects have no declared type.
The only way to obtain an object with no declared type is... through the allocation function, which is required to return such an object! So inside the allocation function, you must do something that the standard cannot allow in order to set up a memory zone with no declared type.
In common implementations, the standard library malloc and free are indeed implemented in C, but the implementation knows about it and assumes that the character array handed out inside malloc just has no declared type. Full stop.
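To make the malloc point concrete, here is a minimal sketch of the kind of allocator this paragraph is talking about. Everything in it (toy_malloc, arena, ARENA_SIZE) is a hypothetical name invented for illustration, not how any real libc does it. The backing store is an ordinary array with declared type unsigned char[], so by the reading of 6.5 §6 given above, storing an int through the returned pointer and reading it back is not covered by the "no declared type" wording, even though it works on common compilers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical arena: note that it HAS a declared type (unsigned char[]). */
#define ARENA_SIZE 1024u
static _Alignas(max_align_t) unsigned char arena[ARENA_SIZE];
static size_t arena_used = 0;

/* Hypothetical bump allocator: never frees, returns NULL when exhausted. */
void *toy_malloc(size_t nbytes)
{
    const size_t align = _Alignof(max_align_t);

    /* Invariant: every block starts at a suitably aligned offset. */
    assert(arena_used % align == 0);

    /* Round the request up to the strictest fundamental alignment. */
    size_t rounded = (nbytes + align - 1) / align * align;

    /* Reject empty requests, arithmetic wrap-around, and exhaustion. */
    if (nbytes == 0 || rounded < nbytes || rounded > ARENA_SIZE - arena_used)
        return NULL;

    void *block = &arena[arena_used];
    arena_used += rounded;
    return block;
}
```

Writing into the returned block with a byte-wise copy is uncontroversial; it is the later access through a non-character lvalue such as int that the standard's wording does not obviously bless.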
But the remaining part of the same paragraph explains that there is no real problem in writing a memcpy implementation (emphasis mine):
... If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value. If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one. For all other accesses to an object having no declared type, the effective type of the object is simply the type of the lvalue used for the access.
Provided you copy the object as an array of character type, which is a special access allowed per the strict aliasing rule, there is no problem in implementing memcpy, and your code is a possible and valid implementation.
IMHO Howard Chu's rant is about this good old memcpy usage, which is no longer valid (assuming sizeof(float) == sizeof(int)):
float f = 1.0;
int i;
memcpy(&i, &f, sizeof(int)); // valid: copy at byte level, but the value of i is undefined
printf("Repr of %f is %x\n", i, i); // UB: i cannot be accessed as a float
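For contrast, one access that is allowed without any controversy is reading an object's bytes through an unsigned char lvalue, since character types are exempt from the aliasing restrictions in 6.5 §7. A small sketch of that, where float_bytes is a hypothetical helper name and nothing here is taken from the original post:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copies the object representation of *f into out[].
   Returns the number of bytes written (at most out_size). */
size_t float_bytes(const float *f, unsigned char *out, size_t out_size)
{
    assert(f != NULL && out != NULL);

    /* Inspecting any object through unsigned char is permitted. */
    const unsigned char *p = (const unsigned char *)f;
    size_t n = sizeof *f < out_size ? sizeof *f : out_size;

    for (size_t i = 0; i < n; i++)
        out[i] = p[i];
    return n;
}
```

What the bytes mean (endianness, floating-point format) is up to the implementation, so the sketch deliberately makes no claim about their values.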
Answer 2:
TL;DR
It should be fine, as long as the memcpy is based on a naive character-by-character copy, and not optimized to move chunks of the size of the largest aligned type that can be copied in a single instruction. The latter is how standard library implementations do it.
What's concerning is something like this scenario:
void* my_int = malloc(sizeof(int)); // sizeof *my_int would be sizeof(void), which is not valid C
int another_int = 1;
my_memcpy(my_int, &another_int, sizeof(int));
printf("%d", *(int*)my_int); // well-defined or strict aliasing violation?
Explanation:
- The data pointed at by my_int has no effective type.
- When we copy the data into the my_int location, one might be concerned that we force the effective type to become unsigned char, since that's what my_memcpy uses.
- And then when we read that memory location through int*, would we violate strict aliasing?
However, the key here is a special exception in the rule for effective type, specified in C17 6.5/6, emphasis mine:
If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one.
Since we do copy the array as character type, the effective type of what my_int points at will become that of the object another_int from which the value was copied.
So everything should be fine.
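Put together as something that compiles, the scenario looks roughly like the sketch below. my_memcpy is the byte-wise copy from the question; roundtrip_int is a hypothetical helper added only to wrap the malloc, copy, and read steps, and its error handling is sketch-level.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Byte-wise copy: the data is "copied as an array of character type". */
void *my_memcpy(void *restrict destination, const void *restrict source, size_t nbytes)
{
    unsigned char *dst = destination;
    const unsigned char *src = source;
    for (size_t i = 0; i < nbytes; i++)
        dst[i] = src[i];
    return destination;
}

/* Hypothetical helper: copies value into fresh heap storage, reads it back
   through int*, and stores the result in *out. Returns false if malloc fails. */
bool roundtrip_int(int value, int *out)
{
    assert(out != NULL);

    void *my_int = malloc(sizeof(int));   /* no declared type, no effective type yet */
    if (my_int == NULL)
        return false;

    /* Per C17 6.5/6, the effective type of the storage becomes int here,
       because the source object (value) has effective type int. */
    my_memcpy(my_int, &value, sizeof(int));

    *out = *(int *)my_int;                /* read through the effective type */
    free(my_int);
    return true;
}
```

Calling roundtrip_int(1, &result) from main and printing result with "%d" reproduces the snippet above in a form where the final read goes through the same type as the copied-from object.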
In addition, you restrict-qualified the parameters so there should be no fuss regarding whether the two pointers might alias each other, just like real memcpy.
Notably, this rule has remained the same through C99, C11 and C17. One might argue that it is a very bad rule abused by compiler vendors, but that's another story.
Source: https://stackoverflow.com/questions/54336811/is-it-technically-impossible-to-implement-memcpy-from-scratch-in-standard-c