How to implement memmove in standard C without an intermediate copy?

Submitted by 前提是你 on 2019-11-26 19:56:09

I think you're right, it's not possible to implement memmove efficiently in standard C.

The only truly portable way to test whether the regions overlap, I think, is something like this:

[Edit: looking at this much later, the second comparison should be against dst + len - 1 (the last byte of the destination), not dst + len (one past its end), so the code below uses that.]

// src and dst here are the arguments converted to unsigned char pointers
for (size_t l = 0; l < len; ++l) {
    if (src + l == dst || src + l == dst + len - 1) {
        // they overlap, so now we can use comparison,
        // and copy forwards or backwards as appropriate.
        ...
        return dst;
    }
}
// No overlap, doesn't matter which direction we copy
return memcpy(dst, src, len);
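
For reference, here is a minimal sketch of how the loop above might be completed into a whole function. The name portable_memmove is hypothetical, and the second test compares against dst + len - 1 (the last byte of the destination), as the edit note suggests. Only == is applied to the two pointers until an overlap has been found; after that both point into the same object, so < and > are defined.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name; a sketch, not a tuned implementation. */
void *portable_memmove(void *dst_v, const void *src_v, size_t len)
{
    unsigned char *dst = dst_v;
    const unsigned char *src = src_v;

    /* O(len) scan that uses only equality, which is defined for any
       two object pointers. */
    for (size_t l = 0; l < len; ++l) {
        if (src + l == dst || src + l == dst + len - 1) {
            /* The regions share a byte, so both pointers are into the
               same object and relational comparison is defined. */
            if (dst < src) {
                /* Destination starts first: copy forwards. */
                for (size_t i = 0; i < len; ++i)
                    dst[i] = src[i];
            } else if (dst > src) {
                /* Destination starts later: copy backwards. */
                for (size_t i = len; i-- > 0; )
                    dst[i] = src[i];
            }
            /* dst == src needs no copy at all. */
            return dst_v;
        }
    }
    /* No overlap, so the direction does not matter. */
    return memcpy(dst_v, src_v, len);
}
```

The scan costs an extra pass over len pointer values before any byte is copied, which is the inefficiency the question is complaining about.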

You can't implement either memcpy or memmove all that efficiently in portable code, because the platform-specific implementation is likely to kick your butt whatever you do. But a portable memcpy at least looks plausible.

C++ introduced a pointer specialization of std::less, which is defined to work for any two pointers of the same type. It might in theory be slower than <, but obviously on a non-segmented architecture it isn't.

C has no such thing, so in a sense, the C++ standard agrees with you that C doesn't have enough defined behaviour. But then, C++ needs it for std::map and so on. It's much more likely that you'd want to implement std::map (or something like it) without knowledge of the implementation than that you'd want to implement memmove (or something like it) without knowledge of the implementation.

For two memory areas to be valid and overlapping, I believe you would need to be in one of the defined situations of 6.5.8.5. That is, two areas of an array, union, struct, etc.

The reason other situations are undefined is that two different objects might not even be in the same kind of memory, with the same kind of pointer. On PC architectures, addresses are usually just 32-bit addresses into virtual memory, but C supports all kinds of bizarre architectures, where memory is nothing like that.

The reason that C leaves things undefined is to give leeway to the compiler writers when the situation doesn't need to be defined. The way to read 6.5.8.5 is as a paragraph that carefully describes the architectures C wants to support, where pointer comparison doesn't make sense unless it's inside the same object.

Also, the reason memmove and memcpy are provided by the compiler is that they are sometimes written in tuned assembly for the target CPU, using specialized instructions. They are not meant to be implementable in C with the same efficiency.

For starters, the C standard is notorious for having problems in the details like this. Part of the problem is that C is used on multiple platforms and the standard attempts to be abstract enough to cover all current and future platforms (which might use some convoluted memory layout that's beyond anything we've ever seen). There is a lot of undefined or implementation-specific behavior so that compiler writers can "do the right thing" for the target platform. Including details for every platform would be impractical (and constantly out-of-date); instead, the C standard leaves it up to the compiler writer to document what happens in these cases. "Unspecified" behavior only means that the C standard doesn't specify what happens, not necessarily that the outcome cannot be predicted. The outcome is usually still predictable if you read the documentation for your target platform and your compiler.

Since determining if two pointers point to the same block, memory segment, or address space depends on how the memory for that platform is laid out, the spec does not define a way to make that determination. It assumes that the compiler knows how to make this determination. The part of the spec you quoted said that the result of a pointer comparison depends on the pointers' "relative location in the address space". Notice that "address space" is singular here. This section is only referring to pointers that are in the same address space; that is, pointers that are directly comparable. If the pointers are in different address spaces, then the result is undefined by the C standard and is instead defined by the requirements of the target platform.

In the case of memmove, the implementor generally determines first if the addresses are directly comparable. If not, then the rest of the function is platform-specific. Most of the time, being in different memory spaces is enough to ensure that the regions don't overlap and the function turns into a memcpy. If the addresses are directly comparable, then it's just a simple byte copy process starting from the first byte and going forward or from the last byte and going backwards (whichever one will safely copy the data without clobbering anything).

All in all, the C standard leaves a lot intentionally unspecified where it can't write a simple rule that works on any target platform. However, the standard writers could have done a better job explaining why some things are not defined and used more descriptive terms like "architecture-dependent".

Here's another idea, but I don't know if it's correct. To avoid the O(len) loop in Steve's answer, one could put it in the #else clause of an #ifdef UINTPTR_MAX with the cast-to-uintptr_t implementation. Provided that the cast of unsigned char * to uintptr_t commutes with adding integer offsets whenever the offset is valid with the pointer, this makes the pointer comparison well-defined.

I'm not sure whether this commutativity is defined by the standard, but it would make sense, as it works even if only the lower bits of a pointer are an actual numeric address and the upper bits are some sort of black box.
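
A rough sketch of that arrangement follows. The function name memmove_uintptr is made up for illustration, and the #ifdef branch relies on the unverified assumption discussed above: that converting to uintptr_t preserves the ordering of addresses within an object. The #else branch is a variant of the equality-only scan that looks only for the one case where a backwards copy is required.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name; a sketch under the commutativity assumption. */
void *memmove_uintptr(void *dst_v, const void *src_v, size_t len)
{
    unsigned char *dst = dst_v;
    const unsigned char *src = src_v;
    int backwards = 0;

#ifdef UINTPTR_MAX
    /* ASSUMPTION (not guaranteed by the standard): integer order of the
       converted pointers matches address order. If the regions do not
       overlap, either direction is safe, so a "wrong" answer is harmless
       in that case. */
    backwards = (uintptr_t)dst > (uintptr_t)src;
#else
    /* Fallback: a backwards copy is needed only when dst lies strictly
       inside the source region, which equality alone can detect. */
    for (size_t l = 1; l < len; ++l) {
        if (src + l == dst) {
            backwards = 1;
            break;
        }
    }
#endif

    if (backwards) {
        for (size_t i = len; i-- > 0; )
            dst[i] = src[i];
    } else {
        for (size_t i = 0; i < len; ++i)
            dst[i] = src[i];
    }
    return dst_v;
}
```

Note that uintptr_t is an optional type, which is why the check on UINTPTR_MAX is needed at all.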

"I would still like to use this example as proof that the standard goes too far with undefined behaviors, if it is true that memmove cannot be implemented efficiently in standard C."

But it's not proof. There's absolutely no way to guarantee that you can compare two arbitrary pointers on an arbitrary machine architecture. The behaviour of such a pointer comparison cannot be legislated by the C standard or even a compiler. I could imagine a machine with a segmented architecture that might produce a different result depending on how the segments are organised in RAM or might even choose to throw an exception when pointers into different segments are compared. This is why the behaviour is "undefined". The exact same program on the exact same machine might give different results from run to run.

The oft-given "solution" of memmove() using the relationship of the two pointers to choose whether to copy from the beginning to the end or from the end to the beginning only works if all memory blocks are allocated from the same address space. Fortunately, this is usually the case, although it wasn't in the days of 16-bit x86 code.
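
To make the 16-bit x86 case concrete, here is a small illustrative model. In real mode a far address is a segment:offset pair and the linear address is segment * 16 + offset, so different pairs can name the same byte. The struct far_ptr type and both helper functions are hypothetical and exist only for this sketch; they do not describe how any particular compiler compared far pointers.

```c
#include <stdint.h>

/* Hypothetical model of a real-mode far pointer, not a compiler type. */
struct far_ptr {
    uint16_t segment;
    uint16_t offset;
};

/* Real-mode address translation: segment * 16 + offset. */
static uint32_t linear_address(struct far_ptr p)
{
    return ((uint32_t)p.segment << 4) + p.offset;
}

/* A naive ordering on the raw pair: segment first, then offset. */
static int raw_less(struct far_ptr a, struct far_ptr b)
{
    if (a.segment != b.segment)
        return a.segment < b.segment;
    return a.offset < b.offset;
}
```

With a = {0x1000, 0x0010} and b = {0x1001, 0x0000}, both translate to linear address 0x10010, yet the raw ordering reports a as less than b. An ordering on the pointer representation therefore says nothing reliable about which region comes first in memory.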
