While learning C, I implemented my own memcpy functions. I used a wider type (uint32_t) in the function. (For simplicity the function is restricted to
The way to implement memcpy using more than single-byte copies is to use non-standard C. Standard C does not support implementing memcpy using anything other than character types.
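For contrast, here is a minimal sketch of a copy that stays within standard C by accessing the objects only through unsigned char. The name my_memcpy is hypothetical, and like memcpy it assumes the two regions do not overlap.

```c
#include <stddef.h>

/* my_memcpy is a hypothetical name, not a library function.
   It copies n bytes one at a time through unsigned char pointers,
   which the aliasing rules allow for objects of any type.
   As with memcpy, the source and destination must not overlap. */
void *my_memcpy(void *restrict dst, const void *restrict src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n--)
        *d++ = *s++;

    return dst;
}
```

An optimizing compiler may still turn this loop into wider moves on its own, but nothing in the source depends on that.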
Quality C implementations provide an optimized memcpy that copies more than one byte at a time, but they use implementation-specific code to do so. They may do this by compiling the memcpy implementation with a switch such as -fno-strict-aliasing, which tells the compiler that the code will violate the aliasing rules; by relying on known features of the specific C implementation to ensure the code works (if you write the compiler, you can design it so that your implementation of memcpy works); or by writing memcpy in assembly language.
Additionally, C implementations may optimize memcpy calls where they appear in source code, replacing them with direct instructions that perform the operation or simply changing the internal semantics of the program. (E.g., if you copy a into b, the compiler might not perform a copy at all but might simply load from a where subsequent code accesses b.)
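A common place to see this is a small, fixed-size memcpy used to reinterpret the bytes of an object. The sketch below uses a hypothetical helper, float_bits, and assumes float is 32 bits wide (checked at compile time). With optimization enabled, GCC and Clang typically compile the memcpy call here to a plain register move rather than a function call, though that is a property of the implementation and not something the C standard promises.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: float occupies exactly 32 bits on this implementation. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* float_bits is a hypothetical helper. It returns the object
   representation of f as a uint32_t. The memcpy is well defined in
   standard C, and the compiler is free to replace it with a direct move. */
uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}
```

On an implementation that uses IEEE 754 single precision, float_bits(1.0f) yields 0x3F800000.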
To implement your own specialized copy operation while violating aliasing rules, compile it with -fno-strict-aliasing if you are using GCC or Clang. If you are using another compiler, check its documentation for an option to disable the aliasing rules. (Note: Apple’s GCC, which I use, disables strict aliasing by default and accepts -fstrict-aliasing to turn it on. The option to turn it off is spelled -fno-strict-aliasing, with a hyphen after "no"; the spelling -fnostrict-aliasing is not accepted.)
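A word-at-a-time copy of that kind might look like the following sketch. The signature is an assumption modeled on memcpy. The code further assumes that both pointers are suitably aligned for uint32_t, that the regions do not overlap, and that n is a multiple of 4. Because it reads and writes the caller's objects through uint32_t lvalues, it is only reliable when built with the aliasing rules disabled, for example gcc -O2 -fno-strict-aliasing.

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch only; build with -fno-strict-aliasing (GCC or Clang).
   Assumptions, none of which are checked here:
     - dst and src are aligned for uint32_t,
     - the regions do not overlap,
     - n is a multiple of sizeof(uint32_t). */
void *memcpy4(void *restrict dst, const void *restrict src, size_t n)
{
    uint32_t *d = dst;
    const uint32_t *s = src;
    size_t words = n / sizeof *d;

    /* Copy one 32-bit word per iteration. */
    for (size_t i = 0; i < words; i++)
        d[i] = s[i];

    return dst;
}
```

If the objects being copied really are arrays of uint32_t, the accesses are valid in standard C and the flag is not needed; the aliasing problem arises when the same function is used on objects of other types.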
If you are using a good C implementation, you may find that your four-byte-copy implementation, memcpy4, does not perform as well as the native memcpy, depending on circumstances.