Does the C Standard Allow for Self-Modifying Code?

Is self-modifying code possible in a portable manner in C?

The reason I ask is that, in a way, OOP relies on self-modifying code (because the code that executes at run-time is actually generated as data, e.g. in a v-table), and yet, it seems that, if this is taken too far, it would prevent most optimizations in a compiler.

For example:

void add(char *restrict p, char *restrict pAddend, int len)
{
    for (int i = 0; i < len; i++)
        p[i] += *pAddend;
}

An optimizing compiler could hoist the load of *pAddend out of the loop, because restrict promises that it doesn't alias anything written through p. However, this would no longer be a valid optimization if the code could modify itself.
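To make the optimization concrete, here is a rough hand-written sketch of what "hoisting" means for this loop. The name add_hoisted is made up for illustration; a real compiler does this on its internal representation and need not emit anything resembling this source.

```c
/* The loop as written: *pAddend is notionally re-read on every iteration. */
void add(char *restrict p, char *restrict pAddend, int len)
{
    for (int i = 0; i < len; i++)
        p[i] += *pAddend;
}

/* Hand-written equivalent of the hoisted form (hypothetical name).
   restrict promises that no p[i] aliases *pAddend, so one load is enough. */
void add_hoisted(char *restrict p, char *restrict pAddend, int len)
{
    if (len <= 0)
        return;                     /* the original never reads *pAddend here */

    const char addend = *pAddend;   /* loaded once, before the loop */
    for (int i = 0; i < len; i++)
        p[i] += addend;
}
```

For non-overlapping arguments the two functions should leave p in the same state; the second just makes the single load explicit.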

In this way, it seems that C doesn't allow for self-modifying code, but at the same time, wouldn't that imply that you can't do some things like OOP in C? Does C really support self-modifying code?

Self-modifying code is not possible in C for many reasons, the most important of which are:

1. The code generated by the compiler is completely up to the compiler, and might look nothing like what a programmer trying to write self-modifying code expects. This is a fundamental problem with doing SMC at all, not just a portability problem.
2. Function pointers and data pointers are completely separate in C; the standard defines no conversion between them. This issue is not fundamental, since some implementations or higher-level standards (POSIX) guarantee that code and data pointers share a representation.

Aside from that, self-modifying code is just a really really bad idea. 20 years ago it might have had some uses, but nowadays it will result in nothing but bugs, atrocious performance, and portability failures. Note that on some ISAs, whether the instruction cache even sees changes that were made to cached code might be unspecified/unpredictable!

Finally, vtables have nothing to do with self-modifying code. It's purely a matter of modifying function pointers, which are data, not code.
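As a minimal sketch of that point, here is a hand-rolled "vtable" in plain C. All of the names (shape, shape_vtable, rect_area, tri_area, shape_area) are invented for this example. Changing which function gets called is just a store to a pointer; no instruction bytes are touched.

```c
struct shape;

/* The "vtable": a struct holding function pointers, i.e. ordinary data. */
struct shape_vtable {
    int (*area)(const struct shape *self);
};

struct shape {
    const struct shape_vtable *vt;  /* which implementation to dispatch to */
    int w, h;
};

int rect_area(const struct shape *s) { return s->w * s->h; }
int tri_area(const struct shape *s)  { return s->w * s->h / 2; }

const struct shape_vtable rect_vt = { rect_area };
const struct shape_vtable tri_vt  = { tri_area };

/* "Virtual call": load a pointer from data, then call through it. */
int shape_area(const struct shape *s)
{
    return s->vt->area(s);
}
```

Assigning s.vt = &tri_vt at run time changes what shape_area(&s) does, yet the machine code of rect_area, tri_area and shape_area stays exactly as the compiler emitted it.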

Strictly speaking, self-modifying code cannot be implemented in a portable manner in C or C++ if I understood the standard correctly.

Self-modifying code in C/C++ would mean something like this:

#include <stdint.h>

uint8_t code_buffer[FUNCTION_SIZE];
void call_function(void)
{
   /* ... modify code_buffer here to hold the machine code we'd like to run ... */
   ((void (*)(void))code_buffer)();  /* object pointer cast to function pointer */
}

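This is not legal C: the standard does not define the conversion from an object pointer to a function pointer. It will also crash on most modern systems, because memory that holds ordinary data is not marked executable. It is impossible to implement on Harvard architectures, where executable code is strictly read-only, so it cannot be part of any standard.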

Most modern OSes do provide a facility for this kind of hackery, which dynamic recompilers use, for one: mprotect() on Unix, for example.
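A minimal sketch of the mprotect() route, assuming a POSIX-like system that supports anonymous mappings (MAP_ANONYMOUS is widely available but not in every POSIX version). The helper name make_executable_copy is hypothetical, and the sketch deliberately stops short of calling the copied bytes, since what they must contain depends entirely on the target ISA.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc in strict modes */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: copy len bytes of machine code into a fresh mapping
   and make that mapping executable. Returns NULL on failure. */
void *make_executable_copy(const void *code, size_t len)
{
    assert(code != NULL && len > 0);

    /* Start with a writable, non-executable anonymous mapping. */
    void *mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return NULL;

    memcpy(mem, code, len);

    /* Flip to read+execute; the region is never writable and executable
       at the same time. mmap() results are page-aligned, as mprotect() needs. */
    if (mprotect(mem, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, len);
        return NULL;
    }
    return mem;
}
```

Actually running the result still needs a cast from void * to a function pointer, which POSIX permits but ISO C does not define. On ISAs without coherent instruction caches it also needs an explicit flush first (GCC and Clang offer __builtin___clear_cache for that). Release the region with munmap() when done.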

Source: https://stackoverflow.com/questions/6399003/does-the-c-standard-allow-for-self-modifying-code
