Question
I was trying to demonstrate to a colleague that you can change the value of a const-qualified variable if you really want to (and know how) by using some trickery. During my demonstration, I discovered that there are two "flavours" of constant values: the ones that you cannot change whatever you do, and the ones that you can change by using dirty tricks.
A constant value is unchangeable when the compiler uses the literal value instead of the value stored on the stack (read here). Here is a piece of code that shows what I mean:
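// TEST 1
#include <cstring>  // std::memcpy
#include <iostream> // std::cout
// Logs the address and value of the const variable (cv) and of its non-const alias (ncv)
#define LOG(index, cv, ncv) std::cout \
    << std::dec << index << ".- Address = " \
    << std::hex << &cv << "\tValue = " << cv << '\n' \
    << std::dec << index << ".- Address = " \
    << std::hex << &ncv << "\tValue = " << ncv << '\n'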
const unsigned int const_value = 0xcafe01e;
// Try with no-const reference
unsigned int &no_const_ref = const_cast<unsigned int &>(const_value);
no_const_ref = 0xfabada;
LOG(1, const_value, no_const_ref);
// Try with no-const pointer
unsigned int *no_const_ptr = const_cast<unsigned int *>(&const_value);
*no_const_ptr = 0xb0bada;
LOG(2, const_value, (*no_const_ptr));
// Try with c-style cast
no_const_ptr = (unsigned int *)&const_value;
*no_const_ptr = 0xdeda1;
LOG(3, const_value, (*no_const_ptr));
// Try with memcpy
unsigned int brute_force = 0xba51c;
std::memcpy(no_const_ptr, &brute_force, sizeof(const_value));
LOG(4, const_value, (*no_const_ptr));
// Try with union
union bad_idea
{
    const unsigned int *const_ptr;
    unsigned int *no_const_ptr;
} u;
u.const_ptr = &const_value;
*u.no_const_ptr = 0xbeb1da;
LOG(5, const_value, (*u.no_const_ptr));
This produces the following output:
1.- Address = 0xbfffbe2c Value = cafe01e
1.- Address = 0xbfffbe2c Value = fabada
2.- Address = 0xbfffbe2c Value = cafe01e
2.- Address = 0xbfffbe2c Value = b0bada
3.- Address = 0xbfffbe2c Value = cafe01e
3.- Address = 0xbfffbe2c Value = deda1
4.- Address = 0xbfffbe2c Value = cafe01e
4.- Address = 0xbfffbe2c Value = ba51c
5.- Address = 0xbfffbe2c Value = cafe01e
5.- Address = 0xbfffbe2c Value = beb1da
Since I'm relying on UB (changing the value of const data), it is expected that the program acts weird, but this weirdness is more than I was expecting.
Let's suppose that the compiler is using the literal value. Then, when the code reaches the instruction that changes the value of the constant (by reference, pointer or memcpying), it simply ignores the order because the value is a literal (it is undefined behaviour, though). This explains why the value remains unchanged, but:
- Why is the memory address the same for both variables while the contained values differ?
AFAIK the same memory address cannot point to different values, so one of the outputs is lying:
- What's really happening? Which memory address is the fake one (if any)?
By making a few changes to the code above, we can try to avoid the use of the literal value so that the trickery does its work (source here):
// TEST 2
// Try with no-const reference
void change_with_no_const_ref(const unsigned int &const_value)
{
    unsigned int &no_const_ref = const_cast<unsigned int &>(const_value);
    no_const_ref = 0xfabada;
    LOG(1, const_value, no_const_ref);
}
// Try with no-const pointer
void change_with_no_const_ptr(const unsigned int &const_value)
{
    unsigned int *no_const_ptr = const_cast<unsigned int *>(&const_value);
    *no_const_ptr = 0xb0bada;
    LOG(2, const_value, (*no_const_ptr));
}
// Try with c-style cast
void change_with_cstyle_cast(const unsigned int &const_value)
{
    unsigned int *no_const_ptr = (unsigned int *)&const_value;
    *no_const_ptr = 0xdeda1;
    LOG(3, const_value, (*no_const_ptr));
}
// Try with memcpy
void change_with_memcpy(const unsigned int &const_value)
{
    unsigned int *no_const_ptr = const_cast<unsigned int *>(&const_value);
    unsigned int brute_force = 0xba51c;
    std::memcpy(no_const_ptr, &brute_force, sizeof(const_value));
    LOG(4, const_value, (*no_const_ptr));
}
void change_with_union(const unsigned int &const_value)
{
    // Try with union
    union bad_idea
    {
        const unsigned int *const_ptr;
        unsigned int *no_const_ptr;
    } u;
    u.const_ptr = &const_value;
    *u.no_const_ptr = 0xbeb1da;
    LOG(5, const_value, (*u.no_const_ptr));
}
int main(int argc, char **argv)
{
    unsigned int value = 0xcafe01e;
    change_with_no_const_ref(value);
    change_with_no_const_ptr(value);
    change_with_cstyle_cast(value);
    change_with_memcpy(value);
    change_with_union(value);
    return 0;
}
Which produces the following output:
1.- Address = 0xbff0f5dc Value = fabada
1.- Address = 0xbff0f5dc Value = fabada
2.- Address = 0xbff0f5dc Value = b0bada
2.- Address = 0xbff0f5dc Value = b0bada
3.- Address = 0xbff0f5dc Value = deda1
3.- Address = 0xbff0f5dc Value = deda1
4.- Address = 0xbff0f5dc Value = ba51c
4.- Address = 0xbff0f5dc Value = ba51c
5.- Address = 0xbff0f5dc Value = beb1da
5.- Address = 0xbff0f5dc Value = beb1da
As we can see, the const-qualified variable was changed on each change_with_* call, and the behaviour is otherwise the same as before, so I was tempted to assume that the weird behaviour of the memory address manifests when the const data is used as a literal instead of a value.
So, to check this assumption, I made one last test, changing the unsigned int value in main to const unsigned int value:
// TEST 3
const unsigned int value = 0xcafe01e;
change_with_no_const_ref(value);
change_with_no_const_ptr(value);
change_with_cstyle_cast(value);
change_with_memcpy(value);
change_with_union(value);
Surprisingly, the output is the same as in TEST 2 (code here), so I suppose that the data is passed as a variable and not as a literal value because of its usage as a parameter. This makes me wonder:
- What makes the compiler decide to optimize a const value into a literal value?
In brief, my questions are:
- In TEST 1:
  - Why do the const value and the no-const value share the same memory address while their contained values differ?
  - What steps does the program follow to produce this output? Which memory address is the fake one (if any)?
- In TEST 3:
  - What makes the compiler decide to optimize a const value into a literal value?
Answer 1:
In general, it is pointless to analyse Undefined Behaviour, because there is no guarantee that you can transfer the results of your analysis to a different program.
In this case, the behaviour can be explained by assuming the compiler has applied the optimisation technique called constant propagation. In that technique, if you use the value of a const variable for which the compiler knows the value, then the compiler replaces the use of the const variable with the value of that variable (as it is known at compile time). Other uses of the variable, such as taking its address, are not replaced.
This optimisation is valid precisely because changing a variable that was defined as const results in Undefined Behaviour, and the compiler is allowed to assume a program does not invoke undefined behaviour.
So, in TEST 1, the addresses are the same, because it is all the same variable, but the values differ because the first of each pair reflects what the compiler presumes (rightly) to be the value of the variable and the second reflects what is actually stored there.
In TEST 2 and TEST 3, the compiler can't make the optimisation, because the compiler can't be 100% sure that the function argument will refer to a constant value (and in TEST 2, it doesn't).
Source: https://stackoverflow.com/questions/16668656/explanation-of-the-ub-while-changing-data