问题
I am confused about the following code:
string _str = "SDFDFSD";
char* pStr = (char*)_str.data();
for (int i = 0; i < iSize; i++)
pStr[i] = ::tolower(pStr[i]);
here _str.data()
returns const char*
. But we are assigning it to a char*
. My questions is,
_str.data()
is returning pointer to a constant data. How is it possible to store it in a pointer to data? The data was constant right? If we assign it to char pointer than we can change it like we are doing inside the for statement which should not be possible for a constant data.
回答1:
What you are doing is not valid at the standard library level (you're violating std::string contract
) but valid at the C++ core language level.
The char *
returned from data
should not be written to because for example it could be in theory(*) shared between different strings with the same value.
If you want to modify a string just use std::string::operator[]
that will inform the object of the intention and will take care of creating a private buffer for the specific instance in case the string was originally shared instead.
Technically you are allowed to cast-away const-ness from a pointer or a reference, but if it's a valid operation or not depends on the semantic of the specific case. The reason for which the operation is allowed is that the main philosophy of C++ is that programmers make no mistakes and know what they are doing. For example is technically legal from a C++ language point of view to do memcpy(&x, "hello", 5)
where x
is a class instance, but the results are most probably "undefined behavior".
If you think that your code "works" it's because you've the wrong understanding of what "works" really should mean (hint: "works" doesn't mean that someone once observed the code doing what seemed reasonable, but that will work in all cases). A valid C++ implementation is free to do anything it wants if you run that program: that you observed something you think is fine doesn't really mean anything, may be you didn't look close enough, or may be you were just lucky (unfortunate, actually) that no crash happened right away.
(*) In modern times the COW (copy-on-write) implementations of std::string are low in popularity because they pose a lot of problems (e.g. with multithreading) and memory is a lot cheaper now. Still std::string
contract says you're not allowed to change the memory pointed by the return value of data()
; if you do anything may happen.
回答2:
Don't do that. It may be fine in this case, but as the documentation for data()
says:
The pointer returned may be invalidated by further calls to other member functions that modify the object.
A program shall not alter any of the characters in this sequence.
So you could very accidentally write to invalid memory if you keep that pointer around. Or, in fact, ruin the implementation of std::string. I would almost go as far as to say that this function shouldn't be exposed.
std::string offers a non-const operator[]
for that purpose.
string _str = "SDFDFSD";
for (int i = 0; i < iSize; i++)
_str[i] = ::tolower(_str[i]);
回答3:
You never must change the data returned from std::string::data()
or std::string::c_str()
directly.
To create a copy of a std::string
:
std::string str1 = "test";
std::string str2 = str1; // copy.
Change characters in a string:
std::string str1 = "test"
str1[0] = 'T';
回答4:
The "correct" way would be to use std::transform instead:
std::transform(_str.begin(), _str.end(), _str.begin(), ::tolower);
回答5:
The simple answer to your question is that in C++ you can cast away the 'const' of a variable.
You probably shouldn't though.
See this for const correctness in C++
回答6:
String always allocates memory on heap, so this is not actually const data, it is just marked so (in method data() signature) to prevent modification.
But nothing is impossible in C++, so with a simple cast, though unsafe, you can now treat the same memory space as modifiable.
回答7:
All constants in a C/C++ program (like "SDFDFSD"
below)will be stored in a separate section .rodata
. This section is mapped as read-only when the binary is loaded into memory during execution.
int main()
{
char* ptr = "SDFDFSD";
ptr[0]='x'; //segmentation fault!!
return 0;
}
Hence any attempt to modify the data at that location will result in a run-time error i.e. a segmentation fault.
Coming to the above question, when creating a string and assigning a string to it, a new copy in memory now exists (memory used to hold the properties of the string object _str
). This is on the heap and NOT mapped to a read-only section. The member function _str.data()
points to the location in memory which is mapped read/write.
The
const
qualifier to the return type ensure that this function is NOT accidentally passed to string manipulation functions which expect a non-constchar*
pointer.
In your current iteration there was no limitation on the memory location itself that was holding the string object's data; i.e. it was mapped with both read/write permissions. Hence modifying the location using another non-const pointer worked i.e. pStr[i]
on the left hand side of an assignment did NOT result in a run-time error as there were no inherent restrictions on the memory location itself.
Again this is NOT guaranteed to work and just a implementation specific behaviour that you have observed (i.e. it simply happens to work for you) and cannot always depend on this.
来源:https://stackoverflow.com/questions/22858001/returning-const-char-to-char-and-then-changing-the-data