After C++11, I thought of c_str() and data() as equivalent.
C++17 introduces an overload for the latter that returns a non-constant pointer (per the reference, which I am not sure is fully up to date with respect to C++17):
const CharT* data() const; (1)
CharT* data(); (2) (since C++17)
c_str() only returns a constant pointer:
const CharT* c_str() const;
Why were these two methods differentiated in C++17, especially when C++11 was the one that made them homogeneous? In other words, why did only one method get an overload while the other didn't?
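To make the difference concrete, here is a minimal sketch that checks the return types of the signatures above. It assumes a C++17 compiler; the helper name AccessorsAgree is made up for this illustration and is not part of any library.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Compile-time checks of the return types (C++17 assumed).
static_assert(std::is_same_v<decltype(std::declval<std::string&>().data()), char*>,
              "overload (2): data() on a non-const string yields char*");
static_assert(std::is_same_v<decltype(std::declval<const std::string&>().data()), const char*>,
              "overload (1): data() on a const string yields const char*");
static_assert(std::is_same_v<decltype(std::declval<std::string&>().c_str()), const char*>,
              "c_str() yields const char* even on a non-const string");

// Hypothetical helper: checks at run time that both accessors agree.
bool AccessorsAgree(const std::string& s) {
    const char* p = s.data();
    assert(p[s.size()] == '\0');  // null-terminated since C++11
    return p == s.c_str();        // same underlying storage
}
```

Calling AccessorsAgree on any string, including an empty one, should return true, since both functions are specified to return a pointer to the same null-terminated array.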
The new overload was added by P0272R1 for C++17. Neither the paper itself nor the links therein discuss why only data was given new overloads but c_str was not. We can only speculate at this point (unless people involved in the discussion chime in), but I'd like to offer the following points for consideration:
Even just adding the overload to data broke some code; keeping this change conservative was a way to minimize negative impact.
The c_str function had so far been entirely identical to data and is effectively a "legacy" facility for interfacing code that takes "C string", i.e. an immutable, null-terminated char array. Since you can always replace c_str by data, there's no particular reason to add to this legacy interface.
I realize that the very motivation for P0272R1 was that there do exist legacy APIs that erroneously or for C reasons take only mutable pointers even though they don't mutate. All the same, I suppose we don't want to add more to string's already massive API than absolutely necessary.
One more point: as of C++17 you are now allowed to write to the null terminator, as long as you write the value zero. (Previously, it used to be UB to write anything to the null terminator.) A mutable c_str would create yet another entry point into this particular subtlety, and the fewer subtleties we have, the better.
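As a small illustration of that rule, here is a sketch assuming C++17 semantics for operator[] at index size(). The function name TerminatorStaysZero is hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper illustrating the C++17 rule for the terminator slot.
bool TerminatorStaysZero() {
    std::string s = "abc";
    // Allowed as of C++17: writing the value zero to the terminator position.
    s[s.size()] = '\0';
    // Writing any other value there (for example 'x') would be undefined behavior.
    assert(s.size() == 3);  // the write does not change the length
    return s.c_str()[3] == '\0' && s == "abc";
}
```

The string is unchanged afterwards; the only thing the rule buys you is that code which blindly writes a terminating zero no longer trips over UB.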
The reason why the data() member got an overload is explained in this paper at open-std.org.
TL;DR of the paper: The non-const .data() member function for std::string was added to improve uniformity in the standard library and to help C++ developers write correct code. It is also convenient when calling a C-library function that doesn't have const qualification on its C-string parameters.
Some relevant passages from the paper:
Abstract
Is std::string's lack of a non-const .data() member function an oversight or an intentional design based on pre-C++11 std::string semantics? In either case, this lack of functionality tempts developers to use unsafe alternatives in several legitimate scenarios. This paper argues for the addition of a non-const .data() member function for std::string to improve uniformity in the standard library and to help C++ developers write correct code.
Use Cases
C libraries occasionally include routines that have char * parameters. One example is the lpCommandLine parameter of the CreateProcess function in the Windows API. Because the data() member of std::string is const, it cannot be used to make std::string objects work with the lpCommandLine parameter. Developers are tempted to use .front() instead, as in the following example.
    std::string programName;
    // ...
    if( CreateProcess( NULL, &programName.front(), /* etc. */ ) ) {
        // etc.
    } else {
        // handle error
    }
Note that when programName is empty, the programName.front() expression causes undefined behavior. A temporary empty C-string fixes the bug.
    std::string programName;
    // ...
    if( !programName.empty() ) {
        char emptyString[] = {'\0'};
        if( CreateProcess( NULL, programName.empty() ? emptyString : &programName.front(), /* etc. */ ) ) {
            // etc.
        } else {
            // handle error
        }
    }
If there were a non-const .data() member, as there is with std::vector, the correct code would be straightforward.
    std::string programName;
    // ...
    if( !programName.empty() ) {
        char emptyString[] = {'\0'};
        if( CreateProcess( NULL, programName.data(), /* etc. */ ) ) {
            // etc.
        } else {
            // handle error
        }
    }
A non-const .data() std::string member function is also convenient when calling a C-library function that doesn't have const qualification on its C-string parameters. This is common in older codes and those that need to be portable with older C compilers.
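A portable sketch of the same situation, without the Windows API: legacy_count_spaces below is a made-up stand-in for a C routine that takes char* but only reads, and CountSpaces is a hypothetical wrapper. It assumes C++17 for the non-const data().

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a legacy C routine that takes char* but only reads.
extern "C" std::size_t legacy_count_spaces(char* text) {
    std::size_t n = 0;
    for (; *text != '\0'; ++text) {
        if (*text == ' ') ++n;
    }
    return n;
}

// Hypothetical wrapper: the C++17 non-const data() supplies the char* directly.
std::size_t CountSpaces(std::string& s) {
    // Safe even when s is empty: data() then points at the null terminator,
    // whereas &s.front() would be undefined behavior.
    assert(s.data()[s.size()] == '\0');
    return legacy_count_spaces(s.data());
}
```

No const_cast and no special case for the empty string is needed, which is the point the paper makes about CreateProcess.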
It just depends on the semantics of "what you want to do with it". Generally speaking, std::string is sometimes used as a buffer vector, i.e., as a replacement for std::vector<char>. This is often seen in boost::asio. In other words, it's an array of characters.
c_str(): strictly means that you're looking for a null-terminated string. In that sense, you should never modify the data and should never need the string as non-const.
data(): you may need the information inside the string as buffer data, even as non-const. You may or may not need to modify the data, which you can do as long as it doesn't change the length of the string.
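A minimal sketch of the buffer usage described above, assuming C++17. Both fill_with_pattern (a stand-in for any C-style producer that writes into a char buffer) and MakeBuffer are hypothetical names for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical C-style producer that writes exactly n bytes into out.
void fill_with_pattern(char* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = static_cast<char>('a' + i % 26);
    }
}

// Size the string first, then let the producer write through data().
std::string MakeBuffer(std::size_t n) {
    std::string buf(n, '\0');                   // length is fixed up front
    fill_with_pattern(buf.data(), buf.size());  // writes only within [0, size())
    assert(buf.size() == n);                    // writing via data() never changes the length
    return buf;
}
```

For instance, MakeBuffer(5) yields "abcde". The length has to be set with the constructor or resize() beforehand, because writing through the pointer cannot grow the string.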
The two member functions c_str and data of std::string exist due to the history of the std::string class.
Until C++11, a std::string could be implemented as copy-on-write. The internal representation did not need any null termination of the stored string. The member function c_str made sure the returned string was null-terminated. The member function data simply returned a pointer to the stored string, which was not necessarily null-terminated. So that changes to the string could be detected, as copy-on-write requires, both functions needed to return a pointer to const data.
This all changed with C++11, when copy-on-write was no longer allowed for std::string. Since c_str was still required to deliver a null-terminated string, the null is always appended to the actual stored string. Otherwise a call to c_str might need to change the stored data to make the string null-terminated, which would make c_str a non-const function. Since data delivers a pointer to the stored string, it usually has the same implementation as c_str. Both functions still exist for backward compatibility.
Source: https://stackoverflow.com/questions/53500369/c-str-vs-data-when-it-comes-to-return-type