Why is modifying a string through a retrieved pointer to its data not allowed?

自闭症患者 2020-11-28 10:53

In C++11, the characters of a std::string have to be stored contiguously, as § 21.4.1/5 points out:

The char-like objects in a basic_stri

1 answer
  • 2020-11-28 11:34

    Why can't we write directly to this buffer?

    I'll state the obvious point: because the pointer you get back is const. And casting away const and then modifying that data is... rude.

    Now, why is it const? That goes back to the days when copy-on-write was considered a good idea, so std::basic_string had to allow implementations to support it. It is very useful to be able to get a read-only pointer to the string's characters (for passing to C APIs, for example) without incurring the overhead of a copy. So c_str needed to return a const pointer.
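
    As a minimal sketch of that use case: the helper below, c_length, is a hypothetical name chosen for illustration, and std::strlen stands in for any C API that only reads through the pointer it is given.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper (not from the original answer): passes the string's
// characters to a C API through the const pointer returned by c_str().
std::size_t c_length(const std::string& s) {
    const char* p = s.c_str();        // read-only, NUL-terminated view; no copy made here
    assert(p[s.size()] == '\0');      // C++11 guarantees the terminator at index size()
    return std::strlen(p);            // the C API only reads through the pointer
}
```

    A call such as c_length(std::string("hello")) never needs write access, which is why a const char* is enough for this kind of interop.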

    As for why it's still const? Well... that goes to an oddball thing in the standard: the null terminator.

    This is legitimate code:

    std::string stupid;
    const char *pointless = stupid.c_str();
    

    pointless must be a NUL-terminated string. Specifically, it must be a pointer to a NUL character. So where does the NUL character come from? There are a couple of ways for a std::string implementation to allow this to work:

    1. Use small-string optimization (SSO), which is a common technique. In this scheme, every std::string object has an internal buffer it can use to hold a single NUL character.
    2. Return a pointer to static memory containing a NUL character. In that case, every empty std::string returns the same pointer.

    Not every implementation should be forced to use SSO. So the standards committee needed a way to keep #2 on the table. And part of that is giving you a const string from c_str(). And since this memory is likely real const, not fake "Please don't modify this memory" const, giving you a mutable pointer to it is a bad idea.
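
    The guarantee itself can be checked without knowing which strategy an implementation picks. The sketch below only reads through the pointer; the helper names ends_with_nul and empty_points_at_nul are hypothetical, and nothing here tells you whether the NUL lives in an SSO buffer or in static memory.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: true when the character at index size() is the NUL
// terminator that c_str() is required to provide.
bool ends_with_nul(const std::string& s) {
    const char* p = s.c_str();
    return p[s.size()] == '\0';
}

// Hypothetical helper: for an empty string, c_str() must point directly at a NUL.
bool empty_points_at_nul() {
    std::string empty;
    const char* p = empty.c_str();
    assert(p != nullptr);             // there is always something to point at
    return *p == '\0';                // read only; never write through this pointer
}
```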

    Of course, you can still get a mutable pointer by writing &str[0], but the standard is very clear that the NUL terminator at str[str.size()] must not be modified.

    Now, that being said, it is perfectly valid to write through the &str[0] pointer and modify the array of characters it points to, so long as you stay in the half-open index range [0, str.size()). In C++11 you just can't do it through the pointer returned by data or c_str, because both return const char*. Yes, even though the standard in fact requires str.c_str() == &str[0] to be true. (C++17 later added a non-const overload of data() that returns char*.)
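
    A minimal sketch of writing through &str[0] while staying inside [0, size()): upper_in_place is a hypothetical helper invented for this example, not something from the standard library.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: uppercases the characters in place through &s[0].
std::string upper_in_place(std::string s) {
    if (s.empty()) return s;          // nothing to write; leave the terminator alone
    char* p = &s[0];                  // mutable pointer into contiguous storage (C++11)
    for (std::size_t i = 0; i < s.size(); ++i) {   // indices in [0, size()) only
        p[i] = static_cast<char>(std::toupper(static_cast<unsigned char>(p[i])));
    }
    assert(s.c_str()[s.size()] == '\0');  // terminator was never written to
    return s;
}
```

    The loop bound is the important part: index s.size() holds the terminator, and the sketch never writes to it.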

    That's standardese for you.
