I have seen many times that std::string::operator[] does not do any bounds checking. Even "What is the difference between string::at and string::operator[]?" says as much:
This operator of standard containers emulates the behavior of the operator [] of ordinary arrays. So it does not make any checks. However in the debug mode the corresponding library can provide this checking.
If you want to check the index then use member function at() instead.
The wording is slightly confusing, but if you study it in detail you'll find that it's actually very precise.
It says this:
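The precondition is that the argument to [] is either = n or < n, where n is size().
If that precondition holds: when the argument is < n you get the character you asked for; otherwise (i.e. when it is = n) you get charT() (i.e. the null character).
But no rule is defined for when you break the precondition, and the check for = n can be satisfied implicitly (but isn't explicitly mandated to be) by actually storing a charT() at position n.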
So implementations don't need to perform any bounds checking… and the common ones won't.
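To make the rule concrete, here is a small sketch. The helper name read_at_size is made up for illustration. It reads the element at pos == size(), which the precondition still allows, and deliberately never indexes past that, since doing so would be undefined behaviour and not something a test could reliably observe.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: reads the element at pos == size().
// This is still inside the precondition pos <= size(), so it is well defined.
template <class CharT>
CharT read_at_size(const std::basic_string<CharT>& s) {
    const CharT c = s[s.size()];  // no bounds check happens here
    assert(c == CharT());         // the value the standard promises at pos == size()
    return c;
}
```

Calling read_at_size(std::string("abc")) should give the null character, and the same holds for an empty string or a std::wstring.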
operator[] has to do some sort of bounds checking to determine...
No, it doesn't. With the precondition
Requires: pos <= size().
it can just ASSUME that it can always return an element of the string. If this condition isn't met, the behaviour is undefined.
operator[] will likely just advance a pointer from the start of the string by pos. If the string is shorter, it simply returns a reference to whatever lies beyond the string's data, like a classic out-of-bounds access on a plain C array.
To fulfil the case where pos == size(), it could simply allocate an extra charT at the end of its internal string data. Then incrementing the pointer without any checks still delivers the stated behaviour.
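A minimal sketch of that idea, assuming a fixed-size, char-only toy class. ToyString is a hypothetical name and this is not how any particular standard library is written. It only shows that one extra slot plus plain pointer arithmetic is enough to satisfy the wording.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

// Hypothetical toy class for illustration only.
class ToyString {
public:
    explicit ToyString(const char* text)
        : size_(std::strlen(text)),
          data_(new char[size_ + 1])   // one slot more than the text needs
    {
        assert(text != nullptr);
        std::memcpy(data_.get(), text, size_);
        data_[size_] = char();         // store charT() at position size()
    }

    std::size_t size() const { return size_; }

    // Precondition: pos <= size(). Nothing is checked here;
    // breaking the precondition is undefined behaviour.
    const char& operator[](std::size_t pos) const {
        return *(data_.get() + pos);   // plain pointer arithmetic
    }

private:
    std::size_t size_;                 // declared first: data_ depends on it
    std::unique_ptr<char[]> data_;
};
```

With ToyString t("abc"), t[3] refers to the stored terminator, and no comparison against size() was needed to get there.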
http://en.cppreference.com/w/cpp/string/basic_string/operator_at
Returns a reference to the character at specified location pos. No bounds checking is performed.
(Emphasis mine).
If you want bounds checking, use std::basic_string::at.
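For comparison, a short sketch of the checked access. The helper names at_throws and check_at_vs_subscript are made up for this example. at() throws std::out_of_range as soon as pos >= size(), so it is stricter than operator[] even at pos == size().

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper: reports whether s.at(pos) throws std::out_of_range.
bool at_throws(const std::string& s, std::string::size_type pos) {
    try {
        (void)s.at(pos);              // checked access
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}

// Small self-check that can be called from main().
void check_at_vs_subscript() {
    const std::string s = "abc";
    assert(s[2] == 'c');              // in range: both forms agree
    assert(s.at(2) == 'c');
    assert(s[s.size()] == '\0');      // allowed for operator[]
    assert(at_throws(s, s.size()));   // but at() already throws here
}
```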
The standard does not imply that the implementation needs to provide bounds checking, because it basically describes what an unchecked array access does.
If you access within bounds, it's defined. If you step outside, you trigger undefined behavior.
First, there is a Requires clause: pos <= size(). If you violate it, your program behaves in an undefined manner.
So the language only defines what happens when pos <= size().
The next paragraph states that for pos < size(), it returns a reference to an element in the string. And for pos == size(), it returns a reference to a default constructed charT with value charT().
While this may look like bounds checking, in practice what actually happens is that the std::basic_string allocates a buffer one larger than asked and populates the last entry with a charT(). Then [] simply does pointer arithmetic.
I have tried to come up with a way to avoid that implementation. While the standard does not mandate it, I could not convince myself an alternative exists. There was something annoying with .data() that made it difficult to avoid the single buffer.
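The remark about .data() is not spelled out, so the following is only a guess at the connection. Since C++11, data() + i == &operator[](i) holds for every i in [0, size()], and the buffer data() returns is null-terminated. The sketch below checks exactly that, assuming a C++11 or later compiler. The function name shares_single_buffer is made up for the example.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: true when operator[] and data() refer to the same
// contiguous buffer, including the terminator slot at index size().
bool shares_single_buffer(const std::string& s) {
    const char* p = s.data();
    assert(p != nullptr);  // data() points at a valid object, even when s is empty
    for (std::string::size_type i = 0; i <= s.size(); ++i) {
        if (&s[i] != p + i) {
            return false;  // would mean the element lives outside data()'s buffer
        }
    }
    return p[s.size()] == '\0';
}
```

If the terminator returned by operator[] has to live at data() + size(), keeping it anywhere other than at the end of the one buffer becomes hard to arrange.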