Is the code below safe? It might be tempting to write code akin to this:
#include
No, the C++ standard makes no such guarantees.
That said, if the code is in the same translation unit then it would be difficult to find a counterexample. If main() is in a different translation unit then a counterexample might be easier to produce.
If the map is in a different dynamically linked library or shared object then it's almost certainly not the case.
The volatile qualifier is a red herring.
The Standard does not guarantee the addresses of string literals with the same content will be the same. In fact, [lex.string]/16 says:
Whether all string literals are distinct (that is, are stored in nonoverlapping objects) and whether successive evaluations of a string-literal yield the same or a different object is unspecified.
The second part even says you might not get the same address when a function containing a string literal is called a second time! Though I've never seen a compiler do that.
So using the same character array object when a string literal is repeated is an optional compiler optimization. With my installation of g++ and default compiler flags, I also find I get the same address for two identical string literals in the same translation unit. But as you guessed, I get different ones if the same string literal content appears in different translation units.
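If you want to see what your own compiler does, here is a minimal sketch. The function names are made up for illustration and are not from the question. Whatever literals_share_storage() returns is conforming, since the standard leaves it unspecified; only the content comparison is guaranteed.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helpers: each one returns a pointer to its own "hello" literal.
const char* first_hello()  { return "hello"; }
const char* second_hello() { return "hello"; }

// True if the two literals happen to share storage on this implementation.
// Unspecified by the standard: both true and false are valid answers.
bool literals_share_storage() {
    return first_hello() == second_hello();
}

// Compares the characters, so it holds no matter how the literals are stored.
bool literals_have_equal_content() {
    return std::strcmp(first_hello(), second_hello()) == 0;
}

void check_literals() {
    // Safe to assert: this does not depend on literal identity.
    assert(literals_have_equal_content());
}
```

Printing literals_share_storage() at different optimization levels, or after moving second_hello() into another translation unit, is an easy way to watch the answer change.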
A related interesting point: it's also permitted for different string literals to use overlapping arrays. That is, given
const char* abcdef = "abcdef";
const char* def = "def";
const char* def0gh = "def\0gh";
it's possible you might find abcdef+3, def, and def0gh are all the same pointer.
Also, this rule about reusing or overlapping string literal objects applies only to the unnamed array object directly associated with the literal, used if the literal immediately decays to a pointer or is bound to a reference to array. A literal can also be used to initialize a named array, as in
const char a1[] = "XYZ";
const char a2[] = "XYZ";
const char a3[] = "Z";
Here the array objects a1, a2 and a3 are initialized using the literal, but are considered distinct from the actual literal storage (if such storage even exists) and follow the ordinary object rules, so the storage for those arrays will not overlap.
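As a rough check of that last point, the sketch below compares the addresses of the three named arrays. The function names are hypothetical. Unlike comparisons between literals, these results follow from the ordinary rule that distinct complete objects occupy disjoint storage.

```cpp
#include <cassert>

const char a1[] = "XYZ";
const char a2[] = "XYZ";
const char a3[] = "Z";

// Named arrays are ordinary objects, so none of them may share storage,
// even though a1 and a2 have identical content and a3 matches the tail of both.
bool named_arrays_are_distinct() {
    const char* p1 = a1;
    const char* p2 = a2;
    const char* p3 = a3;
    // p1 + 2 and p2 + 2 point at the 'Z' inside a1 and a2; neither can alias a3.
    return p1 != p2 && p1 + 2 != p3 && p2 + 2 != p3;
}

void check_named_arrays() {
    assert(named_arrays_are_distinct());
}
```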
Whether two string literals with exactly the same content are the same object is unspecified and, in my opinion, best not relied upon. To quote the standard:
[lex.string]
16 Evaluating a string-literal results in a string literal object with static storage duration, initialized from the given characters as specified above. Whether all string literals are distinct (that is, are stored in nonoverlapping objects) and whether successive evaluations of a string-literal yield the same or a different object is unspecified.
If you wish to avoid the overhead of std::string, you can write a simple view type (or use std::string_view in C++17) that is a reference type over a string literal. Use it to do intelligent comparisons instead of relying upon literal identity.
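A minimal sketch of the std::string_view approach, assuming C++17. Registry, make_registry and lookup are illustrative names, not anything from the question. The keys are compared by their characters, so it no longer matters which translation unit or library a literal came from.

```cpp
#include <cassert>
#include <map>
#include <string_view>

// Hypothetical registry keyed by string content rather than pointer identity.
using Registry = std::map<std::string_view, int>;

Registry make_registry() {
    Registry r;
    // Each key is a view over a literal with static storage duration,
    // so no characters are copied and the views never dangle.
    r["alpha"] = 1;
    r["beta"] = 2;
    return r;
}

// Looks a name up by its characters; returns -1 if it is absent.
int lookup(const Registry& r, std::string_view name) {
    auto it = r.find(name);
    return it == r.end() ? -1 : it->second;
}

void check_lookup() {
    const Registry r = make_registry();
    // This buffer holds the same characters as the "alpha" literal
    // but is guaranteed to live at a different address.
    const char buf[] = {'a', 'l', 'p', 'h', 'a', '\0'};
    assert(lookup(r, buf) == 1);
}
```

The one thing to watch is lifetime: a string_view key must not outlive the characters it refers to, which is why this sketch only stores views over literals.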
The C++ standard does not require an implementation to de-duplicate string literals.
When a string literal resides in another translation unit or another shared library, de-duplicating it would require the linker (ld) or the runtime linker (ld.so) to do the work, which nothing requires them to do and which cannot be relied upon.