Why does Rust have String and str? What are the differences between String and str? When does one use String rather than str, and vice versa?
They are actually completely different. First off, a str is purely a type-level thing; it can only be reasoned about at the type level because it is a so-called dynamically sized type (DST). The size a str takes up cannot be known at compile time and depends on runtime information, so it cannot be stored directly in a variable: the compiler needs to know at compile time how large each variable is. A str is conceptually just a row of u8 bytes with the guarantee that they form valid UTF-8. How large is the row? No one knows until runtime, hence it can't be stored in a variable.
The interesting thing is that a &str, or any other pointer to a str such as Box<str>, does exist at runtime. This is a so-called "fat pointer": a pointer with extra information (in this case the size of the thing it points at), so it is twice as large as an ordinary pointer. In fact, a &str is quite close to a String (but not to a &String, which is an ordinary thin pointer). A &str is two words: one pointer to the first byte of a str, and one number that says how many bytes long the str is.
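The "two words" claim can be checked with std::mem::size_of. This is a minimal sketch; the function name pointer_sizes_in_words is made up for illustration, and the sizes are reported in machine words rather than bytes so the result does not depend on the target.

```rust
use std::mem::size_of;

/// Sizes of &str, Box<str> and &String, measured in machine words.
/// (Hypothetical helper, not part of the standard library.)
fn pointer_sizes_in_words() -> (usize, usize, usize) {
    let word = size_of::<usize>();
    (
        size_of::<&str>() / word,     // fat pointer: address + length
        size_of::<Box<str>>() / word, // also a fat pointer
        size_of::<&String>() / word,  // thin pointer to the String itself
    )
}

fn main() {
    let (str_ref, boxed, string_ref) = pointer_sizes_in_words();
    println!("&str: {} words", str_ref);
    println!("Box<str>: {} words", boxed);
    println!("&String: {} words", string_ref);
}
```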
Contrary to what is often said, a str does not need to be immutable. If you can get a &mut str, an exclusive pointer to the str, you can mutate it. All the safe functions that mutate it guarantee that the UTF-8 constraint is upheld, because violating it is undefined behaviour: the library assumes the constraint holds and does not check it.
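As a small illustration of mutating through a &mut str, here is a sketch using make_ascii_uppercase from the standard library. The wrapper name shout is hypothetical. The method only touches ASCII bytes and never changes the length, which is how it stays safe with respect to UTF-8.

```rust
/// Upper-cases the ASCII letters of a string slice in place.
/// (Hypothetical wrapper around str::make_ascii_uppercase.)
fn shout(text: &mut str) {
    text.make_ascii_uppercase();
}

fn main() {
    let mut owned = String::from("héllo");
    // as_mut_str hands out an exclusive &mut str into the String's buffer.
    shout(owned.as_mut_str());
    println!("{}", owned);
}
```

The non-ASCII é is left alone, so the buffer is still valid UTF-8 afterwards.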
So what is a String? It is three words: two are the same as for a &str, and the third is the capacity of the str buffer it manages, that is, how much the buffer can hold before it is full and has to be reallocated. That buffer is always on the heap (a str in general is not necessarily on the heap). The String basically owns a str, as they say; it controls it and can resize and reallocate it when it sees fit. So a String is, as said, closer to a &str than to a str.
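A rough sketch of the capacity word in action, assuming the current standard library layout (the three-word size of String is what implementations do today, not a documented guarantee). The helper name fill_without_realloc is made up for this example.

```rust
use std::mem::size_of;

/// Builds a String with spare capacity, appends to it, and reports
/// whether the heap buffer moved. (Hypothetical helper.)
fn fill_without_realloc() -> (String, bool) {
    let mut s = String::with_capacity(16);
    let before = s.as_ptr();
    s.push_str("hello"); // 5 bytes fit into the 16 reserved bytes
    let moved = before != s.as_ptr();
    (s, moved)
}

fn main() {
    // Pointer + length + capacity.
    println!("String: {} words", size_of::<String>() / size_of::<usize>());
    let (s, moved) = fill_without_realloc();
    println!("len = {}, capacity = {}, moved = {}", s.len(), s.capacity(), moved);
}
```

The length is the part a &str would also carry; the capacity is the extra bookkeeping that lets a String grow in place.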
Another thing is a Box<str>. Its runtime representation is the same as a &str, but unlike a &str it owns the str. It cannot resize the str, because it does not know its capacity, so a Box<str> can be seen as a fixed-length String that cannot be resized (you can always convert it into a String if you want to resize it).
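The round trip between String and Box<str> might look like this. The names freeze and thaw are illustrative only; into_boxed_str and into_string are the standard library conversions.

```rust
/// Turns a String into a Box<str>, dropping any spare capacity.
/// (Hypothetical helper name.)
fn freeze(s: String) -> Box<str> {
    s.into_boxed_str()
}

/// Turns a Box<str> back into a growable String.
/// (Hypothetical helper name.)
fn thaw(b: Box<str>) -> String {
    b.into_string()
}

fn main() {
    let mut s = String::with_capacity(64);
    s.push_str("fixed");
    let boxed: Box<str> = freeze(s); // owns the bytes, but cannot grow
    let mut grown: String = thaw(boxed);
    grown.push_str(" no more"); // resizable again
    println!("{}", grown);
}
```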
A very similar relationship exists between [T] and Vec<T>, except that there is no UTF-8 constraint and the elements can be of any type whose size is not dynamic.
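For comparison, the same owner-and-view pattern with Vec<T> and [T], as a minimal sketch (the function name sum is just an example):

```rust
/// Sums a slice; accepts a whole Vec<i32>, an array, or any sub-slice.
fn sum(values: &[i32]) -> i32 {
    values.iter().sum()
}

fn main() {
    let owner: Vec<i32> = vec![1, 2, 3, 4];
    // Borrows the elements at indices 1 and 2 without copying them.
    let middle: &[i32] = &owner[1..3];
    println!("{} {}", sum(&owner), sum(middle));
}
```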
The use of str on the type level is mostly to create generic abstractions with &str; it exists on the type level so that traits can be written conveniently. In theory str as a type did not need to exist, and only &str could have, but that would mean a lot of extra code would have to be written that can now be generic.
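One way to see what having str as a type buys: a function written once against &str works for every owner of a str, because String and Box<str> both dereference to str and implement AsRef<str>. This is only a sketch; count_vowels and count_vowels_generic are made-up names.

```rust
/// Counts ASCII vowels; written once, against str.
fn count_vowels(text: &str) -> usize {
    text.chars().filter(|c| "aeiouAEIOU".contains(*c)).count()
}

/// Generic variant: accepts anything that can be viewed as a str.
fn count_vowels_generic<T: AsRef<str>>(text: T) -> usize {
    count_vowels(text.as_ref())
}

fn main() {
    let owned: String = String::from("owned string");
    let boxed: Box<str> = "boxed str".into();
    println!("{}", count_vowels("literal")); // string literal
    println!("{}", count_vowels(&owned)); // &String coerces to &str
    println!("{}", count_vowels(&boxed)); // &Box<str> coerces to &str
    println!("{}", count_vowels_generic(owned)); // via AsRef<str>
}
```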
&str is super useful for having multiple different substrings of a String without having to copy. As said, a String owns the str on the heap that it manages, and if you could only create a substring of a String as a new String, the bytes would have to be copied, because every value in Rust has exactly one owner, which is how Rust deals with memory safety. So for instance you can slice a string:
let string: String = "a string".to_string();
// Ranges are byte offsets and must fall on character boundaries.
let substring1: &str = &string[1..3];
let substring2: &str = &string[2..4];
We have two different substring strs of the same string. string is the one that owns the actual full str buffer on the heap, and the &str substrings are just fat pointers into that buffer.
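To make "fat pointers into that buffer" concrete, the sketch below compares addresses. The helper offset_in is hypothetical; it only does integer arithmetic on the pointers returned by as_ptr and never dereferences them.

```rust
/// Byte offset of `part` inside `whole`, if `part` lies within
/// `whole`'s buffer. (Hypothetical helper.)
fn offset_in(whole: &str, part: &str) -> Option<usize> {
    let start = whole.as_ptr() as usize;
    let p = part.as_ptr() as usize;
    if p >= start && p + part.len() <= start + whole.len() {
        Some(p - start)
    } else {
        None
    }
}

fn main() {
    let string: String = "a string".to_string();
    let substring1: &str = &string[1..3];
    let substring2: &str = &string[2..4];
    // Both slices start inside the buffer owned by `string`.
    println!("{:?}", offset_in(&string, substring1));
    println!("{:?}", offset_in(&string, substring2));
}
```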
I have a C++ background and I found it very useful to think about String and &str in C++ terms:
String is like a std::string; it owns the memory and does the dirty job of managing it.
&str is like a char* (but a little more sophisticated, since it also carries the length); it points us to the beginning of a chunk in the same way you can get a pointer to the contents of a std::string.
Are either of them going to disappear? I do not think so. They serve two purposes:
String keeps the buffer and is very practical to use.
&str is lightweight and should be used to "look" into strings. You can search, split, trim, and parse without allocating new memory; operations that build new text, such as replace, hand back a new String.
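A sketch of "looking" into a string without allocating: the key returned below is a &str borrowed from the input, not a copy. The function parse_pair and the "key=value" format are invented for this example.

```rust
/// Parses "key=value" where value is an unsigned integer.
/// The returned key borrows from `line`; nothing is allocated.
/// (Hypothetical helper.)
fn parse_pair(line: &str) -> Option<(&str, u32)> {
    let (key, value) = line.split_once('=')?;
    let value = value.trim().parse::<u32>().ok()?;
    Some((key.trim(), value))
}

fn main() {
    let config = String::from(" width = 42 ");
    match parse_pair(&config) {
        Some((key, value)) => println!("{} -> {}", key, value),
        None => println!("not a key=value pair"),
    }
}
```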
A &str can look inside of a String, just as it can point to a string literal. The following code needs to copy the literal string into the memory managed by the String:
let a: String = "hello rust".into();
The following code lets you use the literal itself without a copy (read-only, though):
let a: &str = "hello rust";
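The difference between the two lines can be observed by comparing buffer addresses. This is a minimal sketch; copy_of is a made-up name for the conversion shown above.

```rust
/// Copies a string slice into a freshly allocated String.
/// (Hypothetical helper; same conversion as `.into()` above.)
fn copy_of(text: &str) -> String {
    text.into()
}

fn main() {
    let literal: &'static str = "hello rust"; // points at the literal itself
    let owned: String = copy_of(literal); // separate heap buffer
    println!("same text: {}", owned == literal);
    println!("same buffer: {}", owned.as_ptr() == literal.as_ptr());
}
```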
For C# and Java people:
String === StringBuilder
&str === (immutable) string
I like to think of a &str as a view on a string, like an interned string in Java / C#, where you can't change it, only create a new one.