问题
In C++ when joining a bunch of strings (where each element's size is known roughly), it's common to pre-allocate memory to avoid multiple re-allocations and moves:
std::vector<std::string> words;
constexpr size_t APPROX_SIZE = 20;
std::string phrase;
phrase.reserve((words.size() + 5) * APPROX_SIZE); // <-- avoid multiple allocations
for (const auto &w : words)
phrase.append(w);
Similarly, I did this in Rust (this chunk needs the unicode-segmentation crate)
fn reverse(input: &str) -> String {
let mut result = String::with_capacity(input.len());
for gc in input.graphemes(true /*extended*/).rev() {
result.push_str(gc)
}
result
}
I was told that the idiomatic way of doing it is a single expression
fn reverse(input: &str) -> String {
input
.graphemes(true /*extended*/)
.rev()
.collect::<Vec<&str>>()
.concat()
}
While I really like it and want to use it, from a memory allocation point of view, would the former allocate less chunks than the latter?
I disassembled this with cargo rustc --release -- --emit asm -C "llvm-args=-x86-asm-syntax=intel"
but it doesn't have source code interspersed, so I'm at a loss.
回答1:
Your original code is fine and I do not recommend changing it.
The original version allocates once: inside String::with_capacity
.
The second version allocates at least twice: first, it creates a Vec<&str>
and grows it by push
ing &str
s onto it. Then, it counts the total size of all the &str
s and creates a new String
with the correct size. (The code for this is in the join_generic_copy method in str.rs.) This is bad for several reasons:
- It allocates unnecessarily, obviously.
- Grapheme clusters can be arbitrarily large, so the intermediate
Vec
can't be usefully sized in advance -- it just starts at size 1 and grows from there. - For typical strings, it allocates way more space than would actually be needed just to store the end result, because
&str
is usually 16 bytes in size while a UTF-8 grapheme cluster is typically much less than that. - It wastes time iterating over the intermediate
Vec
to get the final size where you could just take it from the original&str
.
On top of all this, I wouldn't even consider this version idiomatic, because it collect
s into a temporary Vec
in order to iterate over it, instead of just collect
ing the original iterator, as you had in an earlier version of your answer. This version fixes problem #3 and makes #4 irrelevant but doesn't satisfactorily address #2:
input.graphemes(true).rev().collect()
collect
uses FromIterator
for String
, which will try to use the lower bound of the size_hint
from the Iterator
implementation for Graphemes
. However, as I mentioned earlier, extended grapheme clusters can be arbitrarily long, so the lower bound can't be any greater than 1. Worse, &str
s may be empty, so FromIterator<&str>
for String
doesn't know anything about the size of the result in bytes. This code just creates an empty String
and calls push_str
on it repeatedly.
Which, to be clear, is not bad! String
has a growth strategy that guarantees amortized O(1) insertion, so if you have mostly tiny strings that won't need to be reallocated often, or you don't believe the cost of allocation is a bottleneck, using collect::<String>()
here may be justified if you find it more readable and easier to reason about.
Let's go back to your original code.
let mut result = String::with_capacity(input.len());
for gc in input.graphemes(true).rev() {
result.push_str(gc);
}
This is idiomatic. collect
is also idiomatic, but all collect
does is basically the above, with a less accurate initial capacity. Since collect
doesn't do what you want, it's not unidiomatic to write the code yourself.
There is a slightly more concise, iterator-y version that still makes only one allocation. Use the extend
method, which is part of Extend<&str>
for String
:
fn reverse(input: &str) -> String {
let mut result = String::with_capacity(input.len());
result.extend(input.graphemes(true).rev());
result
}
I have a vague feeling that extend
is nicer, but both of these are perfectly idiomatic ways of writing the same code. You should not rewrite it to use collect
, unless you feel that expresses the intent better and you don't care about the extra allocation.
Related
- Efficiency of flattening and collecting slices
来源:https://stackoverflow.com/questions/58611262/does-iteratorcollect-allocate-the-same-amount-of-memory-as-stringwith-capaci