Does Iterator::collect allocate the same amount of memory as String::with_capacity?

ぐ巨炮叔叔 提交于 2020-06-12 06:47:11

问题


In C++ when joining a bunch of strings (where each element's size is known roughly), it's common to pre-allocate memory to avoid multiple re-allocations and moves:

std::vector<std::string> words;
constexpr size_t APPROX_SIZE = 20;

std::string phrase;
phrase.reserve((words.size() + 5) * APPROX_SIZE);  // <-- avoid multiple allocations
for (const auto &w : words)
  phrase.append(w);

Similarly, I did this in Rust (this chunk needs the unicode-segmentation crate)

fn reverse(input: &str) -> String {
    let mut result = String::with_capacity(input.len());
    for gc in input.graphemes(true /*extended*/).rev() {
        result.push_str(gc)
    }
    result
}

I was told that the idiomatic way of doing it is a single expression

fn reverse(input: &str) -> String {
  input
      .graphemes(true /*extended*/)
      .rev()
      .collect::<Vec<&str>>()
      .concat()
}

While I really like it and want to use it, from a memory allocation point of view, would the former allocate less chunks than the latter?

I disassembled this with cargo rustc --release -- --emit asm -C "llvm-args=-x86-asm-syntax=intel" but it doesn't have source code interspersed, so I'm at a loss.


回答1:


Your original code is fine and I do not recommend changing it.

The original version allocates once: inside String::with_capacity.

The second version allocates at least twice: first, it creates a Vec<&str> and grows it by pushing &strs onto it. Then, it counts the total size of all the &strs and creates a new String with the correct size. (The code for this is in the join_generic_copy method in str.rs.) This is bad for several reasons:

  1. It allocates unnecessarily, obviously.
  2. Grapheme clusters can be arbitrarily large, so the intermediate Vec can't be usefully sized in advance -- it just starts at size 1 and grows from there.
  3. For typical strings, it allocates way more space than would actually be needed just to store the end result, because &str is usually 16 bytes in size while a UTF-8 grapheme cluster is typically much less than that.
  4. It wastes time iterating over the intermediate Vec to get the final size where you could just take it from the original &str.

On top of all this, I wouldn't even consider this version idiomatic, because it collects into a temporary Vec in order to iterate over it, instead of just collecting the original iterator, as you had in an earlier version of your answer. This version fixes problem #3 and makes #4 irrelevant but doesn't satisfactorily address #2:

input.graphemes(true).rev().collect()

collect uses FromIterator for String, which will try to use the lower bound of the size_hint from the Iterator implementation for Graphemes. However, as I mentioned earlier, extended grapheme clusters can be arbitrarily long, so the lower bound can't be any greater than 1. Worse, &strs may be empty, so FromIterator<&str> for String doesn't know anything about the size of the result in bytes. This code just creates an empty String and calls push_str on it repeatedly.

Which, to be clear, is not bad! String has a growth strategy that guarantees amortized O(1) insertion, so if you have mostly tiny strings that won't need to be reallocated often, or you don't believe the cost of allocation is a bottleneck, using collect::<String>() here may be justified if you find it more readable and easier to reason about.

Let's go back to your original code.

let mut result = String::with_capacity(input.len());
for gc in input.graphemes(true).rev() {
    result.push_str(gc);
}

This is idiomatic. collect is also idiomatic, but all collect does is basically the above, with a less accurate initial capacity. Since collect doesn't do what you want, it's not unidiomatic to write the code yourself.

There is a slightly more concise, iterator-y version that still makes only one allocation. Use the extend method, which is part of Extend<&str> for String:

fn reverse(input: &str) -> String {
    let mut result = String::with_capacity(input.len());
    result.extend(input.graphemes(true).rev());
    result
}

I have a vague feeling that extend is nicer, but both of these are perfectly idiomatic ways of writing the same code. You should not rewrite it to use collect, unless you feel that expresses the intent better and you don't care about the extra allocation.

Related

  • Efficiency of flattening and collecting slices


来源:https://stackoverflow.com/questions/58611262/does-iteratorcollect-allocate-the-same-amount-of-memory-as-stringwith-capaci

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!