std::string's capacity(), reserve() & resize() functions

天命终不由人 2020-11-27 17:27

I want to use std::string simply to create a dynamic buffer and then iterate through it using an index. Is resize() the only function that actually allocates the buffer?

6 Answers
  • 2020-11-27 18:04

    Isn't that the point to reserve() size so you can access it?

    No, that's the point of resize().

    reserve() only makes enough room so that future calls that increase the size (e.g. push_back()) will be more efficient.

    From your use case it looks like you should use .push_back() instead.

    my_string.reserve( 20 );
    
    for ( parsing_something_else_loop )
    {
        char ch = <business_logic>;
        my_string.push_back(ch);
    }
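
A runnable sketch of the loop above, assuming the "business logic" is simply keeping the alphabetic characters of an input string. The function name `collect_letters` and the filter are hypothetical stand-ins for the placeholders, not part of the original question.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical stand-in for the parsing loop: keep only alphabetic characters.
std::string collect_letters(const std::string& input)
{
    std::string my_string;
    my_string.reserve(input.size());   // capacity grows, size() stays 0
    assert(my_string.empty() && my_string.capacity() >= input.size());

    for (char ch : input)
    {
        // Cast to unsigned char first: std::isalpha needs a non-negative value.
        if (std::isalpha(static_cast<unsigned char>(ch)))
            my_string.push_back(ch);   // size() grows by one per call
    }
    return my_string;
}
```

Calling `collect_letters("a1b2-c3")` yields `"abc"`. The reserve() call is only an upper-bound guess here; the code would behave the same without it.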
    

    How is it that the string has the capacity but can't really access it with []?

    Calling .reserve() is like blowing up mountains to give you some free land. The amount of free land is the .capacity(). The land is there but that doesn't mean you can live there. You have to build houses in order to move in. The number of houses is the .size() (= .length()).

    Suppose you are building a city, but after building the 50th house you find that there is not enough land, so you need to find another place large enough to fit the 51st house and then migrate the whole population there. This is extremely inefficient. If you knew up front that you need to build 1000 houses, then you can call

    my_string.reserve(1000);
    

    to get enough land to build 1000 houses, and then you call

    my_string.push_back(ch);
    

    to construct a house by assigning ch to that location. The capacity is at least 1000, but the size is only 1. You may not say

    my_string[16] = 'c';
    

    because the house #16 does not exist yet. You may call

    my_string.resize(20);
    

    to get houses #0 ~ #19 built in one go, which is why

    my_string[i++] = ch;
    

    works fine (as long as 0 ≤ i ≤ 19).

    See also http://en.wikipedia.org/wiki/Dynamic_array.
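
The land-and-houses distinction can be made observable with a small sketch. Writing through operator[] beyond size() is undefined behavior, so this version uses at(), which is bounds-checked against size() (not capacity()) and throws std::out_of_range. The function name `house_16_needs_resize` is made up for illustration.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Returns true if writing index 16 via at() is rejected before resize()
// and accepted afterwards.
bool house_16_needs_resize()
{
    std::string my_string;
    my_string.reserve(1000);           // land: capacity() >= 1000, size() == 0
    my_string.push_back('a');          // one house: size() == 1
    assert(my_string.capacity() >= 1000 && my_string.size() == 1);

    bool rejected = false;
    try {
        my_string.at(16) = 'c';        // house #16 does not exist yet
    } catch (const std::out_of_range&) {
        rejected = true;
    }

    my_string.resize(20);              // houses #0..#19 now exist
    my_string[16] = 'c';               // fine: 16 < size()
    return rejected && my_string.size() == 20 && my_string[16] == 'c';
}
```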


    For your add-on question,

    .resize() cannot completely replace .reserve(), because (1) you don't always need to use up all the allocated space, and (2) default construction followed by copy assignment is a two-step process, which can take more time than constructing directly (especially for large objects), e.g.

    #include <vector>
    #include <unistd.h>
    
    struct SlowObject
    {
        SlowObject() { sleep(1); }
        SlowObject(const SlowObject& other) { sleep(1); }
        SlowObject& operator=(const SlowObject& other) { sleep(1); return *this; }
    };
    
    int main()
    {
        std::vector<SlowObject> my_vector;
    
        my_vector.resize(3);
        for (int i = 0; i < 3; ++ i)
            my_vector[i] = SlowObject();
    
        return 0;
    }
    

    will take at least 9 seconds to run, while

    int main()
    {
        std::vector<SlowObject> my_vector;
    
        my_vector.reserve(3);
        for (int i = 0; i < 3; ++ i)
            my_vector.push_back(SlowObject());
    
        return 0;
    }
    

    takes only 6 seconds.

    std::string simply mirrors std::vector's interface here.
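
The same comparison can be checked without waiting, by counting operations instead of sleeping. This is a sketch assuming a C++11 or later compiler; `CountedObject`, `ops_with_resize` and `ops_with_reserve` are hypothetical names standing in for SlowObject and the two main() variants.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for SlowObject: counts operations instead of sleeping.
struct CountedObject
{
    static int ops;
    CountedObject() { ++ops; }
    CountedObject(const CountedObject&) { ++ops; }
    CountedObject& operator=(const CountedObject&) { ++ops; return *this; }
};
int CountedObject::ops = 0;

int ops_with_resize(std::size_t n)
{
    CountedObject::ops = 0;
    std::vector<CountedObject> my_vector;
    my_vector.resize(n);                       // n default constructions
    for (std::size_t i = 0; i < n; ++i)
        my_vector[i] = CountedObject();        // temporary + copy assignment
    return CountedObject::ops;
}

int ops_with_reserve(std::size_t n)
{
    CountedObject::ops = 0;
    std::vector<CountedObject> my_vector;
    my_vector.reserve(n);                      // no objects constructed
    assert(my_vector.empty());
    for (std::size_t i = 0; i < n; ++i)
        my_vector.push_back(CountedObject());  // temporary + copy construction
    return CountedObject::ops;
}
```

For n = 3 the resize() path performs 9 counted operations and the reserve() path performs 6, matching the seconds quoted for the sleeping version. Because the copy operations are user-declared, no move operations are generated, so push_back() of a temporary still copies.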

  • 2020-11-27 18:06

    The capacity is the length of the actual buffer, but that buffer is private to the string; in other words, it is not yours to access. The std::string of the standard library may allocate more memory than is required to store the actual characters of the string. The capacity is the total allocated length. However, accessing characters outside the range [s.begin(), s.end()) is still illegal.

    You call reserve when you anticipate that the string will grow, to avoid unnecessary re-allocations. For example, if you are planning to concatenate ten 20-character strings in a loop, it may make sense to reserve 200 characters for your string (the capacity does not count the zero terminator, so no extra character is needed), rather than expanding it several times from its default size.
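
A minimal sketch of that concatenation scenario, checking that the capacity never changes once the space has been reserved. The function name and the 20-character chunk of 'x' characters are assumptions made for illustration.

```cpp
#include <cassert>
#include <string>

// Returns true if ten 20-character appends fit in the reserved buffer
// without the capacity ever changing (i.e. without reallocation).
bool concat_without_reallocation()
{
    const std::string piece(20, 'x');      // hypothetical 20-character chunk
    std::string result;
    result.reserve(10 * piece.size());     // room for all ten pieces
    const std::string::size_type reserved = result.capacity();
    assert(reserved >= 10 * piece.size());

    bool stable = true;
    for (int i = 0; i < 10; ++i)
    {
        result += piece;
        stable = stable && (result.capacity() == reserved);
    }
    return stable && result.size() == 200;
}
```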

  • 2020-11-27 18:14

    No -- the point of reserve is to prevent re-allocation. resize sets the usable size, reserve does not -- it just sets an amount of space that's reserved, but not yet directly usable.

    Here's one example -- we're going to create a 1000-character random string:

    static const int size = 1000;
    std::string x;
    x.reserve(size);
    for (int i=0; i<size; i++)
       x.push_back((char)rand());
    

    reserve is primarily an optimization tool though -- most code that works with reserve should also work (just, possibly, a little more slowly) without calling reserve. The one exception to that is that reserve can ensure that iterators remain valid, when they wouldn't without the call to reserve.
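
The iterator-validity point can be sketched as follows. The example uses std::vector<char> rather than std::string, because for vector the standard states explicitly that no reallocation happens after reserve() until an insertion would push size() past capacity(). The function name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true if an iterator taken before the appends still points at the
// first element afterwards. This is only safe because of the reserve() call.
bool iterator_survives_appends(std::size_t count)
{
    assert(count > 0);
    std::vector<char> x;
    x.reserve(count);                 // no reallocation until size() > capacity()
    x.push_back('a');

    std::vector<char>::iterator first = x.begin();
    const char* storage = x.data();

    for (std::size_t i = 1; i < count; ++i)
        x.push_back('b');             // stays within the reserved capacity

    return *first == 'a' && x.data() == storage && x.size() == count;
}
```

Without the reserve() call, any of the push_back() calls could reallocate, and dereferencing `first` afterwards would be undefined behavior.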

  • 2020-11-27 18:16

    std::vector instead of std::string might also be a solution - if there are no requirements against it.

    std::vector<char> v;      // empty vector (size 0)
    std::vector<char> w(10);  // vector holding 10 value-initialized chars (size 10)
    

    Your example:

    vector<char> my_string(20);
    
    int i=0;
    
    for ( parsing_something_else_loop )
    {
        char ch = <business_logic>;
        my_string[i++] = ch;
    }
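
A runnable variant of this index-based loop, as a sketch. It assumes the parsed characters come from an input string and adds a bounds check so that the index never runs past the 20 elements; `fill_buffer` and its parameter are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the parsing loop: copy up to 20 characters of
// `source` into a fixed-size buffer by index.
std::vector<char> fill_buffer(const std::string& source)
{
    std::vector<char> my_string(20);   // 20 elements, all '\0'; size() == 20
    assert(my_string.size() == 20);

    std::size_t i = 0;
    for (char ch : source)
    {
        if (i == my_string.size())     // stop before writing past the end
            break;
        my_string[i++] = ch;
    }
    return my_string;
}
```

The constructor argument creates real elements, so indexing is valid immediately; this is the vector counterpart of calling resize(20) on a string.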
    
  • 2020-11-27 18:19

    reserve(n) indeed allocates enough storage to hold at least n elements, but it doesn't actually fill the container with any elements. The string is still empty (has size 0), but you are guaranteed that you can add at least n elements (e.g. through push_back or insert) before the string's internal buffer needs to be reallocated, whereas resize(n) really resizes the string to contain n elements (and deletes or adds elements if necessary).

    So reserve is actually a mere optimization facility, when you know you are adding a bunch of elements to the container (e.g. in a push_back loop) and don't want it to reallocate the storage too often, which incurs memory allocation and copying costs. But it doesn't change the outside/client view of the string. It still stays empty (or keeps its current element count).

    Likewise capacity returns the number of elements the string can hold until it needs to reallocate its internal storage, whereas size (and for string also length) returns the actual number of elements in the string.
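
A short sketch of those differences: resize() changes the contents in both directions, while reserve() leaves them alone. The function name `resize_demo` and the sample text are illustrative only.

```cpp
#include <cassert>
#include <string>

// Returns the string after a shrink and a grow via resize(); the reserve()
// call in between leaves the contents untouched.
std::string resize_demo()
{
    std::string s = "hello world";
    s.resize(5);                 // drops characters: "hello"
    assert(s.size() == 5 && s.length() == 5);

    s.reserve(100);              // capacity() >= 100, contents unchanged
    assert(s == "hello" && s.capacity() >= 100);

    s.resize(8, '!');            // appends three '!' characters
    return s;
}
```

The returned string is `"hello!!!"`, with size() equal to 8 and capacity() still at least 100.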

  • 2020-11-27 18:24

    Just because reserve allocates additional space does not mean it is legitimate for you to access it.

    In your example, either use resize, or rewrite it to something like this:

    string my_string;
    
    // I want my string to have 20 bytes long buffer
    my_string.reserve( 20 );
    
    for ( parsing_something_else_loop )
    {
        char ch = <business_logic>;
    
        // append the character to the string
        my_string += ch;
    }
    