Question
Assume I have a vector of strings and I want to concatenate them via std::accumulate.
If I use the following code:
std::vector<std::string> foo{"foo","bar"};
string res="";
res=std::accumulate(foo.begin(),foo.end(),res,
[](string &rs,string &arg){ return rs+arg; });
I can be pretty sure there will be temporary object construction.
In this answer they say that the effect of std::accumulate is specified this way:
Computes its result by initializing the accumulator acc with the initial value init and then modifies it with acc = acc + *i or acc = binary_op(acc, *i) for every iterator i in the range [first,last) in order.
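Read literally, the quoted wording amounts to the loop below. This is only a sketch of the pre-C++20 wording; `accumulate_sketch` is a made-up name, not the library's actual implementation.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for std::accumulate, following the quoted wording.
template <typename InputIt, typename T, typename BinaryOp>
T accumulate_sketch(InputIt first, InputIt last, T init, BinaryOp binary_op) {
    T acc = init;                       // accumulator starts as a copy of init
    for (; first != last; ++first) {
        // binary_op produces a value, which is then assigned back into acc.
        // If binary_op returns a fresh object, that is one temporary per element.
        acc = binary_op(acc, *first);
    }
    return acc;
}
```

With a lambda that returns `rs + arg`, each iteration builds a new string and then assigns it to `acc`, which is where the temporaries come from.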
So I'm wondering what is the correct way to do this to avoid the unnecessary temporary object construction.
One idea was to change the lambda this way:
[](string &rs,string &arg){ rs+=arg; return rs; }
In this case, I thought I force efficient concatenation of the strings and help the compiler (I know I shouldn't) omit the unnecessary copy, since this should be equivalent to (pseudocode):
accum = [](& accum,& arg){ ...; return accum; }
and thus
accum = & accum;
Another idea was to use
accum = [](& accum,& arg){ ...; return std::move(accum); }
But this would probably lead to something like:
accum = std::move(& accum);
Which looks very suspicious to me.
What is the correct way to write this to minimize the risk of unnecessary temporary objects being created? I'm not just interested in std::string; I'd be happy to have a solution that would work for any object that has copy and move constructors/assignments implemented.
Answer 1:
Try the following:
res=std::accumulate(foo.begin(),foo.end(),res,
[](string &rs, const string &arg) -> string & { return rs+=arg; });
Before this call, it may make sense to compute the total length and reserve it:
std::string::size_type n = std::accumulate( foo.begin(), foo.end(),
std::string::size_type( 0 ),
[] ( std::string::size_type n, const std::string &s ) { return ( n += s.size() ); } );
res.reserve( n );
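One caveat, as far as I know: since C++20, std::accumulate passes the accumulator to the operation as `std::move(acc)`, so a lambda whose first parameter is a non-const lvalue reference no longer compiles there. A minimal sketch of a variant that takes the accumulator by value instead (`concat_by_value` is a name made up for illustration):

```cpp
#include <numeric>
#include <string>
#include <vector>

// Hypothetical helper: concatenates all strings with a by-value accumulator.
std::string concat_by_value(const std::vector<std::string>& parts) {
    return std::accumulate(parts.begin(), parts.end(), std::string(),
        [](std::string acc, const std::string& s) {
            acc += s;     // append in place
            return acc;   // a by-value parameter is moved out on return
        });
}
```

Under C++20 the string buffer should be moved in and out on each step rather than copied; with older standards the accumulator is still copied into the lambda on each step, so this is not a win there.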
Answer 2:
I would break this into two operations: first a std::accumulate to obtain the total length of the string that needs to be created, then a std::for_each with a lambda that updates the local string:
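std::string::size_type total = std::accumulate(foo.begin(), foo.end(),
std::string::size_type(0), // the init type decides the accumulator type; 0u would make it unsigned int
[](std::string::size_type c, std::string const& s) {
return c+s.size();
});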
std::string result;
result.reserve(total);
std::for_each(foo.begin(), foo.end(),
[&](std::string const& s) { result += s; });
The common alternative to this is using expression templates, but that does not fit in an answer. Basically you create a data structure that maps the operations, but does not execute them. When the expression is finally evaluated, it can gather the information it needs upfront and use that to reserve the space and do the copies. The code that uses the expression template is nicer, but more complicated.
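As a rough illustration of the expression-template idea only, here is a minimal sketch. Every name in it (`lazy_concat`, `Piece`, `Concat`, `lazy`) is hypothetical, and it handles nothing beyond `std::string` operands joined with `+`.

```cpp
#include <cstddef>
#include <string>

namespace lazy_concat {  // hypothetical namespace

// Leaf node: refers to one string, copies nothing.
struct Piece {
    const std::string& text;
    std::size_t size() const { return text.size(); }
    void append_to(std::string& out) const { out += text; }
};

// Inner node: records "lhs + rhs" without executing it.
template <typename L, typename R>
struct Concat {
    L lhs;
    R rhs;
    std::size_t size() const { return lhs.size() + rhs.size(); }
    void append_to(std::string& out) const {
        lhs.append_to(out);
        rhs.append_to(out);
    }
    // Evaluation point: the total size is known, so reserve once, then copy.
    operator std::string() const {
        std::string out;
        out.reserve(size());
        append_to(out);
        return out;
    }
};

// Entry point: wraps the first operand so that + builds nodes.
inline Piece lazy(const std::string& s) { return Piece{s}; }

inline Concat<Piece, Piece> operator+(Piece lhs, const std::string& rhs) {
    return Concat<Piece, Piece>{lhs, Piece{rhs}};
}

template <typename L, typename R>
Concat<Concat<L, R>, Piece> operator+(Concat<L, R> lhs, const std::string& rhs) {
    return Concat<Concat<L, R>, Piece>{lhs, Piece{rhs}};
}

}  // namespace lazy_concat
```

Usage would look like `std::string r = lazy_concat::lazy(a) + b + c;`. The nodes only hold references, so the conversion to `std::string` has to happen while the operands are still alive.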
Answer 3:
Using std::accumulate efficiently without any redundant copies is not obvious.
In addition to being reassigned and passed into and out of the lambda, the accumulating value may get copied internally by the implementation.
Also, note that std::accumulate() itself takes the initial value by value, calling a copy constructor and thus ignoring any reserve() done on the source of the copy (as suggested in some of the other answers).
The most efficient way I found to concatenate the strings is as follows:
std::vector<std::string> str_vec{"foo","bar"};
// get reserve size (the + 1 per string over-reserves slightly):
auto sz = std::accumulate(str_vec.cbegin(), str_vec.cend(), std::string::size_type(0),
[](std::string::size_type n, auto const& str) { return n + str.size() + 1; });
std::string res;
res.reserve(sz);
std::accumulate(str_vec.cbegin(), str_vec.cend(),
std::ref(res), // use a ref wrapper to keep same object with capacity
[](std::string& a, std::string const& b) -> std::string& // must specify return type because cannot return `std::reference_wrapper<std::string>`.
{ // can't use `auto&` args for the same reason
a += b;
return a;
});
The result will be in res.
This implementation has no redundant copies, moves or reallocations.
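One way to check the no-copies claim for an arbitrary accumulator type is to count copies with an instrumented class. The sketch below assumes a made-up type `Counted` and helper `copies_made_with_ref`; it is not part of the approach itself, only a way to observe it.

```cpp
#include <functional>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical instrumented accumulator: counts every copy made of it.
// No move operations are declared, so a move would also count as a copy.
struct Counted {
    std::string value;
    static int copies;
    Counted() {}
    Counted(const Counted& other) : value(other.value) { ++copies; }
    Counted& operator=(const Counted& other) {
        value = other.value;
        ++copies;
        return *this;
    }
};
int Counted::copies = 0;

// Runs the std::ref-based accumulation, stores the concatenation in `out`,
// and returns how many copies of the accumulator were made along the way.
int copies_made_with_ref(const std::vector<std::string>& parts, std::string& out) {
    Counted acc;
    Counted::copies = 0;
    Counted& result = std::accumulate(parts.cbegin(), parts.cend(), std::ref(acc),
        [](Counted& a, const std::string& b) -> Counted& {
            a.value += b;   // modify the referenced object in place
            return a;       // only the reference wrapper gets reassigned
        });
    out = result.value;
    return Counted::copies;
}
```

Only the `std::reference_wrapper` is copied and reassigned inside std::accumulate, so the count for the wrapped object should stay at zero.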
Answer 4:
This is a bit tricky, since there are two operations involved, the addition and the assignment. In order to avoid the copies, you have to both modify the string in the addition, and ensure that the assignment is a no-op. It's the second part which is tricky.
What I've done on occasions is to create a custom "accumulator", along the lines of:
class Accu
{
std::string myCollector;
enum DummyToSuppressAsgn { dummy };
public:
Accu( std::string const& startingValue = std::string() )
: myCollector( startingValue )
{
}
// Default copy ctor and copy asgn are OK.
// On the other hand, we need the following special operators
Accu& operator=( DummyToSuppressAsgn )
{
// Don't do anything...
return *this;
}
DummyToSuppressAsgn operator+( std::string const& other )
{
myCollector += other;
return dummy;
}
// And to get the final results...
operator std::string() const
{
return myCollector;
}
};
There'll be a few copies when calling accumulate, and of the return value, but during the actual accumulation, nothing. Just invoke:
std::string results = std::accumulate( foo.begin(), foo.end(), Accu() );
(If you're really concerned about performance, you can add a capacity argument to the constructor of Accu, so that it can do a reserve on the member string. If I did this, I'd probably hand write the copy constructor as well, to ensure that the string in the copied object had the required capacity.)
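A possible shape for that capacity variant, as a sketch only: the names `ReservingAccu` and `concat_with_accu` are invented here, and the hand-written copy constructor is one guess at what "required capacity" would mean in practice.

```cpp
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical variant of Accu that reserves capacity up front.
class ReservingAccu
{
    std::string myCollector;
    std::size_t myCapacity;
    enum DummyToSuppressAsgn { dummy };
public:
    explicit ReservingAccu( std::size_t capacity = 0 )
        : myCapacity( capacity )
    {
        myCollector.reserve( myCapacity );
    }
    // Hand-written copy ctor: reserve first, so the copy keeps the capacity.
    ReservingAccu( ReservingAccu const& other )
        : myCapacity( other.myCapacity )
    {
        myCollector.reserve( myCapacity );
        myCollector += other.myCollector;
    }
    ReservingAccu& operator=( ReservingAccu const& ) = default;
    // Assignment of the result of operator+ is a no-op.
    ReservingAccu& operator=( DummyToSuppressAsgn ) { return *this; }
    // "Addition" appends in place and returns only the dummy tag.
    DummyToSuppressAsgn operator+( std::string const& other )
    {
        myCollector += other;
        return dummy;
    }
    operator std::string() const { return myCollector; }
};

// Hypothetical helper: sums the lengths first, then accumulates.
std::string concat_with_accu( std::vector<std::string> const& parts )
{
    std::size_t total = 0;
    for ( std::vector<std::string>::const_iterator it = parts.begin(); it != parts.end(); ++it )
        total += it->size();
    return std::accumulate( parts.begin(), parts.end(), ReservingAccu( total ) );
}
```

The accumulator is still copied when it is passed into and returned from std::accumulate, but those copies reserve before appending, so the per-element appends should not reallocate.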
Source: https://stackoverflow.com/questions/19664196/efficient-accumulate