Question
Consider:
std::vector<double> u, v;
#pragma omp parallel for
for (std::size_t i = 0u; i < u.size(); ++i)
    u[i] += v[i];
To express similar code with the C++17 parallel algorithms, the only solution I have found so far is the two-input-range overload of std::transform:
std::transform(std::execution::par_unseq,
               std::begin(u), std::end(u), std::begin(v), std::begin(u),
               std::plus());
I don't like this at all because it bypasses the += operator of my types and, in my real use case, leads to much more verbose code (about 4 times longer) than the original OpenMP version (I cannot just use std::plus because I first have to apply an operation to the elements of the RHS range).
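To make the objection concrete, here is a minimal sketch of what the std::transform route tends to look like once the right-hand side needs preprocessing. Both scale and add_scaled_transform are hypothetical names chosen for illustration; they are not taken from the real use case, and the execution policy is passed in as a parameter only to keep the sketch easy to try.

```cpp
#include <algorithm>
#include <cassert>
#include <execution>
#include <utility>
#include <vector>

// Hypothetical stand-in for "an operation on the RHS range elements".
inline double scale(double x) { return 2.0 * x; }

// Computes u[i] = u[i] + scale(v[i]) with the two-input-range std::transform.
// The lambda has to spell out "a + f(b)", so operator+= is never called.
template <class Policy>
void add_scaled_transform(Policy&& policy,
                          std::vector<double>& u,
                          const std::vector<double>& v)
{
    assert(u.size() == v.size());  // same precondition as the OpenMP loop
    std::transform(std::forward<Policy>(policy),
                   u.begin(), u.end(),  // first input range
                   v.begin(),           // second input range
                   u.begin(),           // output overwrites the first input
                   [](double a, double b) { return a + scale(b); });
}
```

Called as add_scaled_transform(std::execution::par_unseq, u, v), this produces the same values as the loop, but the element type needs operator+ and an assignment rather than operator+=.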
Is there another algorithm that I have overlooked?
Note also that if I use ranges::zip, the code won't run in parallel with GCC 9: if iterator_category is not at least forward_iterator, the PSTL back-end falls back to the sequential algorithm: https://godbolt.org/z/XGtPwc.
Answer 1:
Have you tried tbb::zip_iterator (https://www.threadingbuildingblocks.org/docs/help/reference/iterators/zip_iterator.html)? Its iterator_category is random_access_iterator.
So the code will look like:
auto zip_begin = tbb::make_zip_iterator(std::begin(u), std::begin(v));
std::for_each(std::execution::par_unseq, zip_begin, zip_begin + u.size(),
              [](auto &&x) { std::get<0u>(x) += std::get<1u>(x); });
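If depending on TBB's iterators is not an option, one possible variation on the same idea uses only the standard library: run std::for_each over u alone and recover each index from the element's address. This is only a sketch. It assumes contiguous storage (std::vector here), and add_assign is an illustrative name, not an existing API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <execution>
#include <utility>
#include <vector>

// Performs u[i] += v[i] for every i, keeping operator+= in the loop body.
// Assumption: u is contiguous, so (&x - u.data()) is the index of x.
template <class Policy>
void add_assign(Policy&& policy,
                std::vector<double>& u,
                const std::vector<double>& v)
{
    assert(u.size() == v.size());  // same precondition as the OpenMP loop
    const double* const base = u.data();
    std::for_each(std::forward<Policy>(policy), u.begin(), u.end(),
                  [base, &v](double& x) {
                      // Recover the index of x from its position in u.
                      const auto i = static_cast<std::size_t>(&x - base);
                      x += v[i];
                  });
}
```

A call such as add_assign(std::execution::par_unseq, u, v) iterates over vector iterators, which are random access; whether it actually runs in parallel still depends on the standard library and its back-end.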
Source: https://stackoverflow.com/questions/56128455/parallel-algorithm-to-sum-assign-the-elements-of-a-vector-to-the-elements-of-ano