Erasing from vector by swapping to the end

帅比萌擦擦* submitted on 2021-01-28 13:41:36

Question


I am wondering why there is no STL function that deletes an element from a vector by swapping it to the end and then removing it. Is there a better data structure than std::vector if you don't care about the actual order of elements and still want very fast traversal?

Note that the erase-remove idiom (std::remove followed by std::vector::erase) is not the same thing: it takes linear time rather than constant time, because the order of the remaining elements has to be preserved.
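
To make the difference concrete, here is a minimal sketch contrasting the two ways of erasing a single element. The helper names erase_ordered and erase_unordered are made up for this illustration; they are not standard library functions.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: order-preserving erase. Every element after pos is
// shifted one slot to the left, so the cost is linear in the tail length.
inline void erase_ordered(std::vector<int>& v, std::size_t pos) {
    assert(pos < v.size());
    v.erase(v.begin() + static_cast<std::ptrdiff_t>(pos));
}

// Hypothetical helper: swap-to-the-end erase. Only the last element moves,
// so the cost is constant, but the relative order of the elements changes.
inline void erase_unordered(std::vector<int>& v, std::size_t pos) {
    assert(pos < v.size());
    std::swap(v[pos], v.back());
    v.pop_back();
}
```

Starting from {0, 1, 2, 3, 4} and erasing index 1, the first helper leaves {0, 2, 3, 4} and the second leaves {0, 4, 2, 3}.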


Answer 1:


You can use std::partition instead of std::remove/std::remove_if:

Keep only values greater than 3:

std::vector foo{1,2,3,4,5,6,7,8};

foo.erase(std::partition(foo.begin(),
                         foo.end(),
                         [](auto& v) { return v > 3; }
                        ),
          foo.end());

Or you could model it on the std::erase / std::erase_if pair for std::vector that was added in C++20 and return the number of erased elements. I call them unstable_erase and unstable_erase_if.

#include <algorithm>   // std::partition
#include <functional>  // std::not_fn
#include <iterator>    // std::distance
#include <vector>

// Erases all elements that satisfy the predicate pred.
// Defined first so that unstable_erase below can call it.
template<class T, class Alloc, class Pred>
[[maybe_unused]] constexpr typename std::vector<T,Alloc>::size_type
unstable_erase_if(std::vector<T,Alloc>& c, Pred pred) {
    using size_type = typename std::vector<T,Alloc>::size_type;

    // Move the elements to keep to the front; p marks where the rest begins.
    auto p = std::partition(c.begin(), c.end(), std::not_fn(pred));
    auto count = static_cast<size_type>(std::distance(p, c.end()));
    c.erase(p, c.end());   // unlike resize, works when T is not default constructible

    return count;
}

// Erases all elements that compare equal to value
template<class T, class Alloc, class U>
[[maybe_unused]] constexpr typename std::vector<T,Alloc>::size_type
unstable_erase(std::vector<T,Alloc>& c, const U& value) {
    return unstable_erase_if(c, [&value](auto& v) { return v == value; });
}

Since std::partition swaps elements I was expecting to be able to improve on the speed by replacing the above

    auto p = std::partition(c.begin(), c.end(), std::not_fn(pred));

with

    auto p = unstable_remove_if(c.begin(), c.end(), pred);

where I've defined unstable_remove_if as below:

// Requires bidirectional iterators, since last is decremented.
template<class BidirIt, class UnaryPredicate>
constexpr BidirIt
unstable_remove_if(BidirIt first, BidirIt last, UnaryPredicate p) {
    for(; first != last; ++first) {
        if(p(*first)) { // found one that should be removed

            // find a "last" that should NOT be removed
            while(true) {
                if(--last == first) return last;
                if(not p(*last)) break;          // should not be removed
            }
            *first = std::move(*last);           // move last to first
        }
    }
    return last;
}

but, to my surprise, running that in https://quick-bench.com/ showed that they performed equally fast for fundamental types (up to long double).

Edit: For larger types, it showed a big improvement (8-16 times as fast for 32 - 64 byte types), so using the unstable_remove_if in unstable_erase_if (std::vector) is probably the way to go for a generic solution of removing elements matching a certain predicate or value.
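
Putting the two pieces together might look like the sketch below. It is only an illustration of the combination suggested here, not a benchmarked implementation; the function demo is hypothetical and exists only to show a call.

```cpp
#include <cassert>
#include <iterator>
#include <utility>
#include <vector>

// Same idea as unstable_remove_if above; needs bidirectional iterators
// because last is walked backwards.
template <class BidirIt, class Pred>
BidirIt unstable_remove_if(BidirIt first, BidirIt last, Pred pred) {
    for (; first != last; ++first) {
        if (pred(*first)) {
            // Search from the back for an element that should be kept.
            do {
                if (--last == first) return last;
            } while (pred(*last));
            *first = std::move(*last);  // fill the hole with it
        }
    }
    return last;
}

// Erases every element satisfying pred and returns how many were erased.
template <class T, class Alloc, class Pred>
typename std::vector<T, Alloc>::size_type
unstable_erase_if(std::vector<T, Alloc>& c, Pred pred) {
    using size_type = typename std::vector<T, Alloc>::size_type;
    auto p = unstable_remove_if(c.begin(), c.end(), pred);
    auto count = static_cast<size_type>(std::distance(p, c.end()));
    c.erase(p, c.end());
    return count;
}

// Hypothetical demo: drop everything <= 3 from 1..8.
inline void demo() {
    std::vector<int> v{1, 2, 3, 4, 5, 6, 7, 8};
    auto erased = unstable_erase_if(v, [](int x) { return x <= 3; });
    assert(erased == 3u);
    assert(v.size() == 5u);  // the survivors are 4..8, in no particular order
}
```

The order of the surviving elements is an implementation detail of the loop and should not be relied on.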




Answer 2:


What you want doesn't exist in the standard library but you can write a reusable function for it. This also works for other containers like std::string or std::deque.

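#include <iterator>  // std::next
#include <utility>   // std::move

// Precondition for both overloads: the container is not empty and
// pos / itr refers to an existing element.
template <typename Container>
void unstable_erase(Container &container, typename Container::size_type pos)
{
    // Skip the move when pos is already the last element (avoids self-move-assignment).
    if (pos + 1 != container.size())
        container[pos] = std::move(container.back());
    container.pop_back();
}

template <typename Container>
void unstable_erase(Container &container, typename Container::iterator itr)
{
    if (std::next(itr) != container.end())
        *itr = std::move(container.back());
    container.pop_back();
}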

I stole Ted Lyngmo's function name, it's better than what I had.
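
As a usage sketch, here is the index-based overload applied to a std::deque and a std::string. The overload is repeated, with a precondition check added, so that the snippet compiles on its own; demo_deque and demo_string are hypothetical names used only for this example.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <utility>

// Lightly hardened copy of the index-based overload from this answer.
template <typename Container>
void unstable_erase(Container& container, typename Container::size_type pos) {
    assert(pos < container.size());
    if (pos + 1 != container.size())
        container[pos] = std::move(container.back());
    container.pop_back();
}

// Hypothetical demo functions: the same call on two container types.
inline std::deque<int> demo_deque() {
    std::deque<int> d{10, 20, 30, 40};
    unstable_erase(d, 1);  // 40 takes the place of 20
    return d;              // {10, 40, 30}
}

inline std::string demo_string() {
    std::string s = "abcd";
    unstable_erase(s, 0);  // 'd' takes the place of 'a'
    return s;              // "dbc"
}
```

Any container that offers operator[], back() and pop_back() should work with this overload.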




Answer 3:


There isn't. Vectors are cache-friendly. If by traversal you mean access all elements, no other data structure will give as much speed as vector.

The reason is caching: when you access one element, the whole cache line containing it is loaded, so the neighbouring elements are cached as well, and the next ones you access will already be in the cache. This happens over and over during a traversal: one access misses the cache, and the following accesses come along as a bonus. How many elements that covers depends on the cache line size and the element size; a 64-byte line, for example, holds 16 four-byte ints.

If you use something like a std::set, whose elements are not stored contiguously, you will lose performance to a lot of cache misses.
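
If you want to measure this yourself, a rough sketch could look like the following. The Timing struct and the functions timed_sum and compare_traversal are hypothetical names invented for this example, and the numbers you get will depend on your machine, compiler and optimization level.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <numeric>
#include <set>
#include <vector>

// Hypothetical result type: the computed sum and the elapsed microseconds.
struct Timing {
    long long sum;
    long long micros;
};

// Hypothetical helper: traverses the container once and times the pass.
template <class Container>
Timing timed_sum(const Container& c) {
    auto start = std::chrono::steady_clock::now();
    long long sum = std::accumulate(c.begin(), c.end(), 0LL);
    auto stop = std::chrono::steady_clock::now();
    auto us = std::chrono::duration_cast<std::chrono::microseconds>(stop - start);
    return Timing{sum, static_cast<long long>(us.count())};
}

// Hypothetical demo: store the same n values contiguously (std::vector) and
// in a node-based container (std::set), then time one pass over each.
inline void compare_traversal(std::size_t n, Timing& vec_result, Timing& set_result) {
    std::vector<int> v(n);
    std::iota(v.begin(), v.end(), 0);
    std::set<int> s(v.begin(), v.end());
    vec_result = timed_sum(v);
    set_result = timed_sum(s);
    assert(vec_result.sum == set_result.sum);  // same data, same total
}
```

Compare the two micros fields for a large n; a single run like this is only an indication, and a proper benchmark would repeat the measurement.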




Answer 4:


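#include <vector>

int main() {
    std::vector<int> foo = {0, 1, 2, 3, 4};

    // Remove element foo[2] in constant time without preserving order:
    // overwrite it with the last element, then drop the last element.
    foo[2] = foo.back();
    foo.pop_back();
    // foo is now {0, 1, 4, 3}
}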



Answer 5:


If you know which value you need to delete, you can use a std::set and get logarithmic time complexity. First call s.lower_bound(x), which returns an iterator to the first element >= x in the set s; check that it is not s.end() and that the element actually equals x (s.find(x) does both in one step). Then call s.erase(it), where it is that iterator. The lookup is O(log n), where n is the size of the set, and erasing through an iterator is amortized constant, so the whole operation is O(log n).
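
The two steps can be sketched as follows. The helper name erase_value is hypothetical; note that calling s.erase(x) directly with the value does the same job in a single call.

```cpp
#include <cassert>
#include <set>

// Hypothetical helper: erases x from s if present.
// Returns true when an element was actually erased.
inline bool erase_value(std::set<int>& s, int x) {
    auto it = s.lower_bound(x);  // first element >= x, O(log n)
    if (it == s.end() || *it != x) {
        return false;            // x is not in the set
    }
    s.erase(it);                 // amortized constant through the iterator
    assert(s.count(x) == 0);
    return true;
}
```

Keep in mind that a std::set keeps its elements sorted and stores them in separate nodes, so this trades traversal speed for the faster lookup.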



Source: https://stackoverflow.com/questions/65893596/erasing-from-vector-by-swapping-to-the-end
