Determining the unique rows of a 2D array (vector<vector<T> >)

Happy的楠姐 2021-01-14 14:42

I am using a datatype of std::vector<std::vector<T> > to store a 2D matrix/array. I would like to determine the unique rows of this matrix. I am l

2 answers
  • 2021-01-14 15:18

    EDIT: I forgot that std::vector already defines operator< and operator==, so you do not even need the custom functors described further down:

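    #include <algorithm>
    #include <vector>

    // Takes the matrix by value so the caller's copy is left untouched.
    template <typename t>
    std::vector<std::vector<t> > GetUniqueRows(std::vector<std::vector<t> > input)
    {
        // Sorting makes equal rows adjacent; std::vector's operator< compares lexicographically.
        std::sort(input.begin(), input.end());
        // std::unique shifts the distinct rows to the front; erase trims the leftover tail.
        input.erase(std::unique(input.begin(), input.end()), input.end());
        return input;
    }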
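
    To see why the sort step in the function above matters, here is a minimal sketch that compares std::unique with and without a preceding std::sort. It assumes C++11 (for brace initialization); the names Matrix, UniqueWithoutSort, UniqueWithSort and DemoSortMatters, and the sample data, are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

typedef std::vector<std::vector<int> > Matrix;  // hypothetical alias

// std::unique alone: only runs of adjacent equal rows collapse.
Matrix UniqueWithoutSort(Matrix m)
{
    m.erase(std::unique(m.begin(), m.end()), m.end());
    return m;
}

// Sort first so equal rows become adjacent, then std::unique removes them all.
Matrix UniqueWithSort(Matrix m)
{
    std::sort(m.begin(), m.end());
    m.erase(std::unique(m.begin(), m.end()), m.end());
    return m;
}

// Hypothetical sample data; the asserts document the expected results.
void DemoSortMatters()
{
    const Matrix input = {{3, 4}, {1, 2}, {3, 4}, {3, 4}};

    // Without sorting, the first {3, 4} and the later ones are not adjacent,
    // so one duplicate survives.
    const Matrix unsorted_result = {{3, 4}, {1, 2}, {3, 4}};
    // With sorting, every duplicate is removed, but the row order changes.
    const Matrix sorted_result = {{1, 2}, {3, 4}};

    assert(UniqueWithoutSort(input) == unsorted_result);
    assert(UniqueWithSort(input) == sorted_result);
}
```

    Note that the sorted variant returns the rows in lexicographic order, not in their original order.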
    

    Use std::unique together with a custom functor that calls std::equal on the two vectors.

    std::unique only removes consecutive duplicates, so the input must be sorted first. Sort with a custom functor that calls std::lexicographical_compare on the two vectors. If you need the output in the original order, you will have to record that order somehow before sorting. The sort takes O(m*n log n) time (where m is the length of the inner vectors and n is the number of inner vectors), while the std::unique call takes O(m*n) time.

    For comparison, both of your existing approaches take O(m*n^2) time.
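
    One possible way to keep the original row order with the sort-based approach is to sort a vector of row indices instead of the rows themselves. The following is only a sketch under that assumption, written for C++11; GetUniqueRowsStable and DemoStable are hypothetical names, not part of the answer's code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Returns the unique rows of `input`, keeping the first occurrence of each
// row and preserving the original relative order.
template <typename T>
std::vector<std::vector<T>> GetUniqueRowsStable(const std::vector<std::vector<T>>& input)
{
    // Sort row indices rather than rows, so `input` itself is never reordered.
    std::vector<std::size_t> idx(input.size());
    std::iota(idx.begin(), idx.end(), std::size_t(0));

    // Order by row content; ties are broken by index so that the earliest
    // occurrence of each distinct row comes first within its group.
    std::sort(idx.begin(), idx.end(), [&](std::size_t a, std::size_t b) {
        if (input[a] < input[b]) return true;
        if (input[b] < input[a]) return false;
        return a < b;
    });

    // Keep only the first index of each group of equal rows.
    idx.erase(std::unique(idx.begin(), idx.end(),
                          [&](std::size_t a, std::size_t b) { return input[a] == input[b]; }),
              idx.end());

    // Sorting the surviving indices restores the original row order.
    std::sort(idx.begin(), idx.end());

    std::vector<std::vector<T>> result;
    result.reserve(idx.size());
    for (std::size_t i : idx) result.push_back(input[i]);
    return result;
}

// Hypothetical sample data; the assert documents the expected result.
void DemoStable()
{
    const std::vector<std::vector<int>> input = {{3, 4}, {1, 2}, {3, 4}, {5, 6}, {1, 2}};
    const std::vector<std::vector<int>> expected = {{3, 4}, {1, 2}, {5, 6}};
    assert(GetUniqueRowsStable(input) == expected);
}
```

    The extra index vector costs O(n) memory, and the asymptotic time stays at O(m*n log n) because the row comparisons still dominate.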

    EDIT: Example:

    #include <algorithm>
    #include <vector>

    // Equality predicate for std::unique: rows match if their sizes and all elements match.
    template <typename t>
    struct VectorEqual
    {
        bool operator()(const std::vector<t>& lhs, const std::vector<t>& rhs) const
        {
            if (lhs.size() != rhs.size()) return false;
            return std::equal(lhs.begin(), lhs.end(), rhs.begin());
        }
    };
    
    // Ordering predicate for std::sort: lexicographic comparison of two rows.
    template <typename t>
    struct VectorLess
    {
        bool operator()(const std::vector<t>& lhs, const std::vector<t>& rhs) const
        {
            return std::lexicographical_compare(lhs.begin(), lhs.end(), rhs.begin(), rhs.end());
        }
    };
    
    template <typename t>
    std::vector<std::vector<t> > GetUniqueRows(std::vector<std::vector<t> > input)
    {
        std::sort(input.begin(), input.end(), VectorLess<t>());
        input.erase(std::unique(input.begin(), input.end(), VectorEqual<t>()), input.end());
        return input;
    }
    

  • 2021-01-14 15:20

    You should also consider hashing: it preserves row order and can be faster than sort/unique (amortized O(m*n), plus the cost of copying the rows if the original must be kept). The difference is most noticeable for large matrices; on small matrices you are probably better off with Billy's solution, since it needs no additional memory allocation to keep track of the hashes.

    Anyway, taking advantage of Boost.Unordered, here's what you can do:

    #include <vector>
    #include <boost/foreach.hpp>
    #include <boost/ref.hpp>
    #include <boost/typeof/typeof.hpp>
    #include <boost/unordered_set.hpp>
    
    namespace boost {
      template< typename T >
      size_t hash_value(const boost::reference_wrapper< T >& v) {
        return boost::hash_value(v.get());
      }
      template< typename T >
      bool operator==(const boost::reference_wrapper< T >& lhs, const boost::reference_wrapper< T >& rhs) {
        return lhs.get() == rhs.get();
      }
    }
    
    // destructive, but fast if the original copy is no longer required
    template <typename T>
    void uniqueRows_inplace(std::vector<std::vector<T> >& A)
    {
      boost::unordered_set< boost::reference_wrapper< std::vector< T > const > > unique(A.size());
      for (BOOST_AUTO(it, A.begin()); it != A.end(); ) {
        if (unique.insert(boost::cref(*it)).second) {
          ++it;
        } else {
          it = A.erase(it); // erase() invalidates it; resume from the returned iterator
        }
      }
    }
    
    // returning a copy (extra copying cost)
    template <typename T>
    void uniqueRows_copy(const std::vector<std::vector<T> > &A,
                     std::vector< std::vector< T > > &ret)
    {
      ret.reserve(A.size());
      boost::unordered_set< boost::reference_wrapper< std::vector< T > const > > unique;
      BOOST_FOREACH(const std::vector< T >& row, A) {
        if (unique.insert(boost::cref(row)).second) {
          ret.push_back(row);
        }
      }
    }
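
    If Boost is not available, the same idea can be sketched with std::unordered_set, assuming C++11. The standard library provides no std::hash for std::vector, so the RowHash functor below is a hypothetical helper that mixes element hashes in the style of boost::hash_combine; RowEqual, UniqueRowsHashed and DemoHashed are likewise illustrative names.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_set>
#include <vector>

// Hypothetical hash over a row, accessed through a pointer to avoid copying it.
template <typename T>
struct RowHash {
    std::size_t operator()(const std::vector<T>* row) const {
        std::size_t seed = row->size();
        std::hash<T> h;
        for (const T& x : *row)
            seed ^= h(x) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
        return seed;
    }
};

// Compares the rows pointed to, not the pointer values.
template <typename T>
struct RowEqual {
    bool operator()(const std::vector<T>* a, const std::vector<T>* b) const {
        return *a == *b;
    }
};

// Returns the first occurrence of each distinct row, in original order.
template <typename T>
std::vector<std::vector<T>> UniqueRowsHashed(const std::vector<std::vector<T>>& A)
{
    // The set stores pointers into A, which stay valid because A is not modified.
    std::unordered_set<const std::vector<T>*, RowHash<T>, RowEqual<T>> seen;
    seen.reserve(A.size());

    std::vector<std::vector<T>> ret;
    ret.reserve(A.size());
    for (const std::vector<T>& row : A) {
        if (seen.insert(&row).second) ret.push_back(row);
    }
    return ret;
}

// Hypothetical sample data; the assert documents the expected result.
void DemoHashed()
{
    const std::vector<std::vector<int>> input = {{3, 4}, {1, 2}, {3, 4}, {5, 6}, {1, 2}};
    const std::vector<std::vector<int>> expected = {{3, 4}, {1, 2}, {5, 6}};
    assert(UniqueRowsHashed(input) == expected);
}
```

    This variant requires std::hash<T> to be defined for the element type, and its expected linear running time depends on the hash distributing rows reasonably well.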
    