Is it possible to write a generic variadic zipWith in C++?

傲寒 2021-02-06 09:03

I want a generic zipWith function in C++ of variable arity. I have two problems. The first is that I cannot determine the type of the function pointer passed to zipWith. It must

2 Answers
  • 2021-02-06 09:18

    Here is what I cobbled together:

    #include <iostream>
    #include <iterator>   // std::back_inserter, std::begin, std::end
    #include <utility>
    #include <vector>
    
    template<typename F, typename T, typename Arg>
    auto fold(F f, T&& t, Arg&& a) 
      -> decltype(f(std::forward<T>(t), std::forward<Arg>(a)))
    { return f(std::forward<T>(t), std::forward<Arg>(a)); }
    
    template<typename F, typename T, typename Head, typename... Args>
    auto fold(F f, T&& init, Head&& h, Args&&... args) 
      -> decltype(f(std::forward<T>(init), std::forward<Head>(h)))
    { 
      return fold(f, f(std::forward<T>(init), std::forward<Head>(h)), 
                  std::forward<Args>(args)...); 
    }
    
    // hack in a fold for void functions
    struct ignore {};
    
    // cannot be a lambda, needs to be polymorphic on the iterator type
    struct end_or {
      template<typename InputIterator>
      bool operator()(bool in, const std::pair<InputIterator, InputIterator>& p) 
        { return in || p.first == p.second; }
    };
    
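    // same same but different: advance the first iterator of each pair
    struct inc {
      template<typename InputIterator>
      ignore operator()(ignore, std::pair<InputIterator, InputIterator>& p) 
        { p.first++; return ignore(); }
    };
    
    // inc2 is used by transformV below but was missing from the listing;
    // presumably it advances a bare iterator like this
    struct inc2 {
      template<typename InputIterator>
      ignore operator()(ignore, InputIterator& it) 
        { ++it; return ignore(); }
    };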
    
    template<typename Fun, typename OutputIterator, 
             typename... InputIterators>
    void zipWith(Fun f, OutputIterator out, 
                 std::pair<InputIterators, InputIterators>... inputs) {
      if(fold(end_or(), false, inputs...)) return;
      while(!fold(end_or(), false, inputs...)) {
        *out++ = f( *(inputs.first)... );
        fold(inc(), ignore(), inputs...);
      }
    }
    
    template<typename Fun, typename OutputIterator, 
             typename InputIterator, typename... Rest>
    void transformV(Fun f, OutputIterator out, InputIterator begin, InputIterator end,
                    Rest... rest) 
    {
      if(begin == end) return ;
      while(begin != end) {
        *out++ = f(*begin, *(rest)... );
        fold(inc2(), ignore(), begin, rest...);
      }
    }
    
    struct ternary_plus {
      template<typename T, typename U, typename V>
      auto operator()(const T& t, const U& u, const V& v) 
        -> decltype( t + u + v) // common type? 
        { return t + u + v; }
    };
    
    int main()
    {
      using namespace std;
      vector<int> a = {1, 2, 3}, b = {1, 2}, c = {1, 2, 3};
      vector<int> out;
    
      zipWith(ternary_plus(), back_inserter(out)
              , make_pair(begin(a), end(a))
              , make_pair(begin(b), end(b))
              , make_pair(begin(c), end(c)));
    
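      // b is the shortest input, so it supplies the (begin, end) range;
      // the other sequences only need to be at least as long.
      transformV(ternary_plus(), back_inserter(out),
                 begin(b), end(b), begin(a), begin(c));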
    
      for(auto x : out) { 
        std::cout << x << std::endl;
      }
    
      return 0;
    }
    

    This is a slightly improved variant over previous versions. As every good program should, it starts by defining a left-fold.
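
    To make the left-fold concrete: fold(f, init, a, b, c) evaluates f(f(f(init, a), b), c). The sketch below repeats the two fold overloads from the listing above so that it compiles on its own; the subtract_ints functor and the fold_demo function are made up for this illustration only.

```cpp
#include <cassert>
#include <utility>

// Base case: one remaining argument.
template<typename F, typename T, typename Arg>
auto fold(F f, T&& t, Arg&& a)
  -> decltype(f(std::forward<T>(t), std::forward<Arg>(a)))
{ return f(std::forward<T>(t), std::forward<Arg>(a)); }

// Recursive case: combine init with the head, then fold the rest.
template<typename F, typename T, typename Head, typename... Args>
auto fold(F f, T&& init, Head&& h, Args&&... args)
  -> decltype(f(std::forward<T>(init), std::forward<Head>(h)))
{
  return fold(f, f(std::forward<T>(init), std::forward<Head>(h)),
              std::forward<Args>(args)...);
}

// Hypothetical functor, not part of the answer: subtraction is not
// associative, so it makes the left-to-right grouping visible.
struct subtract_ints {
  int operator()(int acc, int x) const { return acc - x; }
};

inline void fold_demo() {
  // ((10 - 1) - 2) - 3 == 4
  assert(fold(subtract_ints(), 10, 1, 2, 3) == 4);
}
```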

    It still does not solve the problem of iterators packed in pairs.

    In standard-library terms this function would be called transform: only the first sequence is given as a full (begin, end) range, and every other sequence is given by its begin iterator alone and must be at least as long. I called it transformV here to avoid name clashes.
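
    For comparison, the two-input overload of std::transform already follows that convention. In this small sketch the function name transform_demo and the sample values are illustrative only:

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <iterator>
#include <vector>

inline void transform_demo() {
  std::vector<int> a = {1, 2};        // full range: decides how many calls are made
  std::vector<int> b = {10, 20, 30};  // only begin() is passed; must not be shorter than a
  std::vector<int> out;

  std::transform(a.begin(), a.end(), b.begin(),
                 std::back_inserter(out), std::plus<int>());

  assert((out == std::vector<int>{11, 22}));
}
```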

  • 2021-02-06 09:32

    I had a long answer, then I changed my mind in a way that made the solution much shorter. But I'm going to show my thought process and give you both answers!

    My first step is to determine the proper signature. I don't understand all of it, but you can treat a parameter pack as a comma-separated list of the actual items with the text-dump hidden. You can extend the list on either side by more comma-separated items! So directly applying that:

    template <typename R, typename T, typename... Vargs>
    std::vector<R> zipWith (R func(T,Vargs...), std::vector<T> first, Vargs rest) {
       ???
    }
    

    You have to put "..." after a parameter pack wherever an expression should see the expanded list. You have to put one in the regular function-parameter portion, too:

    template <typename R, typename T, typename... Vargs>
    std::vector<R> zipWith (R func(T,Vargs...), std::vector<T> first, Vargs... rest) {
       ???
    }
    

    You said that your function parameters are a bunch of vectors. Here, you're hoping that each of Vargs is really a std::vector. Type transformations can be applied to a parameter pack, so why don't we ensure that you have vectors:

    template <typename R, typename T, typename... Vargs>
    std::vector<R> zipWith (R func(T,Vargs...), std::vector<T> first, std::vector<Vargs> ...rest) {
       ???
    }
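
    As a small, compilable illustration of that kind of expansion: in the sketch below, the pattern const std::vector<Ts>&... forces every argument to be a std::vector while still deducing each element type separately. The functions total_size and total_size_demo are made-up examples, not something from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// End of the recursion: no vectors left.
inline std::size_t total_size() { return 0; }

// `const std::vector<Ts>&... rest` expands to one parameter per pack element:
// const std::vector<T1>&, const std::vector<T2>&, ...
template <typename T, typename... Ts>
std::size_t total_size(const std::vector<T>& first,
                       const std::vector<Ts>&... rest) {
  return first.size() + total_size(rest...);  // `rest...` expands the arguments
}

inline void total_size_demo() {
  std::vector<int> a = {1, 2, 3};
  std::vector<std::string> b = {"x", "y"};
  assert(total_size(a, b) == 5);  // T = int, Ts = {std::string}
  // total_size(a, 42);           // would not compile: 42 is not a std::vector
}
```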
    

    Vectors can be huge objects, so let's use const l-value references. Also, we could use std::function so we can use lambda or std::bind expressions:

    template <typename R, typename T, typename... Vargs>
    std::vector<R> zipWith (std::function<R(T, Vargs...)> func, std::vector<T> const &first, std::vector<Vargs> const &...rest) {
       ???
    }
    

    (I ran into problems here from using std::pow for testing. My compiler wouldn't accept a classic function pointer being converted into a std::function object. So I had to wrap it in a lambda. Maybe I should ask here about that....)
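
    One likely explanation, offered as an assumption rather than a diagnosis of the exact error seen here: template argument deduction does not look through the conversion from a function pointer to std::function, and std::pow is additionally an overload set, so R cannot be deduced from it. Constructing the std::function explicitly sidesteps both problems. The helper call_binary below is hypothetical and much simpler than zipWith; it only shows the deduction behaviour.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Same shape as the zipWith signature above: R, A and B must be deduced
// from the std::function argument itself.
template <typename R, typename A, typename B>
R call_binary(std::function<R(A, B)> func, A a, B b) {
  return func(a, b);
}

inline void call_binary_demo() {
  // call_binary(std::pow, 2.0, 3.0);   // fails to compile: R cannot be deduced

  // Naming the signature up front means the argument already is a
  // std::function, so deduction matches it directly.
  std::function<double(double, double)> power =
      [](double x, double y) { return std::pow(x, y); };
  assert(call_binary(power, 2.0, 3.0) == 8.0);
}
```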

    At this point, I reloaded the page and saw one response (by pmr). I don't really understand this zipping, folding, exploding, whatever stuff, so I thought his/her solution was too complicated. So I thought about a more direct solution:

    template < typename R, typename T, typename ...MoreTs >
    std::vector<R>
    zip_with( std::function<R(T,MoreTs...)> func,
     const std::vector<T>& first, const std::vector<MoreTs>& ...rest )
    {
        auto const      tuples = rearrange_vectors( first, rest... );
        std::vector<R>  result;
    
        result.reserve( tuples.size() );
        for ( auto const &x : tuples )
            result.push_back( evaluate(x, func) );
        return result;
    }
    

    I would create a vector of tuples, where each tuple was made from plucking corresponding elements from each vector. Then I would create a vector of evaluation results from passing a tuple and func each time.
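
    For just two vectors, that first step can be sketched as follows. The name zip_pairs is made up for this illustration, and the two-input restriction is a simplification of what the variadic rearrange_vectors has to do.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>
#include <vector>

// Pluck element i from each input and pack the pair into one tuple.
// The result is as long as the shorter input.
template <typename A, typename B>
std::vector<std::tuple<A, B>> zip_pairs(const std::vector<A>& as,
                                        const std::vector<B>& bs) {
  const std::size_t n = std::min<std::size_t>(as.size(), bs.size());
  std::vector<std::tuple<A, B>> table;
  table.reserve(n);
  for (std::size_t i = 0; i < n; ++i)
    table.emplace_back(as[i], bs[i]);
  return table;
}

inline void zip_pairs_demo() {
  std::vector<int> ids = {1, 2, 3};
  std::vector<std::string> names = {"one", "two"};
  auto table = zip_pairs(ids, names);
  assert(table.size() == 2);
  assert(std::get<0>(table[1]) == 2 && std::get<1>(table[1]) == "two");
}
```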

    The rearrange_vectors function has to make a table of (default-constructed) values in advance and then fill in each entry one sub-object at a time:

    template < typename T, typename ...MoreTs >
    std::vector<std::tuple<T, MoreTs...>>
    rearrange_vectors( const std::vector<T>& first,
     const std::vector<MoreTs>& ...rest )
    {
        decltype(rearrange_vectors(first, rest...))
          result( first.size() );
    
        fill_vector_perpendicularly<0>( result, first, rest... );
        return result;
    }
    

    The first part of the first line lets the function name its own return type without copy-and-paste. The only caveat is that r-value reference parameters must be wrapped in std::forward (or std::move) so that an l-value overload of the recursive call doesn't get chosen by mistake. The function that mutates part of each tuple element has to explicitly take the current index. The index moves up by one during parameter pack peeling:

    template < std::size_t, typename ...U >
    void  fill_vector_perpendicularly( std::vector<std::tuple<U...>>& )
    { }
    
    template < std::size_t I, class Seq, class ...MoreSeqs, typename ...U >
    void  fill_vector_perpendicularly( std::vector<std::tuple<U...>>&
     table, const Seq& first, const MoreSeqs& ...rest )
    {
        auto        t = table.begin();
        auto const  te = table.end();
    
        for ( auto  f = first.begin(), fe = first.end();
              (te != t) && (fe != f); ++t, ++f )
            std::get<I>( *t ) = *f;
        table.erase( t, te );
        fill_vector_perpendicularly<I + 1u>( table, rest... );
    }
    

    The table is as long as the shortest input vector, so we have to trim the table whenever the current input vector ends first. (I wish I could mark fe as const within the for block.) I originally had first and rest as std::vector, but I realized I could abstract that out; all I need are types that match the standard (sequence) containers in iteration interface. But now I'm stumped on evaluate:

    template < typename R, typename T, typename ...MoreTs >
    R  evaluate( const std::tuple<T, MoreTs...>& x,
     std::function<R(T,MoreTs...)> func )
    {
         //???
    }
    

    I can do individual cases:

    template < typename R >
    R  evaluate( const std::tuple<>& x, std::function<R()> func )
    { return func(); }
    
    template < typename R, typename T >
    R  evaluate( const std::tuple<T>& x, std::function<R(T)> func )
    { return func( std::get<0>(x) ); }
    

    but I can't generalize it for a recursive case. IIUC, std::tuple doesn't support peeling off the tail (and/or head) as a sub-tuple. Nor does std::bind support currying arguments into a function piecemeal, and its placeholder system isn't compatible with arbitrary-length parameter packs. I wish I could just list each parameter like I could if I had access to the original input vectors....

    ...Wait, why don't I do just that?!...

    ...Well, I never heard of it. I've seen transferring a template parameter pack to the function parameters; I just showed it in zipWith. Can I do it from the function parameter list to the function's internals? (As I'm writing, I now remember seeing it in the member-initialization part of class constructors, for non-static members that are arrays or class types.) Only one way to find out:

    template < typename R, typename T, typename ...MoreTs >
    std::vector<R>
    zip_with( std::function<R(T,MoreTs...)> func, const std::vector<T>&
     first, const std::vector<MoreTs>& ...rest )
    {
        auto const  s = minimum_common_size( first, rest... );
        decltype(zip_with(func,first,rest...))         result;
    
        result.reserve( s );
        for ( std::size_t  i = 0 ; i < s ; ++i )
            result.push_back( func(first[i], rest[i]...) );
        return result;
    }
    

    where I'm forced to compute the total number of calls beforehand:

    inline  std::size_t minimum_common_size()  { return 0u; }
    
    template < class SizedSequence >
    std::size_t  minimum_common_size( const SizedSequence& first )
    { return first.size(); }
    
    template < class Seq, class ...MoreSeqs >
    std::size_t
    minimum_common_size( const Seq& first, const MoreSeqs& ...rest )
    { return std::min( first.size(), minimum_common_size(rest...) ); }
    

    and sure enough, it worked! Of course, this meant that I over-thought the problem just as badly as the other respondent (in a different way). It also means that I unnecessarily bored you with most of this post. As I wrapped this up, I realized that the replacement of std::vector with generic sequence-container types can be applied in zip_with. And I realized that I could reduce the mandatory one vector to no mandatory vectors:

    template < typename R, typename ...T, class ...SizedSequences >
    std::vector<R>
    zip_with( R func(T...) /*std::function<R(T...)> func*/,
     SizedSequences const& ...containers )
    {
        static_assert( sizeof...(T) == sizeof...(SizedSequences),
         "The input and processing lengths don't match." );
    
        auto const  s = minimum_common_size( containers... );
        decltype( zip_with(func, containers...) )     result;
    
        result.reserve( s );
        for ( std::size_t  i = 0 ; i < s ; ++i )
            result.push_back( func(containers[i]...) );
        return result;
    }
    

    I added the static_assert as I copied the code here, since I forgot to make sure that the func's argument count and the number of input vectors agree. Now I realize that I can fix the dueling function-pointer vs. std::function object by abstracting both away:

    template < typename R, typename Func, class ...SizedSequences >
    std::vector<R>
    zip_with( Func&& func, SizedSequences&& ...containers )
    {
        auto const     s = minimum_common_size( containers... );
        decltype( zip_with<R>(std::forward<Func>(func),
         std::forward<SizedSequences>(containers)...) )  result;
    
        result.reserve( s );
        for ( std::size_t  i = 0 ; i < s ; ++i )
            result.push_back( func(containers[i]...) );
        return result;
    }
    

    Declaring a function parameter as an r-value reference to a deduced template type (T&&) is the universal passing method. It binds to every kind of reference and const/volatile (cv) qualification. That's why I switched containers to it. The func can have any structure; it can even be a class object with several overloads of operator(). Because the containers are passed this way, they keep their original cv-qualification when elements are accessed, and the function can use that for overload resolution. The recursive "call" that internally determines the result type needs std::forward to prevent any "downgrades" to l-value references. It also reveals a flaw in this iteration: I must provide the return type explicitly, because R appears nowhere in the parameter list and so cannot be deduced.
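
    A small sketch of how such a parameter behaves, under the usual reference-collapsing rules: for a parameter declared as T&&, T is deduced as an l-value reference type when the argument is an l-value and as a plain type when it is an r-value. The helpers bound_to_lvalue and bound_to_const are hypothetical and exist only to make the deduced type observable.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

// T&& with a deduced T accepts both l-values and r-values.
// Returns true when the argument was an l-value (T deduced as U& or const U&).
template <typename T>
bool bound_to_lvalue(T&&) {
  return std::is_lvalue_reference<T>::value;
}

// Reports whether the const qualifier of the argument was preserved in T.
template <typename T>
bool bound_to_const(T&&) {
  return std::is_const<typename std::remove_reference<T>::type>::value;
}

inline void forwarding_demo() {
  std::vector<int> v = {1, 2, 3};
  const std::vector<int> cv = {4, 5};

  assert(bound_to_lvalue(v));                     // T = std::vector<int>&
  assert(bound_to_lvalue(cv));                    // T = const std::vector<int>&
  assert(!bound_to_lvalue(std::vector<int>(1)));  // T = std::vector<int>
  assert(!bound_to_lvalue(std::move(v)));         // T = std::vector<int>
  assert(bound_to_const(cv));                     // cv-qualification is kept
  assert(!bound_to_const(v));
}
```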

    I'll fix that, but first I want to explain the STL way. You do not pre-determine a specific container type and return that to the user. You ask for a special object, an output-iterator, that you send the results to. The iterator could be connected to a container, of which the standard provides several varieties. It could be connected to an output stream instead, directly printing the results! The iterator method also relieves me from directly worrying about memory concerns.

    #include <algorithm>
    #include <cstddef>
    #include <iterator>
    #include <utility>
    #include <vector>
    
    inline  std::size_t minimum_common_size()  { return 0u; }
    
    template < class SizedSequence >
    std::size_t  minimum_common_size( const SizedSequence& first )
    { return first.size(); }
    
    template < class Seq, class ...MoreSeqs >
    std::size_t  minimum_common_size( const Seq& first,
     const MoreSeqs& ...rest )
    {
        return std::min<std::size_t>( first.size(),
         minimum_common_size(rest...) );
    }
    
    template < typename OutIter, typename Func, class ...SizedSequences >
    OutIter
    zip_with( OutIter o, Func&& func, SizedSequences&& ...containers )
    {
        auto const  s = minimum_common_size( containers... );
    
        for ( std::size_t  i = 0 ; i < s ; ++i )
            *o++ = func( containers[i]... );
        return o;
    }
    
    template < typename Func, class ...SizedSequences >
    auto  zipWith( Func&& func, SizedSequences&& ...containers )
     -> std::vector<decltype( func(containers.front()...) )>
    {
        using std::forward;
    
        decltype( zipWith(forward<Func>( func ), forward<SizedSequences>(
         containers )...) )  result;
    #if 1
        // `std::vector` is the only standard container with the `reserve`
        // member function.  Using it saves time when doing multiple small
        // inserts, since you'll do reallocation at most (hopefully) once.
        // The cost is that `s` is already computed within `zip_with`, but
        // we can't get at it.  (Remember that most container types
        // wouldn't need it.)  Change the preprocessor flag to change the
        // trade-off.
        result.reserve( minimum_common_size(containers...) );
    #endif
        zip_with( std::back_inserter(result), forward<Func>(func),
         forward<SizedSequences>(containers)... );
        return result;
    }
    

    I copied minimum_common_size here, but explicitly specified the result type (std::min<std::size_t>) in the recursive case, to guard against different container types using different size types.

    Functions taking an output-iterator usually return the iterator after all the writes are done. This lets you start a new output run (even with a different output function) where you left off. It's not critical for the standard output iterators, since they're all pseudo-iterators. It is important when using a forward-iterator (or above) as an output iterator, since those do track position. (Using a forward iterator as an output one is safe as long as the maximum number of transfers doesn't exceed the remaining iteration space.) Some functions put the output iterator at the end of the parameter list, others at the beginning; zip_with must use the latter since parameter packs have to go at the end.

    Moving to a suffix return type in zipWith makes every part of the function's signature fair game when computing the return type expression. It also lets me know right away if the computation can't be done due to incompatibilities at compile-time. The std::back_inserter function returns a special output-iterator to the vector that adds elements via the push_back member function.
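
    To see the output-iterator version in use, here is a sketch that repeats minimum_common_size and zip_with from the listing above so that it compiles on its own. The function zip_with_demo, the lambdas and the sample data are illustrative. It writes one run into a pre-sized vector through a plain vector iterator, continues a second run from the returned iterator, and then sends a third run straight to a stream.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

inline std::size_t minimum_common_size() { return 0u; }

template <class SizedSequence>
std::size_t minimum_common_size(const SizedSequence& first)
{ return first.size(); }

template <class Seq, class... MoreSeqs>
std::size_t minimum_common_size(const Seq& first, const MoreSeqs&... rest)
{ return std::min<std::size_t>(first.size(), minimum_common_size(rest...)); }

template <typename OutIter, typename Func, class... SizedSequences>
OutIter zip_with(OutIter o, Func&& func, SizedSequences&&... containers)
{
    auto const s = minimum_common_size(containers...);
    for (std::size_t i = 0; i < s; ++i)
        *o++ = func(containers[i]...);
    return o;
}

inline void zip_with_demo() {
    std::vector<int> a = {1, 2, 3};
    std::vector<int> b = {10, 20};   // shortest input: two calls per run

    auto add = [](int x, int y) { return x + y; };
    auto mul = [](int x, int y) { return x * y; };

    // Two runs into one pre-sized vector; the second starts where the
    // first stopped, using the iterator that zip_with returned.
    std::vector<int> out(4);
    auto mid = zip_with(out.begin(), add, a, b);
    auto last = zip_with(mid, mul, a, b);
    assert(last == out.end());
    assert((out == std::vector<int>{11, 22, 10, 40}));

    // The same algorithm can print directly through a stream iterator.
    std::ostringstream text;
    zip_with(std::ostream_iterator<int>(text, " "), add, a, b);
    assert(text.str() == "11 22 ");
}
```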
