Constructing a Data Frame in Rcpp

名媛妹妹 2020-12-14 23:20

I want to construct a data frame in an Rcpp function, but when I get it, it doesn't really look like a data frame. I've tried pushing vectors etc., but it leads to the same result.

4 Answers
  • 2020-12-14 23:49

    I concur with joran. The output of a C function called from within R through the `.C()` interface is a list of all its arguments, both "in" and "out", so each "column" of the data frame can be represented in the C function call as an argument. Once the result of the C function call is in R, all that remains is to extract those list elements using list indexing and give them the appropriate names.
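
    A minimal sketch of that idea, assuming a hypothetical routine named `make_columns` (the name and the values it fills in are made up for illustration). Routines called through `.C()` return `void`, take every argument as a pointer, and write their results into the "out" arguments:

```cpp
#include <cassert>

// Hypothetical routine for R's .C() interface: void return type, every
// argument a pointer. "n" is the input; "x" and "y" are the output columns,
// which R allocates before the call.
extern "C" void make_columns(int *n, double *x, double *y) {
    assert(*n >= 0);
    for (int i = 0; i < *n; ++i) {
        x[i] = i + 1;         // first column: 1, 2, ..., n
        y[i] = x[i] * x[i];   // second column: the squares of the first
    }
}
```

    On the R side, once the compiled library has been loaded, something along the lines of `res <- .C("make_columns", n = 3L, x = double(3), y = double(3))` should return a list with elements `n`, `x` and `y`, from which `data.frame(x = res$x, y = res$y)` builds the data frame.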

  • 2020-12-15 00:01

    Using the information from @baptiste's answer, this is what finally gives a well-formed data frame:

    #include <Rcpp.h>

    RcppExport SEXP makeDataFrame(SEXP in) {
        Rcpp::DataFrame dfin(in);
        Rcpp::DataFrame dfout;
        Rcpp::CharacterVector namevec;
        std::string namestem = "Column Heading ";
        // Copy the first two input columns and name them "... a" and "... b".
        for (int i = 0; i < 2; i++) {
            dfout.push_back(dfin(i));
            namevec.push_back(namestem + std::string(1, (char)(((int)'a') + i)));
        }
        dfout.attr("names") = namevec;
        // Pass the result through R's as.data.frame() to get a well-formed data frame.
        Rcpp::DataFrame x;
        Rcpp::Language call("as.data.frame", dfout);
        x = call.eval();
        return x;
    }
    

    I think the point remains that this might be inefficient because of push_back (as suggested by @Dirk) and the additional Language call evaluation. I looked through the Rcpp unit tests and haven't been able to come up with something better yet. Does anybody have any ideas?

    Update:

    Using @Dirk's suggestions (thanks!), this seems to be a simpler, more efficient solution:

    #include <Rcpp.h>

    RcppExport SEXP makeDataFrame(SEXP in) {
        Rcpp::DataFrame dfin(in);
        // Preallocate a list with one slot per input column.
        Rcpp::List myList(dfin.length());
        Rcpp::CharacterVector namevec;
        std::string namestem = "Column Heading ";
        for (int i = 0; i < dfin.length(); i++) {
            myList[i] = dfin(i); // adding vectors
            namevec.push_back(namestem + std::string(1, (char)(((int)'a') + i))); // making up column names
        }
        myList.attr("names") = namevec;
        // Build the data frame once, from the fully populated named list.
        Rcpp::DataFrame dfout(myList);
        return dfout;
    }
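
    One caveat with the made-up column names above: the `'a' + i` trick runs out of lowercase letters after 26 columns. A possible alternative, sketched here in plain C++ with a hypothetical helper called `make_column_names`, is to use a numeric suffix instead. As far as I know, `Rcpp::wrap()` converts a `std::vector<std::string>` to an R character vector, so the result should be usable for the `names` attribute, but treat that as an assumption to verify:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not part of Rcpp): builds "<stem>1", "<stem>2", ...
// so the naming scheme keeps working for any number of columns.
std::vector<std::string> make_column_names(const std::string &stem, int n) {
    assert(n >= 0);
    std::vector<std::string> names;
    names.reserve(n);  // allocate once instead of growing step by step
    for (int i = 0; i < n; ++i) {
        names.push_back(stem + std::to_string(i + 1));
    }
    return names;
}
```

    For example, `make_column_names("Column Heading ", 30)` yields "Column Heading 1" through "Column Heading 30".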
    
  • 2020-12-15 00:08

    It seems Rcpp can return a proper data.frame, provided you supply the names explicitly. I'm not sure how to adapt this to your example with arbitrary names.

    mkdf <- '
        Rcpp::DataFrame dfin(input);
        Rcpp::DataFrame dfout;
        for (int i=0;i<dfin.length();i++) {
            dfout.push_back(dfin(i));
        }

        // Indices are 0-based, so dfout(1) and dfout(2) are the second and third columns.
        return Rcpp::DataFrame::create( Named("x")= dfout(1), Named("y") = dfout(2));
    '
    library(inline)
    test <- cxxfunction( signature(input="data.frame"),
                                  mkdf, plugin="Rcpp")

    test(input=head(iris))
    
  • 2020-12-15 00:15

    Briefly:

    • DataFrames are indeed just like lists with the added restriction of having to have a common length, so they are best constructed column by column.

    • The best way is often to look at our unit tests. Here, inst/unitTests/runit.DataFrame.R groups the tests for the DataFrame class.

    • You also found the .push_back() member function in Rcpp, which we added for convenience and by analogy with the STL. We do warn that it is not recommended: because of differences in the way R objects are constructed, we essentially always need to make full copies, so .push_back() is not very efficient.

    • Despite me answering here frequently, the rcpp-devel list is a better place for Rcpp questions.
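
    To make the cost of the full copies concrete, here is a toy model in plain C++. It does not use Rcpp or R internals; it only assumes, as described above, that every append allocates a new object and copies all existing elements into it. The function names are hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: counts element copies when each append reallocates the
// whole buffer and copies the old contents, then adds the new element.
std::size_t copies_when_growing(std::size_t n) {
    std::vector<int> data;
    std::size_t copies = 0;
    for (std::size_t i = 0; i < n; ++i) {
        std::vector<int> bigger(data.size() + 1);  // fresh allocation
        for (std::size_t j = 0; j < data.size(); ++j) {
            bigger[j] = data[j];                   // full copy of the old data
            ++copies;
        }
        bigger[data.size()] = static_cast<int>(i); // the appended element
        data.swap(bigger);
    }
    assert(data.size() == n);
    return copies;
}

// Same result built into a buffer sized up front: no old elements are copied.
std::size_t copies_when_preallocated(std::size_t n) {
    std::vector<int> data(n);
    std::size_t copies = 0;
    for (std::size_t i = 0; i < n; ++i) {
        data[i] = static_cast<int>(i);             // written in place
    }
    assert(data.size() == n);
    return copies;
}
```

    In this model, appending n elements one at a time costs n(n-1)/2 element copies, while filling a preallocated container costs none, which is the motivation for sizing an `Rcpp::List` up front and filling it by index.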
