Pivoting rows into columns

我寻月下人不归 2021-02-05 15:57

Suppose (to simplify) I have a table containing some control vs. treatment data:

Which, Color, Response, Count
Control, Red, 2, 10
Control, Blue, 3, 20
Treatment         


        
4 answers
  •  被撕碎了的回忆
    2021-02-05 16:18

    Reshape does indeed work for pivoting a skinny data frame (e.g., from a simple SQL query) into a wide matrix, and it is very flexible, but it is slow, and on large amounts of data it is very slow. Fortunately, if you only need to pivot to a fixed shape, it is fairly easy to write a small C function that does the pivot quickly.

    In my case, pivoting a skinny data frame with 3 columns and 672,338 rows took 34 seconds with reshape, 25 seconds with my R code, and 2.3 seconds with C. Ironically, the C implementation was probably easier to write than my (tuned for speed) R implementation.

    Here's the core C code for pivoting floating point numbers. Note that it assumes that you have already allocated a correctly sized result matrix in R before calling the C code, which causes the R-devel folks to shudder in horror:

    #include <R.h>
    #include <Rinternals.h>
    /* 
     * This mutates the result matrix in place.
     */
    SEXP
    dtk_pivot_skinny_to_wide(SEXP n_row, SEXP vi_1, SEXP vi_2, SEXP v_3, SEXP result)
    {
       int ii, max_i;
       unsigned int pos;
       int nr = *INTEGER(n_row);
       int * aa = INTEGER(vi_1);
       int * bb = INTEGER(vi_2);
       double * cc = REAL(v_3);
       double * rr = REAL(result);
       max_i = length(vi_2);
       /*
        * R stores matrices by column.  Do ugly pointer-like arithmetic to
        * map the matrix to a flat vector.  We are translating this R code:
        *    for (ii in 1:length(vi.2))
        *       result[((n.row * (vi.2[ii] -1)) + vi.1[ii])] <- v.3[ii]
        */
       for (ii = 0; ii < max_i; ++ii) {
          pos = ((nr * (bb[ii] -1)) + aa[ii] -1);
          rr[pos] = cc[ii];
          /* printf("ii: %d \t value: %g \t result index:  %d \t new value: %g\n", ii, cc[ii], pos, rr[pos]); */
       }
       return(result);
    }
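
The index arithmetic in the loop above can be tried outside R. What follows is only a minimal sketch under stated assumptions, not the answer's code: the function name pivot_skinny_to_wide, its plain-array parameters, and the bounds check are all hypothetical additions for illustration. It keeps the same column-major mapping, where the 1-based element (row, col) of a matrix with n_row rows lives at flat offset (col - 1) * n_row + (row - 1).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical standalone helper (not part of the R API or the answer above).
 * Scatters n (row, col, value) triples into a column-major matrix `out`
 * holding n_row * n_col doubles. Indices are 1-based, as they are in R.
 * Triples whose indices fall outside the matrix are skipped rather than
 * written out of bounds. Returns the number of triples written.
 */
size_t pivot_skinny_to_wide(const int *row, const int *col, const double *val,
                            size_t n, int n_row, int n_col, double *out)
{
    size_t written = 0;

    assert(n_row > 0 && n_col > 0);

    for (size_t i = 0; i < n; ++i) {
        if (row[i] < 1 || row[i] > n_row || col[i] < 1 || col[i] > n_col)
            continue; /* index outside the preallocated matrix */

        /* Column-major layout: whole columns are stored one after another. */
        size_t pos = (size_t)(col[i] - 1) * (size_t)n_row
                   + (size_t)(row[i] - 1);
        out[pos] = val[i];
        ++written;
    }
    return written;
}
```

For a 3 x 2 result, the triple (row 1, col 2) lands at offset 3 and (row 3, col 2) at offset 5. Cells that no triple touches keep whatever the caller put there, which is why the result matrix has to be allocated and initialized before the call.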
    
