How can I mutate in dplyr without losing order?

醉梦人生 2021-01-18 03:46

Using data.table I can do the following:

library(data.table)
dt = data.table(a = 1:2, b = c(1,2,NA,NA))
#   a  b
#1: 1  1
#2: 2  2
#3: 1 NA
#4: 2 NA


        
1 Answer
  • 2021-01-18 04:36

    In the current development version of dplyr (which will eventually become dplyr 0.2) the behaviour differs between data frames and data tables:

    library(dplyr)
    library(data.table)
    
    df <- data.frame(a = 1:2, b = c(1,2,NA,NA))
    dt <- data.table(df)
    
    df %.% group_by(a) %.% mutate(b = b[1])
    
    ## Source: local data frame [4 x 2]
    ## Groups: a
    ## 
    ##   a b
    ## 1 1 1
    ## 2 2 2
    ## 3 1 1
    ## 4 2 2
    
    dt %.% group_by(a) %.% mutate(b = b[1])
    
    ## Source: local data table [4 x 2]
    ## Groups: a
    ## 
    ##   a b
    ## 1 1 1
    ## 2 1 1
    ## 3 2 2
    ## 4 2 2
    

    This happens because group_by() applied to a data.table automatically calls setkey(), which sorts the rows by the grouping column, on the assumption that the index will make future operations faster.
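
    The reordering can be reproduced with data.table alone. The following is a minimal sketch, not the dplyr internals: it assumes that an explicit setkey() call stands in for what group_by() does on a data table, and the names keyed and unkeyed are only illustrative. It also shows that a grouped := assignment, which sets no key, leaves the rows where they were.

    ```r
    library(data.table)

    # Same values as the example above, with the 'a' column written out in full
    dt <- data.table(a = c(1L, 2L, 1L, 2L), b = c(1, 2, NA, NA))

    # setkey() sorts the table by the key column, by reference
    keyed <- copy(dt)
    setkey(keyed, a)
    keyed$a    # rows are now ordered by 'a'

    # A grouped := assignment does not set a key, so the row order is untouched
    unkeyed <- copy(dt)
    unkeyed[, b := b[1], by = a]
    unkeyed$a  # same order as in dt
    ```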

    If there's a strong feeling that this is a bad default, I'm happy to change it.
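
    If the original row order matters while the table is being keyed, one possible workaround is to record each row's position first and sort back on it afterwards. This is only a sketch under assumptions: row_id is a hypothetical helper column, and the explicit setkey() again stands in for the keying done by group_by().

    ```r
    library(data.table)

    dt <- data.table(a = c(1L, 2L, 1L, 2L), b = c(1, 2, NA, NA))

    dt[, row_id := .I]        # hypothetical helper column: original row position
    setkey(dt, a)             # stands in for the keying done by group_by()
    dt[, b := b[1], by = a]   # grouped mutate: first value of b within each group
    setorder(dt, row_id)      # sort back into the original row order
    dt[, row_id := NULL]      # drop the helper column
    dt
    ```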
