Selecting a subset of columns in a data.table

遇见更好的自我 2020-12-01 09:15

I'd like to print all the columns of a data.table dt except one of them, named V3, and I want to refer to it by name rather than by number. This is the

4 Answers
  • 2020-12-01 09:52

    From version 1.12.0 onwards, it is also possible to select columns using regular expressions on their names:

    iris_DT <- as.data.table(iris)
    
    iris_DT[, .SD, .SDcols = patterns(".e.al")]
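
    As a rough sketch of how this could apply to the question, assume a small made-up table dt with columns V1 to V4 (not the asker's real data). The regular expression lists the columns to keep, so V3 is left out by name:

    ```r
    library(data.table)

    # Hypothetical example table; the asker's real dt is not shown in the thread
    dt <- data.table(V1 = 1:3, V2 = 4:6, V3 = 7:9, V4 = 10:12)

    # Keep only the columns whose full name is V1, V2 or V4
    res <- dt[, .SD, .SDcols = patterns("^V[124]$")]

    names(res)
    # "V1" "V2" "V4"
    ```

    The ^ and $ anchors keep the pattern from matching longer names such as V10.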
    
  • 2020-12-01 09:53

    Edit 2019-09-27 with a more modern approach

    You can do this with patterns as mentioned above; or, you can do it with ! if there's a vector of names already:

    dt[ , !'V3']
    # or
    drop_cols = 'V3'
    dt[ , !..drop_cols]
    

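    The .. prefix means "look up one level": drop_cols is resolved in the calling scope rather than being treated as a column of dt.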
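
    To illustrate the "look up one level" behaviour, here is a minimal sketch. The table dt and the helper drop_columns are made up for illustration. Inside the helper, ..drop_cols picks up the function's own argument, because that is the scope from which [ is called:

    ```r
    library(data.table)

    # Hypothetical example table
    dt <- data.table(V1 = 1:2, V2 = 3:4, V3 = 5:6)

    # Hypothetical helper: drop_cols is a local variable, not a column of x
    drop_columns <- function(x, drop_cols) {
      x[, !..drop_cols]
    }

    res <- drop_columns(dt, "V3")

    names(res)
    # "V1" "V2"
    ```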


    Older version using with=FALSE (data.table is moving away from this argument steadily)

    Here's a way that uses grep to convert the column name to a numeric position, which allows negative column indexing:

    dt[, -grep("^V3$", names(dt)), with=FALSE]
    

    You did say "V3" was to be excluded, right?
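
    A possible variation on the grep idea, shown here only as a sketch on a made-up table: ask grep for the names that do not match (invert = TRUE, value = TRUE) and select those, instead of negating positions. The extra column V30 is there to show why the ^ and $ anchors matter:

    ```r
    library(data.table)

    # Hypothetical example table; V30 is included on purpose
    dt <- data.table(V1 = 1:2, V2 = 3:4, V3 = 5:6, V30 = 7:8)

    # Names of all columns that are NOT exactly "V3"
    keep <- grep("^V3$", names(dt), invert = TRUE, value = TRUE)

    res <- dt[, keep, with = FALSE]

    names(res)
    # "V1" "V2" "V30"
    ```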

  • 2020-12-01 09:55

    Use a very similar syntax as for a data.frame, but add the argument with=FALSE:

    dt[, setdiff(colnames(dt),"V9"), with=FALSE]
        V1 V2 V3 V4 V5 V6 V7 V8 V10
     1:  1  1  1  1  1  1  1  1   1
     2:  0  0  0  0  0  0  0  0   0
     3:  1  1  1  1  1  1  1  1   1
     4:  0  0  0  0  0  0  0  0   0
     5:  0  0  0  0  0  0  0  0   0
     6:  1  1  1  1  1  1  1  1   1
    

    The use of with=FALSE is nicely explained in the documentation for the j argument in ?data.table:

    j: A single column name, single expression of column names, list() of expressions of column names, an expression or function call that evaluates to list (including data.frame and data.table which are lists, too), or (when with=FALSE) same as j in [.data.frame.


    From v1.10.2 onwards it is also possible to do this as follows:

    keep <- setdiff(names(dt), "V9")
    dt[, ..keep]
    

    Prefixing a symbol with .. makes data.table look it up in the calling scope (for example, the Global Environment) and take its value as column names or numbers.
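
    Since the value may hold column names or numbers, both forms can be tried side by side. This is a small sketch on a made-up table; keep_names and keep_idx are illustrative variable names:

    ```r
    library(data.table)

    # Hypothetical example table
    dt <- data.table(V1 = 1:2, V8 = 3:4, V9 = 5:6, V10 = 7:8)

    keep_names <- setdiff(names(dt), "V9")   # character vector of names
    keep_idx   <- which(names(dt) != "V9")   # integer vector of positions

    by_name <- dt[, ..keep_names]
    by_idx  <- dt[, ..keep_idx]

    names(by_name)
    # "V1" "V8" "V10"
    ```

    Both calls select the same three columns here; the name-based form does not depend on the column order.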

  • 2020-12-01 10:01

    Maybe it's only in recent versions of data.table (I'm using 1.9.6), but you can do:

    dt[, -'V3']
    

    For several columns:

    dt[, -c('V3', 'V9')]
    

    Note that the quotes around the column names are necessary. Also, if your column names are stored in a variable, say cols, you'll need to do dt[, -cols, with=FALSE].
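
    For the case where the names live in a variable, a short sketch on a made-up table compares the with=FALSE form with the .. prefix described in the other answers (the latter needs a newer data.table than 1.9.6):

    ```r
    library(data.table)

    # Hypothetical example table
    dt <- data.table(V1 = 1:2, V3 = 3:4, V5 = 5:6, V9 = 7:8)

    cols <- c("V3", "V9")

    res_with   <- dt[, -cols, with = FALSE]  # form described in this answer
    res_dotdot <- dt[, !..cols]              # newer .. prefix form

    names(res_with)
    # "V1" "V5"
    ```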
