Adding a new column to each element in a list of tables or data frames

没有蜡笔的小新 2020-11-27 03:51

I have a list of files. I also have a list of "names" which I substr() from the actual filenames of these files. I would like to add a new column to each of t

6 Answers
  • 2020-11-27 03:59

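    A tricky way, which also stacks the list into a single data frame (ldply adds an .id column holding the list names):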

    library(plyr)
    
    names(filelist) <- ID
    result <- ldply(filelist, data.frame)
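
    For readers without plyr, roughly the same stacked result can be sketched in base R. The sample data below are hypothetical stand-ins for the question's filelist and ID, and the column name SampleID is an assumption rather than something plyr produces:

    ```r
    # Hypothetical stand-ins for the question's `filelist` and `ID`.
    filelist <- list(
      data.frame(x = 1:3, y = c("a", "b", "c")),
      data.frame(x = 4:6, y = c("d", "e", "f"))
    )
    ID <- c("1A", "IB")

    # Tag each data frame with its ID (recycled to every row),
    # then stack the tagged data frames row-wise.
    tagged <- Map(function(df, id) cbind(SampleID = id, df), filelist, ID)
    result <- do.call(rbind, tagged)
    rownames(result) <- NULL
    ```

    Unlike the list-returning answers, this yields one combined data frame, which may or may not be what the question wants.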
    
  • 2020-11-27 04:03
    data_lst <- list(
      data_1 = data.frame(c1 = 1:3, c2 = 3:1),
      data_2 = data.frame(c1 = 1:3, c2 = 3:1)
    )
    
    f <- function (data, name){
      data$name <- name
      data
    }
    
    Map(f, data_lst , names(data_lst)) 
    
  • 2020-11-27 04:10

    The purrr way, using map2

    library(dplyr)
    library(purrr)
    
    map2(filelist, ID, ~cbind(.x, SampleID = .y))
    
    #[[1]]
    #  x y SampleID
    #1 1 a       1A
    #2 2 b       1A
    #3 3 c       1A
    
    #[[2]]
    #  x y SampleID
    #1 4 d       IB
    #2 5 e       IB
    #3 6 f       IB
    

    Or can also use

    map2(filelist, ID, ~.x %>% mutate(SampleID = .y))
    

    If you name the list, you can use imap and add the new column based on each element's name.

    names(filelist) <- c("1A","IB")
    imap(filelist, ~cbind(.x, SampleID = .y))
    #OR
    #imap(filelist, ~.x %>% mutate(SampleID = .y))
    

    which is similar to using Map

    Map(cbind, filelist, SampleID = names(filelist))
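
    As a self-contained check of the named-list variant in base R, here is a minimal sketch. The two small data frames are hypothetical stand-ins for the question's files:

    ```r
    # Hypothetical stand-ins for the question's data, named by sample ID.
    filelist <- list(
      data.frame(x = 1:3, y = c("a", "b", "c")),
      data.frame(x = 4:6, y = c("d", "e", "f"))
    )
    names(filelist) <- c("1A", "IB")

    # Map() calls cbind(filelist[[i]], SampleID = names(filelist)[i]) for each i;
    # the single name is recycled to every row of that data frame.
    out <- Map(cbind, filelist, SampleID = names(filelist))

    # The result keeps the list names, so elements can be looked up by ID.
    out[["IB"]]
    ```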
    
  • 2020-11-27 04:12

    This one worked for me:

    It creates a new column in every data frame in a list and fills it from an existing column (in your case, the IDs).

    Example:

    # Create dummy data
    df1<-data.frame(a = c(1,2,3))
    df2<-data.frame(a = c(5,6,7))
    
    # Create a list
    l<-list(df1, df2)
    
    > l
    [[1]]
      a
    1 1
    2 2
    3 3
    
    [[2]]
      a
    1 5
    2 6
    3 7
    
    # add new column 'b'
    # create 'b' values based on column 'a' 
    l2 <- lapply(l, function(x) cbind(x, b = x$a * 4))
    

    Results in:

    > l2
    [[1]]
      a  b
    1 1  4
    2 2  8
    3 3 12
    
    [[2]]
      a  b
    1 5 20
    2 6 24
    3 7 28
    

    In your case, assuming each data frame already contains a SampleID column, something like:

    filelist<-lapply(filelist, function(x) 
      cbind(x, b = x$SampleID))
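
    The derived-column step from the example above can also be written with base R's transform(), which lets the expression refer to column a directly. This is a swapped-in variant, not the code this answer used, and it reuses the same dummy data:

    ```r
    # Same dummy data as in the example above.
    l <- list(
      data.frame(a = c(1, 2, 3)),
      data.frame(a = c(5, 6, 7))
    )

    # transform() evaluates `a * 4` inside each data frame and appends it as `b`.
    l2 <- lapply(l, function(df) transform(df, b = a * 4))
    ```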
    
  • 2020-11-27 04:17

    This is a corrected version of your loop:

    for( i in seq_along(filelist)){
    
      filelist[[i]]$SampleID <- rep(ID[i],nrow(filelist[[i]]))
    
    }
    

    There were 3 problems:

    • A final ) was missing after the command in the body.
    • Elements of lists are accessed by [[, not by [. [ returns a list of length one. [[ returns the element only.
    • length(filelist) is just one value, so the loop runs for the last element of the list only. I replaced it with seq_along(filelist).
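
    The last two points can be seen in a short sketch. The list below is hypothetical sample data, and the visited_* vectors exist only to record which indices each loop reaches:

    ```r
    filelist <- list(data.frame(x = 1:3), data.frame(x = 4:6))

    # `[` keeps the list wrapper; `[[` extracts the element itself.
    class(filelist[1])    # "list"
    class(filelist[[1]])  # "data.frame"

    # Looping over length(filelist) visits a single index: the last one.
    visited_length <- integer(0)
    for (i in length(filelist)) visited_length <- c(visited_length, i)

    # Looping over seq_along(filelist) visits every index.
    visited_seq <- integer(0)
    for (i in seq_along(filelist)) visited_seq <- c(visited_seq, i)
    ```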

    A more compact approach is to use mapply for the task:

    mapply(function(x, y) "[<-"(x, "SampleID", value = y) ,
           filelist, ID, SIMPLIFY = FALSE)
    
  • 2020-11-27 04:18

    An alternative solution is to use cbind, taking advantage of the fact that R will recycle the values of a shorter vector.

    For example:

    x <- df2  # from above
    cbind(x, NewColumn="Singleton")
     #    x y NewColumn
     #  1 4 d Singleton
     #  2 5 e Singleton
     #  3 6 f Singleton
    

    There is no need for the use of rep. R does that for you.

    Therefore, you could put filelist[[i]] <- cbind(filelist[[i]], SampleID = ID[[i]]) in your for loop or, as @Sven pointed out, you can use the cleaner mapply:

    filelist <- mapply(cbind, filelist, "SampleID"=ID, SIMPLIFY=FALSE)
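
    The recycling this answer relies on can be checked with a small sketch. The data frame is hypothetical sample data, and the tryCatch wrapper is only there to show that a length which does not divide the row count raises an error instead of being recycled:

    ```r
    df <- data.frame(x = 4:6, y = c("d", "e", "f"))

    # A length-one value is recycled to every row, with cbind or with `$<-`.
    with_cbind <- cbind(df, SampleID = "IB")
    with_dollar <- df
    with_dollar$SampleID <- "IB"

    # Two values cannot fill three rows, so the assignment fails.
    outcome <- tryCatch(
      {
        df$SampleID <- c("1A", "IB")
        "no error"
      },
      error = function(e) "error"
    )
    ```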
    