Same function over multiple data frames in R

南笙 2020-11-29 03:08

I am new to R, and this is a very simple question. I've found a lot of similar things to what I want but not exactly it. Basically I have multiple data frames and I simply

4 Answers
  • 2020-11-29 03:36

    Put them into a list, then use lapply to add the rowMeans of the first two columns to each data frame.

    df1 <- data.frame(x = rep(3, 5), y = seq(1, 5, 1), ID = letters[1:5])
    df2 <- data.frame(x = rep(5, 5), y = seq(2, 6, 1), ID = letters[6:10])
    
    lapply(list(df1, df2), function(w) { w$Avg <- rowMeans(w[1:2]); w })
    
     [[1]]
       x y ID Avg
     1 3 1  a 2.0
     2 3 2  b 2.5
     3 3 3  c 3.0
     4 3 4  d 3.5
     5 3 5  e 4.0
    
     [[2]]
       x y ID Avg
     1 5 2  f 3.5
     2 5 3  g 4.0
     3 5 4  h 4.5
     4 5 5  i 5.0
     5 5 6  j 5.5
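
    Note that lapply returns a new list and leaves df1 and df2 themselves untouched, so the result has to be assigned to something. A minimal sketch of one way to keep it, assuming you give the list elements names (`dfs` and the element names are made up here for illustration):

    ```r
    df1 <- data.frame(x = rep(3, 5), y = seq(1, 5, 1), ID = letters[1:5])
    df2 <- data.frame(x = rep(5, 5), y = seq(2, 6, 1), ID = letters[6:10])

    # Name the elements so each result can be looked up afterwards
    dfs <- list(df1 = df1, df2 = df2)
    dfs <- lapply(dfs, function(w) { w$Avg <- rowMeans(w[1:2]); w })

    "Avg" %in% names(df1)    # FALSE: the original data frame is unchanged
    unname(dfs$df1$Avg)      # 2.0 2.5 3.0 3.5 4.0
    ```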
    
  • 2020-11-29 03:48

    Here's another possible solution using a for loop. I had the same problem (with more datasets) a few days ago and the other solutions did not work. Say you have n datasets:

    df1 <- data.frame(start = seq(0,20,10), stop = seq(10,30,10), ID = letters[24:26])
    df2 <- data.frame(start = seq(0,20,10), stop = seq(10,30,10), ID = letters[1:3])
    ...
    dfn <- data.frame(start = seq(0,20,10), stop = seq(10,30,10), ID = letters[n:(n+2)])
    

    The first thing to do is to make a list of the dfs:

    # Store all datasets in one list, looking each one up by its name
    df.list <- lapply(1:n, function(x) eval(parse(text = paste0("df", x))))
    # Name each element after its df, in case you want to unlist the list afterwards
    names(df.list) <- paste0("df", 1:n)
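
    As an alternative to eval(parse(...)), base R's mget looks up several objects by name and returns them as an already named list. This is only a sketch of that swap, not part of the original answer; it assumes the data frames really are called df1 to dfn and live in the current environment (n = 2 here purely for illustration):

    ```r
    df1 <- data.frame(start = seq(0, 20, 10), stop = seq(10, 30, 10), ID = letters[24:26])
    df2 <- data.frame(start = seq(0, 20, 10), stop = seq(10, 30, 10), ID = letters[1:3])
    n <- 2  # assumed number of datasets

    # mget() fetches each object by name; the result is named "df1", "df2", ...
    df.list <- mget(paste0("df", seq_len(n)), envir = environment())

    names(df.list)   # "df1" "df2"
    ```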
    

    Afterwards, you can use the for loop (that's the most important part):

    for (i in seq_along(df.list)) {
      df.list[[i]][["Avg"]] <- rowMeans(df.list[[i]][1:2])
    }
    

    This gives (if your list only includes the first two datasets):

    > df.list
    $df1
      start stop ID Avg
    1     0   10  x   5
    2    10   20  y  15
    3    20   30  z  25
    
    $df2
      start stop ID Avg
    1     0   10  a   5
    2    10   20  b  15
    3    20   30  c  25
    

    Finally, if you want your modified datasets from the list back in the global environment, you can do:

    list2env(df.list,.GlobalEnv)
    

    This technique can be applied to n datasets and other functions. I find it to be the most flexible solution.
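
    To illustrate the "other functions" part, here is a minimal sketch of the same loop computing a different column. Because list2env overwrites any existing objects with the same names, the sketch copies the results into a scratch environment instead of the global one. The `Range` column and the `out` environment are made up for this example:

    ```r
    df.list <- list(
      df1 = data.frame(start = c(0, 10), stop = c(10, 20)),
      df2 = data.frame(start = c(5, 15), stop = c(15, 25))
    )

    # Same loop as above, with a different per-row computation
    for (i in seq_along(df.list)) {
      df.list[[i]][["Range"]] <- df.list[[i]][["stop"]] - df.list[[i]][["start"]]
    }

    # Copy the named elements into a scratch environment rather than .GlobalEnv
    out <- new.env()
    invisible(list2env(df.list, envir = out))

    sort(ls(out))    # "df1" "df2"
    out$df2$Range    # 10 10
    ```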

  • 2020-11-29 03:49

    In case you want all the outputs combined in a single data frame, this may help.

    df1 <- data.frame(x = rep(3, 5), y = seq(1, 5, 1), ID = letters[1:5])
    df2 <- data.frame(x = rep(5, 5), y = seq(2, 6, 1), ID = letters[6:10])
    
    z <- list(df1, df2)
    df <- NULL
    for (i in z) {
      i$Avg <- (i$x + i$y) / 2   # average of the two numeric columns
      df <- rbind(df, i)         # append this data frame's rows to the result
    }
    print(df)
    
     > df
       x y ID Avg
    1  3 1  a 2.0
    2  3 2  b 2.5
    3  3 3  c 3.0
    4  3 4  d 3.5
    5  3 5  e 4.0
    6  5 2  f 3.5
    7  5 3  g 4.0
    8  5 4  h 4.5
    9  5 5  i 5.0
    10 5 6  j 5.5
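
    Growing df with rbind inside the loop works, but it copies the accumulated rows on every pass. A common alternative, sketched here rather than taken from the answer, is to build the list first and bind it once with do.call (`with_avg` and `combined` are names made up for illustration):

    ```r
    df1 <- data.frame(x = rep(3, 5), y = seq(1, 5, 1), ID = letters[1:5])
    df2 <- data.frame(x = rep(5, 5), y = seq(2, 6, 1), ID = letters[6:10])

    # Add the Avg column to each data frame, then bind all rows in one call
    with_avg <- lapply(list(df1, df2), function(d) {
      d$Avg <- (d$x + d$y) / 2
      d
    })
    combined <- do.call(rbind, with_avg)

    nrow(combined)   # 10
    ```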
    
  • 2020-11-29 03:54

    Make a list of data frames then use lapply to apply the function to them all.

    df.list <- list(df1,df2,...)
    # returns a list of numeric vectors, one vector of row means per data frame
    res <- lapply(df.list, function(x) rowMeans(subset(x, select = c(start, stop)), na.rm = TRUE))
    # to keep the original data.frame as well, bind the row means on as a new column
    res <- lapply(df.list, function(x) cbind(x, "rowmean" = rowMeans(subset(x, select = c(start, stop)), na.rm = TRUE)))
    

    lapply then passes each data frame to the function as x, one at a time.
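
    A small runnable sketch of the first variant, using two made-up data frames (one with a missing value) to show what na.rm = TRUE does. It selects the columns by name with `[` instead of subset, which is an equivalent choice here rather than what the answer uses:

    ```r
    # Illustrative data only; the NA shows the effect of na.rm = TRUE
    df1 <- data.frame(start = c(0, 10, NA), stop = c(10, 20, 30))
    df2 <- data.frame(start = c(2, 4, 6), stop = c(4, 8, 12))
    df.list <- list(df1, df2)

    res <- lapply(df.list, function(x) rowMeans(x[c("start", "stop")], na.rm = TRUE))

    unname(res[[1]])   # 5 15 30  (the NA is dropped, so row 3 is just 30)
    unname(res[[2]])   # 3 6 9
    ```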
