Same function over multiple data frames in R

前端未结

关注

 4  1315

I am new to R, and this is a very simple question. I\'ve found a lot of similar things to what I want but not exactly it. Basically I have multiple data frames and I simply

相关标签:

4条回答

野性不改

2020-11-29 03:36

Put them into a list and then run rowMeans over the list.

df1 <- data.frame(x = rep(3, 5), y = seq(1, 5, 1), ID = letters[1:5])
df2 <- data.frame(x = rep(5, 5), y = seq(2, 6, 1), ID = letters[6:10])

lapply(list(df1, df2), function(w) { w$Avg <- rowMeans(w[1:2]); w })

 [[1]]
   x y ID Avg
 1 3 1  a 2.0
 2 3 2  b 2.5
 3 3 3  c 3.0
 4 3 4  d 3.5
 5 3 5  e 4.0

 [[2]]
   x y ID Avg
 1 5 2  f 3.5
 2 5 3  g 4.0
 3 5 4  h 4.5
 4 5 5  i 5.0
 5 5 6  j 5.5

0 讨论(0)

梦如初夏

2020-11-29 03:48

Here's another possible solution using a for loop. I've had the same problem (with more datasets) a few days ago and other solutions did not work. Say you have n datasets :

df1 <- data.frame(start = seq(0,20,10), stop = seq(10,30,10), ID = letters[24:26])
df2 <- data.frame(start = seq(0,20,10), stop = seq(10,30,10), ID = letters[1:3])
...
dfn <- data.frame(start = seq(0,20,10), stop = seq(10,30,10), ID = letters[n:n+2])

The first thing to do is to make a list of the dfs:

df.list<-lapply(1:n, function(x) eval(parse(text=paste0("df", x)))) #In order to store all datasets in one list using their name
names(df.list)<-lapply(1:n, function(x) paste0("df", x)) #Adding the name of each df in case you want to unlist the list afterwards

Afterwards, you can use the for loop (that's the most important part):

for (i in 1:length(df.list)) {
  df.list[[i]][["Avg"]]<-rowMeans(df.list[[i]][1:2])
}

And you have (in the case your list only includes the two first datasets):

> df.list
[[1]]
  start stop ID Avg
1     0   10  x   5
2    10   20  y  15
3    20   30  z  25

[[2]]
  start stop ID Avg
1     0   10  a   5
2    10   20  b  15
3    20   30  c  25

Finally, if you want your modified datasets from the list back in the global environment, you can do:

list2env(df.list,.GlobalEnv)

This technique can be applied to n datasets and other functions. I find it to be the most flexible solution.

0 讨论(0)

醉话见心

2020-11-29 03:49

In case you want all the outputs in the same file this may help.

 df1 <- data.frame(x = rep(3, 5), y = seq(1, 5, 1), ID = letters[1:5])
 df2 <- data.frame(x = rep(5, 5), y = seq(2, 6, 1), ID = letters[6:10])

 z=list(df1,df2)
 df=NULL
 for (i in z) {
 i$Avg=(i$x+i$y)/2
 df<-rbind(df,i)
 print (df)
 }

 > df
   x y ID Avg
1  3 1  a 2.0
2  3 2  b 2.5
3  3 3  c 3.0
4  3 4  d 3.5
5  3 5  e 4.0
6  5 2  f 3.5
7  5 3  g 4.0
8  5 4  h 4.5
9  5 5  i 5.0
10 5 6  j 5.5

0 讨论(0)

醉酒成梦

2020-11-29 03:54

Make a list of data frames then use lapply to apply the function to them all.

df.list <- list(df1,df2,...)
res <- lapply(df.list, function(x) rowMeans(subset(x, select = c(start, stop)), na.rm = TRUE))
# to keep the original data.frame also
res <- lapply(df.list, function(x) cbind(x,"rowmean"=rowMeans(subset(x, select = c(start, stop)), na.rm = TRUE)))

The lapply will then feed in each data frame as x sequentially.

0 讨论(0)