Adding a new column to each element in a list of tables or data frames


I have a list of files. I also have a list of "names" which I substr() from the actual filenames of these files. I would like to add a new column to each of t

6 Answers

清酒与你

2020-11-27 03:59
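A tricky way: name the list elements after the IDs and let ldply() from plyr stack them. Note that this returns one combined data frame, with the names stored in an .id column, rather than a list:
```
library(plyr)

# Name each list element after its sample ID
names(filelist) <- ID
# Bind all elements row-wise; the element names end up in the .id column
result <- ldply(filelist, data.frame)
```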
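For readers without plyr, a similar combined result can be sketched in base R. The toy `filelist` and `ID` below are made-up stand-ins for the question's objects, and the column name `SampleID` is an assumption rather than what ldply() produces:

```r
# Toy stand-ins for the question's objects (hypothetical data)
filelist <- list(
  data.frame(x = 1:3, y = c("a", "b", "c")),
  data.frame(x = 4:6, y = c("d", "e", "f"))
)
ID <- c("1A", "IB")

# Tag each data frame with its ID (the single value is recycled per row)
tagged <- Map(function(df, id) cbind(SampleID = id, df), filelist, ID)

# Stack the tagged data frames into one data frame
combined <- do.call(rbind, tagged)
rownames(combined) <- NULL
```

If a list of data frames is wanted instead of one stacked table, stop after the `Map()` step and keep `tagged`.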

不知归路

2020-11-27 04:03

```
data_lst <- list(
  data_1 = data.frame(c1 = 1:3, c2 = 3:1),
  data_2 = data.frame(c1 = 1:3, c2 = 3:1)
)

# Add the element's name as a new column and return the data frame
f <- function(data, name) {
  data$name <- name
  data
}

Map(f, data_lst, names(data_lst))
```


一整个雨季

2020-11-27 04:10

The purrr way, using map2:

```
library(dplyr)  # only needed for the mutate() variants further down
library(purrr)

map2(filelist, ID, ~cbind(.x, SampleID = .y))

#[[1]]
#  x y SampleID
#1 1 a       1A
#2 2 b       1A
#3 3 c       1A

#[[2]]
#  x y SampleID
#1 4 d       IB
#2 5 e       IB
#3 6 f       IB
```

Alternatively, with mutate() from dplyr:

```
map2(filelist, ID, ~.x %>% mutate(SampleID = .y))
```

If you name the list, you can use imap and add the new column based on each element's name.

```
names(filelist) <- c("1A","IB")
imap(filelist, ~cbind(.x, SampleID = .y))
#OR
#imap(filelist, ~.x %>% mutate(SampleID = .y))
```

which is similar to using Map from base R:

```
Map(cbind, filelist, SampleID = names(filelist))
```


既然无缘

2020-11-27 04:12

This one worked for me:

Create a new column in every data frame in the list, and fill it with values derived from an existing column (in your case, the IDs).

Example:

```
# Create dummy data
df1 <- data.frame(a = c(1, 2, 3))
df2 <- data.frame(a = c(5, 6, 7))

# Create a list
l <- list(df1, df2)
l
# [[1]]
#   a
# 1 1
# 2 2
# 3 3
#
# [[2]]
#   a
# 1 5
# 2 6
# 3 7

# Add a new column 'b' whose values are computed from column 'a'
l2 <- lapply(l, function(x)
  cbind(x, b = x$a * 4))
```

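This results in every data frame in l2 having an extra column b equal to 4 times a.

In your case, assuming each data frame already contains a SampleID column, something like:

```
filelist <- lapply(filelist, function(x)
  cbind(x, b = x$SampleID))
```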
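If the IDs live in a separate vector rather than in an existing column, the same lapply idea can be applied over the list positions. This is only a sketch: the toy data and the `ID` values below are hypothetical.

```r
# Toy stand-ins for the question's objects (hypothetical data)
filelist <- list(
  data.frame(a = c(1, 2, 3)),
  data.frame(a = c(5, 6, 7))
)
ID <- c("1A", "IB")

# Iterate over positions so each data frame can be paired with ID[i];
# the single ID value is recycled down all rows by cbind()
filelist <- lapply(seq_along(filelist), function(i) {
  cbind(filelist[[i]], SampleID = ID[i])
})
```

Iterating over `seq_along(filelist)` instead of the list itself is what gives the function access to the matching ID.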


走了就别回头了

2020-11-27 04:17
This is a corrected version of your loop:
```
for( i in seq_along(filelist)){

  filelist[[i]]$SampleID <- rep(ID[i],nrow(filelist[[i]]))

}
```
There were 3 problems:
- A final ) was missing after the command in the body.
- Elements of lists are accessed by [[, not by [. [ returns a list of length one. [[ returns the element only.
- length(filelist) is just one value, so the loop runs for the last element of the list only. I replaced it with seq_along(filelist).
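The last two points can be checked on a small example. The two-element `filelist` below is made up for illustration, as are the `visited_*` helper variables:

```r
filelist <- list(data.frame(x = 1:3), data.frame(x = 4:6))  # toy data

class(filelist[1])    # "list": still a list, of length one
class(filelist[[1]])  # "data.frame": the element itself

# length() is a single number, so this loop runs once, with i = 2
visited_wrong <- c()
for (i in length(filelist)) visited_wrong <- c(visited_wrong, i)

# seq_along() yields 1, 2, so every element is visited
visited_right <- c()
for (i in seq_along(filelist)) visited_right <- c(visited_right, i)
```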
A more concise approach is to use mapply for the task:
```
mapply(function(x, y) "[<-"(x, "SampleID", value = y) ,
       filelist, ID, SIMPLIFY = FALSE)
```
孤独总比滥情好

2020-11-27 04:18
An alternative solution is to use cbind, taking advantage of the fact that R recycles the values of a shorter vector.

For example:
```
x <- df2  # from above
cbind(x, NewColumn="Singleton")
 #    x y NewColumn
 #  1 4 d Singleton
 #  2 5 e Singleton
 #  3 6 f Singleton
```
There is no need for the use of rep. R does that for you.

Therefore, you could put filelist[[i]] <- cbind(filelist[[i]], SampleID = ID[i]) in your for loop or, as @Sven pointed out, you can use the cleaner mapply:
```
filelist <- mapply(cbind, filelist, "SampleID" = ID, SIMPLIFY = FALSE)
```
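As a quick end-to-end check, the mapply call can be run on toy data. The data frames and `ID` values here are hypothetical stand-ins for the question's objects:

```r
# Toy stand-ins for the question's objects (hypothetical data)
filelist <- list(
  data.frame(x = 1:3, y = c("a", "b", "c")),
  data.frame(x = 4:6, y = c("d", "e", "f"))
)
ID <- c("1A", "IB")

# cbind() is called once per (data frame, ID) pair; each single ID
# is recycled to fill the new SampleID column
result <- mapply(cbind, filelist, SampleID = ID, SIMPLIFY = FALSE)

# Inspect the column names of each element
lapply(result, names)
```

`SIMPLIFY = FALSE` matters here: without it, mapply would try to collapse the results into a matrix or array instead of returning a list of data frames.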