Working with dataframes in a list: Drop variables, add new ones

Submitted by ≯℡__Kan透↙ on 2019-12-10 06:08:22

Question


Define a list dats with two dataframes, df1 and df2:

dats <- list( df1 = data.frame(a=sample(1:3), b = sample(11:13)),
    df2 = data.frame(a=sample(1:3), b = sample(11:13)))

> dats
$df1
  a  b
1 2 12
2 3 11
3 1 13

$df2
  a  b
1 3 13
2 2 11
3 1 12

I would like to drop variable a in each data frame. Next I would like to add a variable with the id of each dataframe from an external dataframe, like:

ids <- data.frame(id=c("id1","id2"),df=c("df1","df2"))
> ids
  id  df
1 id1 df1
2 id2 df2

To drop unnecessary vars I tried this without luck:

> dats <- lapply(dats, function(x) assign(x, x[,c("b")]))
Error in assign(x, x[, c("b")]) : invalid first argument

Not sure how to add the id either.

I also tried, perhaps more appropriately:

> temp <- lapply(dats, function(x) subset(x[1], select=x[[1]]$b))
Error in x[[1]]$b : $ operator is invalid for atomic vectors

What I find confusing is that str(dats[1]) shows a list, while str(dats[[1]]) shows a dataframe. I think that may have something to do with it.
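The list-versus-dataframe distinction mentioned above can be checked directly. The following is only a minimal sketch on data shaped like the question's; the names sub_list, first_df and only_b are made up for illustration. It also hints at why the assign() attempt fails: assign() expects a variable name as a character string for its first argument, whereas inside lapply each x is already the data frame itself, so it can simply be subset and returned.

```r
# Same structure as in the question (values are random)
dats <- list(df1 = data.frame(a = sample(1:3), b = sample(11:13)),
             df2 = data.frame(a = sample(1:3), b = sample(11:13)))

sub_list <- dats[1]     # single bracket: a list of length 1 that holds df1
first_df <- dats[[1]]   # double bracket: the data frame df1 itself

class(sub_list)
class(first_df)

# lapply passes each element as with [[ ]], so x is a data frame here.
# x["b"] keeps a one-column data frame; nothing needs to be assigned by name.
only_b <- lapply(dats, function(x) x["b"])
```

Returning the subset from the function is enough, because lapply collects the return values into a new list with the same names.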


Answer 1:


Try this: extract your ids into a named vector that maps each data-frame name to its id:

df2id <- ids$id
names(df2id) <- ids$df

> df2id
df1 df2 
id1 id2 
Levels: id1 id2

Then use mapply to both (a) drop the a column from each data-frame, and (b) add the id column:

> mapply( function(d,x) cbind( subset(d, select = -a),
+                              id = x),
+         dats, df2id[ names(dats) ] ,
+         SIMPLIFY=FALSE)
$df1
   b  id
1 12 id1
2 11 id1
3 13 id1

$df2
   b  id
1 12 id2
2 11 id2
3 13 id2

Note that we are passing df2id[ names(dats) ] to mapply rather than df2id itself. Indexing by name ensures that the ids in df2id are "aligned" with the data-frames in dats, even if the rows of ids are in a different order than the elements of dats.
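To see why the lookup by name matters, here is a small sketch in which ids is deliberately listed in the opposite order to dats. The fixed values (instead of sample()) and the use of Map (which is mapply with SIMPLIFY = FALSE) and setNames are my own choices for the illustration, not part of the answer above.

```r
dats <- list(df1 = data.frame(a = 1:3, b = 11:13),
             df2 = data.frame(a = 1:3, b = 21:23))

# ids deliberately listed in the opposite order to dats
ids <- data.frame(id = c("id2", "id1"), df = c("df2", "df1"),
                  stringsAsFactors = FALSE)

# named vector: data-frame name -> id
df2id <- setNames(ids$id, ids$df)

# df2id[names(dats)] reorders the ids to match the order of dats
out <- Map(function(d, x) cbind(subset(d, select = -a), id = x),
           dats, df2id[names(dats)])
```

Passing df2id unindexed here would pair df1 with "id2", since mapply and Map match their arguments by position, not by name.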




Answer 2:


Is this OK?

dats <- list( df1 = data.frame(a=sample(1:3), b = sample(11:13)),
    df2 = data.frame(a=sample(1:3), b = sample(11:13)))

ids <- data.frame(id=c("id1","id2"),df=c("df1","df2"))

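# remove variable a (drop = FALSE keeps the result a data frame
# instead of collapsing the single remaining column to a vector)
dats2 <- lapply(dats, function(x) x[, names(x) != "a", drop = FALSE])

# add id, looked up by the name of each list element
for (i in seq_along(dats2)) {
  dats2[[i]]$id <- ids$id[ids$df == names(dats2)[i]]
}

# the b values depend on the random sample
dats2

  $df1
     b  id
  1 11 id1
  2 12 id1
  3 13 id1

  $df2
     b  id
  1 11 id2
  2 12 id2
  3 13 id2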
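The same two steps can also be done without an explicit loop by iterating over the names of the list. This is just a sketch of one possible variant; the helper variable nm, the match() lookup and the fixed example values are assumptions for the illustration, not something taken from the answers above.

```r
dats <- list(df1 = data.frame(a = 1:3, b = 11:13),
             df2 = data.frame(a = 1:3, b = 21:23))
ids <- data.frame(id = c("id1", "id2"), df = c("df1", "df2"),
                  stringsAsFactors = FALSE)

dats2 <- setNames(lapply(names(dats), function(nm) {
  d <- dats[[nm]]
  d <- d[, names(d) != "a", drop = FALSE]  # drop column a, stay a data frame
  d$id <- ids$id[match(nm, ids$df)]        # look up the id by data-frame name
  d
}), names(dats))
```

Iterating over names(dats) gives the function access to the element name, which a plain lapply(dats, ...) does not.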


Source: https://stackoverflow.com/questions/6399011/working-with-dataframes-in-a-list-drop-variables-add-new-ones
