Extract names of dataframe in list of dataframes and add it to columnnames

末鹿安然 提交于 2019-12-20 07:17:11

问题


I have a list of dataframes:

set.seed(23) 
date_list = seq(1:30)
testframe = data.frame(Date = date_list)
testframe$ABC = rnorm(30)
testframe$DEF = rnorm(30)
testframe$GHI = seq(from = 10, to = 25, length.out = 30)
testframe$JKL = seq(from = 5, to = 45, length.out = 30)

testlist = list(testframe, testframe, testframe)
names(testlist) = c("df1464", "df6355", "df94566")

I want now to extract the name of each dataframe and add it to its columns. So the columnnames of the first dataframe in the list should be: Date_df1464, ABC_df1464, DEF_df1464, GHI_df1464 and JKL_df1464

I created this loop, but its not working:

for (a  in names(testlist)) {
  for(i in 1: length(testlist)){
    allcolnames = colnames(testlist[[i]])
    allcolnames = paste(allcolnames, a , sep = "_")
    testlist[[i]] = colnames(allcolnames)
  }
}

I get this error:

Error in testlist[[i]] : subscript out of bounds

I am pretty clueless why it doesnt work. Any ideas?


回答1:


Your solution was nearly right, you just do not need to loop two times. And your colnames call was the wrong way around. This should work:

for(i in 1: length(testlist)){
    allcolnames = colnames(testlist[[i]])
    allcolnames = paste(allcolnames, names(testlist)[i] , sep = "_")
    colnames(testlist[[i]]) = allcolnames
}

This also works, without any fors ;):

set.seed(23) 
date_list = seq(1:30)
testframe = data.frame(Date = date_list)
testframe$ABC = rnorm(30)
testframe$DEF = rnorm(30)
testframe$GHI = seq(from = 10, to = 25, length.out = 30)
testframe$JKL = seq(from = 5, to = 45, length.out = 30)

testlist = list(testframe, testframe, testframe)
names(testlist) = c("df1464", "df6355", "df94566")

out <- lapply(names(testlist),function(name){
  dummy <- testlist[[name]]
  names(dummy) <- paste0(names(testlist[[name]]) ,'_',name)
  dummy
})
str(out)
#> List of 3
#>  $ :'data.frame':    30 obs. of  5 variables:
#>   ..$ Date_df1464: int [1:30] 1 2 3 4 5 6 7 8 9 10 ...
#>   ..$ ABC_df1464 : num [1:30] 0.193 -0.435 0.913 1.793 0.997 ...
#>   ..$ DEF_df1464 : num [1:30] -0.5532 0.0982 -1.1467 -1.2499 -0.2021 ...
#>   ..$ GHI_df1464 : num [1:30] 10 10.5 11 11.6 12.1 ...
#>   ..$ JKL_df1464 : num [1:30] 5 6.38 7.76 9.14 10.52 ...
#>  $ :'data.frame':    30 obs. of  5 variables:
#>   ..$ Date_df6355: int [1:30] 1 2 3 4 5 6 7 8 9 10 ...
#>   ..$ ABC_df6355 : num [1:30] 0.193 -0.435 0.913 1.793 0.997 ...
#>   ..$ DEF_df6355 : num [1:30] -0.5532 0.0982 -1.1467 -1.2499 -0.2021 ...
#>   ..$ GHI_df6355 : num [1:30] 10 10.5 11 11.6 12.1 ...
#>   ..$ JKL_df6355 : num [1:30] 5 6.38 7.76 9.14 10.52 ...
#>  $ :'data.frame':    30 obs. of  5 variables:
#>   ..$ Date_df94566: int [1:30] 1 2 3 4 5 6 7 8 9 10 ...
#>   ..$ ABC_df94566 : num [1:30] 0.193 -0.435 0.913 1.793 0.997 ...
#>   ..$ DEF_df94566 : num [1:30] -0.5532 0.0982 -1.1467 -1.2499 -0.2021 ...
#>   ..$ GHI_df94566 : num [1:30] 10 10.5 11 11.6 12.1 ...
#>   ..$ JKL_df94566 : num [1:30] 5 6.38 7.76 9.14 10.52 ...



回答2:


You could switch two Map in series; the inner Map prepares the new names, the outer Map applies it onto the sublists' names.

testlist <- Map(`names<-`, testlist,
                Map(paste, lapply(testlist, names), names(testlist), sep="_"))

Result

lapply(testlist, names)
# $df1464
# [1] "Date_df1464" "ABC_df1464"  "DEF_df1464"  "GHI_df1464"  "JKL_df1464" 
# 
# $df6355
# [1] "Date_df6355" "ABC_df6355"  "DEF_df6355"  "GHI_df6355"  "JKL_df6355" 
# 
# $df94566
# [1] "Date_df94566" "ABC_df94566"  "DEF_df94566"  "GHI_df94566"  "JKL_df94566" 



回答3:


Two ways to accomplish this. The better, more encapsulated way would be to use Map, looping over the individual data frames and their corresponding names:

new.testlist <- Map(function(df, name) {
  names(df) <- paste(names(df), name, sep = '_')
  return(df)
}, testlist, names(testlist))

> str(new.testlist)
List of 3
 $ df1464 :'data.frame':    30 obs. of  5 variables:
  ..$ Date_df1464: int [1:30] 1 2 3 4 5 6 7 8 9 10 ...
  ..$ ABC_df1464 : num [1:30] 0.193 -0.435 0.913 1.793 0.997 ...
  ..$ DEF_df1464 : num [1:30] -0.5532 0.0982 -1.1467 -1.2499 -0.2021 ...
  ..$ GHI_df1464 : num [1:30] 10 10.5 11 11.6 12.1 ...
  ..$ JKL_df1464 : num [1:30] 5 6.38 7.76 9.14 10.52 ...
 $ df6355 :'data.frame':    30 obs. of  5 variables:
  ..$ Date_df6355: int [1:30] 1 2 3 4 5 6 7 8 9 10 ...
  ..$ ABC_df6355 : num [1:30] 0.193 -0.435 0.913 1.793 0.997 ...
  ..$ DEF_df6355 : num [1:30] -0.5532 0.0982 -1.1467 -1.2499 -0.2021 ...
  ..$ GHI_df6355 : num [1:30] 10 10.5 11 11.6 12.1 ...
  ..$ JKL_df6355 : num [1:30] 5 6.38 7.76 9.14 10.52 ...
 $ df94566:'data.frame':    30 obs. of  5 variables:
  ..$ Date_df94566: int [1:30] 1 2 3 4 5 6 7 8 9 10 ...
  ..$ ABC_df94566 : num [1:30] 0.193 -0.435 0.913 1.793 0.997 ...
  ..$ DEF_df94566 : num [1:30] -0.5532 0.0982 -1.1467 -1.2499 -0.2021 ...
  ..$ GHI_df94566 : num [1:30] 10 10.5 11 11.6 12.1 ...
  ..$ JKL_df94566 : num [1:30] 5 6.38 7.76 9.14 10.52 ...

The riskier way would be to use the super assignment operator to loop over the names, trusting that testlist remains reliable in your global environment. Note that this second method changes the column names in testlist as a side effect, and is generally NOT considered good practice. Max Teflon's answer is somewhat similar, in that it relies on testlist existing in the global environment, without passing it explicitly to the modifying function.

sapply(names(testlist), function(x) {
  names(testlist[[x]]) <<- paste(names(testlist[[x]]), x, sep = '_')
})


来源:https://stackoverflow.com/questions/56630991/extract-names-of-dataframe-in-list-of-dataframes-and-add-it-to-columnnames

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!