Question
I have a list of dataframes:
set.seed(23)
date_list = seq(1:30)
testframe = data.frame(Date = date_list)
testframe$ABC = rnorm(30)
testframe$DEF = rnorm(30)
testframe$GHI = seq(from = 10, to = 25, length.out = 30)
testframe$JKL = seq(from = 5, to = 45, length.out = 30)
testlist = list(testframe, testframe, testframe)
names(testlist) = c("df1464", "df6355", "df94566")
I now want to extract the name of each data frame and append it to that data frame's column names. So the column names of the first data frame in the list should be: Date_df1464, ABC_df1464, DEF_df1464, GHI_df1464 and JKL_df1464.
I wrote this loop, but it's not working:
for (a in names(testlist)) {
for(i in 1: length(testlist)){
allcolnames = colnames(testlist[[i]])
allcolnames = paste(allcolnames, a , sep = "_")
testlist[[i]] = colnames(allcolnames)
}
}
I get this error:
Error in testlist[[i]] : subscript out of bounds
I am pretty clueless as to why it doesn't work. Any ideas?
Answer 1:
Your solution was nearly right; you just do not need to loop twice, and your colnames call was the wrong way around.
This should work:
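for (i in seq_along(testlist)) {
  # current column names of the i-th data frame
  allcolnames = colnames(testlist[[i]])
  # append the name of the i-th list element
  allcolnames = paste(allcolnames, names(testlist)[i], sep = "_")
  # assign the new names back to the data frame
  colnames(testlist[[i]]) = allcolnames
}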
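The failure in the original loop can be reproduced on a toy list. This is a minimal sketch; the names toy, a and b are made up for illustration. colnames() of a plain character vector is NULL, and assigning NULL to a list element removes that element, so the list shrinks while the loop index keeps growing until it runs out of bounds.

```r
# Toy list with two small data frames (names are illustrative only)
toy <- list(a = data.frame(x = 1:2), b = data.frame(x = 3:4))

new_names <- paste(colnames(toy[["a"]]), "a", sep = "_")

# A character vector has no column names, so this is NULL
is.null(colnames(new_names))

# Assigning NULL with [[<- deletes the element instead of renaming it
toy[[1]] <- colnames(new_names)
length(toy)   # one element left
names(toy)    # only "b" remains
```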
This also works, without any fors ;):
set.seed(23)
date_list = seq(1:30)
testframe = data.frame(Date = date_list)
testframe$ABC = rnorm(30)
testframe$DEF = rnorm(30)
testframe$GHI = seq(from = 10, to = 25, length.out = 30)
testframe$JKL = seq(from = 5, to = 45, length.out = 30)
testlist = list(testframe, testframe, testframe)
names(testlist) = c("df1464", "df6355", "df94566")
out <- lapply(names(testlist), function(name) {
  dummy <- testlist[[name]]
  names(dummy) <- paste0(names(testlist[[name]]), '_', name)
  dummy
})
str(out)
#> List of 3
#> $ :'data.frame': 30 obs. of 5 variables:
#> ..$ Date_df1464: int [1:30] 1 2 3 4 5 6 7 8 9 10 ...
#> ..$ ABC_df1464 : num [1:30] 0.193 -0.435 0.913 1.793 0.997 ...
#> ..$ DEF_df1464 : num [1:30] -0.5532 0.0982 -1.1467 -1.2499 -0.2021 ...
#> ..$ GHI_df1464 : num [1:30] 10 10.5 11 11.6 12.1 ...
#> ..$ JKL_df1464 : num [1:30] 5 6.38 7.76 9.14 10.52 ...
#> $ :'data.frame': 30 obs. of 5 variables:
#> ..$ Date_df6355: int [1:30] 1 2 3 4 5 6 7 8 9 10 ...
#> ..$ ABC_df6355 : num [1:30] 0.193 -0.435 0.913 1.793 0.997 ...
#> ..$ DEF_df6355 : num [1:30] -0.5532 0.0982 -1.1467 -1.2499 -0.2021 ...
#> ..$ GHI_df6355 : num [1:30] 10 10.5 11 11.6 12.1 ...
#> ..$ JKL_df6355 : num [1:30] 5 6.38 7.76 9.14 10.52 ...
#> $ :'data.frame': 30 obs. of 5 variables:
#> ..$ Date_df94566: int [1:30] 1 2 3 4 5 6 7 8 9 10 ...
#> ..$ ABC_df94566 : num [1:30] 0.193 -0.435 0.913 1.793 0.997 ...
#> ..$ DEF_df94566 : num [1:30] -0.5532 0.0982 -1.1467 -1.2499 -0.2021 ...
#> ..$ GHI_df94566 : num [1:30] 10 10.5 11 11.6 12.1 ...
#> ..$ JKL_df94566 : num [1:30] 5 6.38 7.76 9.14 10.52 ...
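Note that the elements in the str() output above are unnamed ($ :), because lapply() over an unnamed character vector returns an unnamed list. If the list names should be kept, one option is sapply() with simplify = FALSE, which by default uses the character vector itself as names. This is a sketch rather than part of the original answer, and the small two-column toy list is an assumption made to keep the example short.

```r
# Small stand-in for testlist (illustrative data only)
toy <- list(df1 = data.frame(Date = 1:3, ABC = c(0.1, 0.2, 0.3)),
            df2 = data.frame(Date = 1:3, ABC = c(0.4, 0.5, 0.6)))

# simplify = FALSE keeps a list; USE.NAMES (default TRUE) names it after `name`
out <- sapply(names(toy), function(name) {
  dummy <- toy[[name]]
  names(dummy) <- paste0(names(dummy), "_", name)
  dummy
}, simplify = FALSE)

names(out)       # list names are preserved
names(out$df1)   # column names carry the suffix
```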
Answer 2:
You could chain two Map calls; the inner Map prepares the new names, and the outer Map applies them to the names of the data frames in the list.
testlist <- Map(`names<-`, testlist,
Map(paste, lapply(testlist, names), names(testlist), sep="_"))
Result
lapply(testlist, names)
# $df1464
# [1] "Date_df1464" "ABC_df1464" "DEF_df1464" "GHI_df1464" "JKL_df1464"
#
# $df6355
# [1] "Date_df6355" "ABC_df6355" "DEF_df6355" "GHI_df6355" "JKL_df6355"
#
# $df94566
# [1] "Date_df94566" "ABC_df94566" "DEF_df94566" "GHI_df94566" "JKL_df94566"
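The nested call can be easier to follow when split into its two steps. The following is a sketch on a small toy list; toy, new_names and renamed are illustrative names, and setNames() is swapped in for the backticked names<- purely for readability.

```r
# Small stand-in for testlist (illustrative data only)
toy <- list(df1 = data.frame(Date = 1:3, ABC = 4:6),
            df2 = data.frame(Date = 1:3, ABC = 7:9))

# Step 1 (inner Map): build one vector of new column names per data frame
new_names <- Map(paste, lapply(toy, names), names(toy), sep = "_")

# Step 2 (outer Map): apply each vector of names to its data frame
renamed <- Map(setNames, toy, new_names)

lapply(renamed, names)
```

Unlike the one-liner above, this version assigns to a new object, so the original list is left untouched.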
Answer 3:
There are two ways to accomplish this. The better, more encapsulated way is to use Map, looping over the individual data frames and their corresponding names:
new.testlist <- Map(function(df, name) {
  names(df) <- paste(names(df), name, sep = '_')
  return(df)
}, testlist, names(testlist))
> str(new.testlist)
List of 3
$ df1464 :'data.frame': 30 obs. of 5 variables:
..$ Date_df1464: int [1:30] 1 2 3 4 5 6 7 8 9 10 ...
..$ ABC_df1464 : num [1:30] 0.193 -0.435 0.913 1.793 0.997 ...
..$ DEF_df1464 : num [1:30] -0.5532 0.0982 -1.1467 -1.2499 -0.2021 ...
..$ GHI_df1464 : num [1:30] 10 10.5 11 11.6 12.1 ...
..$ JKL_df1464 : num [1:30] 5 6.38 7.76 9.14 10.52 ...
$ df6355 :'data.frame': 30 obs. of 5 variables:
..$ Date_df6355: int [1:30] 1 2 3 4 5 6 7 8 9 10 ...
..$ ABC_df6355 : num [1:30] 0.193 -0.435 0.913 1.793 0.997 ...
..$ DEF_df6355 : num [1:30] -0.5532 0.0982 -1.1467 -1.2499 -0.2021 ...
..$ GHI_df6355 : num [1:30] 10 10.5 11 11.6 12.1 ...
..$ JKL_df6355 : num [1:30] 5 6.38 7.76 9.14 10.52 ...
$ df94566:'data.frame': 30 obs. of 5 variables:
..$ Date_df94566: int [1:30] 1 2 3 4 5 6 7 8 9 10 ...
..$ ABC_df94566 : num [1:30] 0.193 -0.435 0.913 1.793 0.997 ...
..$ DEF_df94566 : num [1:30] -0.5532 0.0982 -1.1467 -1.2499 -0.2021 ...
..$ GHI_df94566 : num [1:30] 10 10.5 11 11.6 12.1 ...
..$ JKL_df94566 : num [1:30] 5 6.38 7.76 9.14 10.52 ...
The riskier way would be to use the superassignment operator (<<-) to loop over the names, trusting that testlist remains reliable in your global environment. Note that this second method changes the column names in testlist as a side effect, and is generally NOT considered good practice. Max Teflon's answer is somewhat similar, in that it relies on testlist existing in the global environment, without passing it explicitly to the modifying function.
sapply(names(testlist), function(x) {
names(testlist[[x]]) <<- paste(names(testlist[[x]]), x, sep = '_')
})
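One way to avoid depending on the global environment is to pass the list in as an argument and return a modified copy. The helper below, add_list_name_suffix, is a hypothetical name introduced for illustration; it is a sketch of the idea rather than code from any of the answers.

```r
# Hypothetical helper: takes the list explicitly and returns a renamed copy
add_list_name_suffix <- function(lst, sep = "_") {
  # the list must be named, otherwise there is nothing to append
  stopifnot(!is.null(names(lst)))
  Map(function(df, name) {
    names(df) <- paste(names(df), name, sep = sep)
    df
  }, lst, names(lst))
}

# Small stand-in for testlist (illustrative data only)
toy <- list(df1 = data.frame(Date = 1:3, ABC = 4:6),
            df2 = data.frame(Date = 1:3, ABC = 7:9))

renamed <- add_list_name_suffix(toy)
names(renamed$df1)  # suffixed column names
names(toy$df1)      # the input list is left unchanged
```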
Source: https://stackoverflow.com/questions/56630991/extract-names-of-dataframe-in-list-of-dataframes-and-add-it-to-columnnames