问题
I'm trying to Run a simulation but I'm having trouble storing multiple dataframes called "data_i" in a list ordering by i. I start with a df called "data_", which has data from 1901 to 2032 (132 rows). I apply a loop to create one dataframe per row called data_1, data_2,data_3,...,data_132 (row of 2032 is stored in data_132). Finally, I store all this dataframes in a list and use lapply to create a column in each dataframe. Here is a reproducible example:
#Main dataframe
time <- 1901:2032
b <- 1:132
data_ <- data.frame(time,b)
#Loop for creating data_i where i goes from 1 to 132
simulations <- 10000
for (i in 1:132) {
assign(paste("data_",i, sep = ""), as.data.frame( sapply(data_[i,], function(n) rep(n,simulations)), stringsAsFactors = FALSE ))
}
#Store all dataframes in list (**I THINK THE PROBLEM IS HERE**)
data_names<-str_extract(ls(), '^data_[[:digit:]]{1,3}$')[!is.na(str_extract(ls(), '^data_[[:digit:]]{1,3}$'))]
dataframes<-lapply(data_names, function(x)get(x))
#Create a new column in each dataframe
new_list <- lapply(dataframes, function(x) cbind(x, production = as.numeric(runif(simulations, min = 50, max = 100))))
#Create data_newi in environnment
list2env(setNames(new_list,paste0("data_new", seq_along(dataframes))),
envir = parent.frame())
The code runs but the problem is that the order of the dataframes is not data_1, data_2,data_3,...,data_132 but data_1,data_10,data_100,data_101...This generates that data_names stores this values in that order. This will lead to, for example, 2032 not being in data_new132 as I would want it to be.
Does anybody knows how to solve this? Thanks in advance!
回答1:
Andres, See if this helps. I added a pad of '0' for the max number of characters (e.g. 132 = 3 characters wide):
#Main dataframe
time <- 1901:2032
b <- 1:132
data_ <- data.frame(time,b)
#Loop for creating data_i where i goes from 1 to 132
simulations <- 10000
for (i in 1:132) {
assign(paste("data_",str_pad(i,nchar(max(b)),pad="0"), sep = ""), as.data.frame( sapply(data_[i,], function(n) rep(n,simulations)), stringsAsFactors = FALSE ))
}
#Store all dataframes in list (**I THINK THE PROBLEM IS HERE**)
data_names<-str_extract(ls(), '^data_[[:digit:]]{1,3}$')[!is.na(str_extract(ls(), '^data_[[:digit:]]{1,3}$'))]
dataframes<-lapply(data_names, function(x)get(x))
#Create a new column in each dataframe
new_list <- lapply(dataframes, function(x) cbind(x, production = as.numeric(runif(simulations, min = 50, max = 100))))
#Create data_newi in environnment
list2env(setNames(new_list,paste0("data_new", paste(str_pad(seq_along(dataframes),nchar(max(seq_along(dataframes))),pad="0"),sep=""))),
envir = parent.frame())
回答2:
1) Use mixedsort
in gtools:
library(gtools)
for(i in c(2, 10)) assign(paste0("data", i), i)
ls(pattern = "^data")
## [1] "data10" "data2"
mixedsort(ls(pattern = "^data"))
## [1] "data2" "data10"
2) or ensure that the names are the same length using leading 0's in which case ls() will sort them appropriately:
for(i in c(2, 10)) assign(sprintf("data%03d", i), i)
ls(pattern = "^data")
## [1] "data002" "data010"
3) Normally one does not assign such objects directly into the global environment but puts them into a list. One can refer to elements using L[[1]]
, etc.
L <- list()
# for(i in 1:3) L[[i]] <- i
L
## [[1]]
## [1] 1
##
## [[2]]
## [1] 2
##
## [[3]]
## [1] 3
3a) or in one line:
L <- lapply(1:3, function(i) i)
来源:https://stackoverflow.com/questions/66105619/how-to-order-multiple-dataframes-in-global-environment-r