Is there a way to define a subsequent set of data.frame in R?

Question

If I have a data.frame like this:

My goal is to define a set of data.frames such as:

y1<-data[1:2,]
y2<-data[3:4,]
y3<-data[5:6,] ##...etc. by a loop.

Ideally, I would like to do this with (for instance) a for loop:

for (i in 1:5){
    y_i <- data[2*i:2*(i+1), ]
}

However, I cannot figure out how to define a sequence of data.frames named like y_i. Is there a way to do this? Thanks in advance.

Answer 1:

You can use assign. It will help you get the data frames you need with the naming convention you asked for.

for (i in 1:5){
    assign(paste("y", i, sep="_"), data[(i*2-1):(i*2), ])
}
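
As a small sketch of how the objects created by this loop can be read back, assuming a 10-row example frame like the one used in the other answers (the data below is an assumption, not the asker's actual data): with sep="_" the names come out as y_1 ... y_5, and get or mget can look them up by a constructed name.

```r
# Assumed example data: 10 rows, second column changes every 2 rows
data <- data.frame(X1 = 1:10, X2 = rep(c("A", "B", "A", "B", "A"), each = 2))

# Same loop as above: creates y_1, y_2, ..., y_5 in the current environment
for (i in 1:5) {
  assign(paste("y", i, sep = "_"), data[(i * 2 - 1):(i * 2), ])
}

# Fetch one piece by its constructed name
second <- get(paste("y", 2, sep = "_"))

# Or gather all pieces back into a named list
all_parts <- mget(paste("y", 1:5, sep = "_"))
```

Collecting the pieces with mget ends up with the same structure a list-based approach would give directly, which is one reason a list is often easier to work with than separately named objects.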

Answer 2:

Use a list for y and generate a sequence for the indexing:

y <- lapply(seq(from=1, to=nrow(dat), by=2), function(i) {
  dat[i:(i+1),]
})

str(y)

## List of 5
##  $ :'data.frame': 2 obs. of  2 variables:
##   ..$ X1: int [1:2] 1 2
##   ..$ X2: chr [1:2] "A" "A"
##  $ :'data.frame': 2 obs. of  2 variables:
##   ..$ X1: int [1:2] 3 4
##   ..$ X2: chr [1:2] "B" "B"
##  $ :'data.frame': 2 obs. of  2 variables:
##   ..$ X1: int [1:2] 5 6
##   ..$ X2: chr [1:2] "A" "A"
##  $ :'data.frame': 2 obs. of  2 variables:
##   ..$ X1: int [1:2] 7 8
##   ..$ X2: chr [1:2] "B" "B"
##  $ :'data.frame': 2 obs. of  2 variables:
##   ..$ X1: int [1:2] 9 10
##   ..$ X2: chr [1:2] "A" "A"
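
If names are wanted on the list elements, a possible variation is sketched below. The data frame contents and the names y1 ... y5 are assumptions for illustration, and the min() guard is an addition that avoids an all-NA row when the number of rows is odd.

```r
# Assumed example data matching the str() output above
dat <- data.frame(X1 = 1:10, X2 = rep(c("A", "B", "A", "B", "A"), each = 2))

# Start row of each 2-row chunk
starts <- seq(from = 1, to = nrow(dat), by = 2)

# Cap the end row at nrow(dat) so an odd row count does not add an NA row
y <- lapply(starts, function(i) dat[i:min(i + 1, nrow(dat)), ])

# Name the elements so they can be accessed like separate objects
names(y) <- paste0("y", seq_along(y))

third <- y[["y3"]]   # rows 5 and 6 of dat
```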

Answer 3:

If the split should follow runs of identical adjacent values in the second column:

 lst <- split(df,with(df,cumsum(c(TRUE,X2[-1]!=X2[-nrow(df)]))))
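
To make the grouping index in that call easier to follow, here is the same idea broken into steps. The intermediate names changed and run_id are only illustrative, and the data is rebuilt here so the snippet runs on its own.

```r
# Same data as in the "data" section of this answer
df <- data.frame(X1 = 1:10, X2 = rep(c("A", "B", "A", "B", "A"), each = 2))

# TRUE wherever a row's X2 differs from the previous row's X2
# (the first row always starts a new run)
changed <- c(TRUE, df$X2[-1] != df$X2[-nrow(df)])

# A running count of the TRUEs gives one id per run of equal values
run_id <- cumsum(changed)

# Split the rows by run id
lst <- split(df, run_id)
```

For this data run_id is 1 1 2 2 3 3 4 4 5 5, so split returns five 2-row data frames.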

If you need individual data.frame objects:

 list2env(setNames(lst, paste0('y', seq_along(lst))), envir=.GlobalEnv)
 #<environment: R_GlobalEnv>

 y1
 # X1 X2
 #1  1  A
 #2  2  A

Or, if the split is simply every 2 rows regardless of the values:

 split(df,as.numeric(gl(nrow(df),2, nrow(df))))
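
As a rough check of what the gl() call produces, the sketch below compares it with an arithmetic alternative, ceiling(seq_len(n) / 2). The alternative is not part of the original answer, only an equivalent way to build the same fixed-size grouping.

```r
# Same data as in the "data" section of this answer
df <- data.frame(X1 = 1:10, X2 = rep(c("A", "B", "A", "B", "A"), each = 2))

# gl(n, k, length): each level repeated k = 2 times, truncated to nrow(df) values
grp_gl <- as.numeric(gl(nrow(df), 2, nrow(df)))

# Equivalent grouping by integer arithmetic on the row number
grp_ceiling <- ceiling(seq_len(nrow(df)) / 2)

# Both groupings put rows 1-2 in group 1, rows 3-4 in group 2, and so on
parts <- split(df, grp_ceiling)
```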

data

df <- structure(list(X1 = 1:10, X2 = c("A", "A", "B", "B", "A", "A", 
"B", "B", "A", "A")), .Names = c("X1", "X2"), class = "data.frame",
 row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10"))

Answer 4:

data <- data.frame(X1 = c(1:10), X2 = c("A", "A", "B", "B", "A", "A", "B", "B", "A", "A"))
lapply(1:5, function (i) assign(paste("y", i, sep="_"), data[(2*i-1):(2*i), ], envir=.GlobalEnv))

This also works. As 'Cancer' said, assign is helpful in this situation; I only replaced the for loop with lapply.
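
One detail worth checking in row-index expressions like the one in this call: in R the : operator binds more tightly than * and -, so both endpoints of a computed range need parentheses. A short illustration (the variable names wrong and right are only for this example):

```r
i <- 2

# Without parentheses this is parsed as 2*i - (1:2)*i, not as a row range
wrong <- 2 * i - 1:2 * i

# With parentheses around both endpoints the range is rows 3 to 4
right <- (2 * i - 1):(2 * i)
```

Here wrong evaluates to the vector 2 0, which would select only row 2, while right is 3 4. The same precedence rule applies to the 2*i:2*(i+1) expression in the question.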

Source: https://stackoverflow.com/questions/27652050/is-there-a-way-to-define-a-subsequent-set-of-data-frame-in-r

Tags

dataframe

operation