How to split a data frame?

后端未结


I want to split a data frame into several smaller ones. This looks like a trivial question, but I cannot find a solution through web search.

8 Answers

臣服心动

2020-11-22 03:13

You could also use logical indexing:

```
data2 <- data[data$sum_points == 2500, ]
```

This creates a data frame containing only the rows where sum_points equals 2500.

It gives:

```
    airfoils sum_points field_points   init_t contour_t   field_t
...
491        5       2500         5625 0.000086  0.004272  6.321774
498        5       2500         5625 0.000087  0.004507  6.325083
504        5       2500         5625 0.000088  0.004370  6.336034
603        5        250        10000 0.000072  0.000525  1.111278
577        5        250        10000 0.000104  0.000559  1.111431
587        5        250        10000 0.000072  0.000528  1.111524
606        5        250        10000 0.000079  0.000538  1.111685
....
> data2 <- data[data$sum_points == 2500, ]
> data2
    airfoils sum_points field_points   init_t contour_t   field_t
108        5       2500          625 0.000082  0.004329  0.733109
106        5       2500          625 0.000102  0.004564  0.733243
117        5       2500          625 0.000087  0.004321  0.733274
112        5       2500          625 0.000081  0.004428  0.733587
```
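
One caveat with this kind of indexing: if the column contains NA, the comparison yields NA and `[` returns an extra row filled with NA. Here is a minimal sketch of wrapping the condition in which() to avoid that. The toy data frame is hypothetical and is not the poster's data.

```r
# Hypothetical toy data; sum_points contains one missing value
data <- data.frame(airfoils = c(5, 5, 5, 5),
                   sum_points = c(2500, 250, NA, 2500))

# Plain logical indexing keeps an all-NA row for the missing value
with_na <- data[data$sum_points == 2500, ]

# which() discards NA comparisons, so only true matches are kept
data2 <- data[which(data$sum_points == 2500), ]
```

If the column has no missing values, both forms give the same result.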


故里飘歌

2020-11-22 03:16
subset() is also useful:
```
subset(DATAFRAME, COLUMNNAME == "")
```
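As a concrete illustration of the placeholder call above, here is a small sketch on the built-in mtcars data (the choice of data set and columns is only an assumption for demonstration). subset() treats missing values in the condition as FALSE, and its select argument can restrict the columns.

```r
# Rows of the built-in mtcars data where cyl equals 4
four_cyl <- subset(mtcars, cyl == 4)

# select= optionally restricts the columns that are kept
four_cyl_small <- subset(mtcars, cyl == 4, select = c(mpg, hp))
```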
If this is for survey data, the survey package may also be pertinent:

http://faculty.washington.edu/tlumley/survey/
陌清茗

2020-11-22 03:21
If you want to split a data frame according to the values of some variable, I'd suggest using dlply() from the plyr package, which returns a list (daply() returns an array instead).
```
library(plyr)
# identity returns each group's rows unchanged, so x is a named list of data frames
x <- dlply(df, .(splitting_variable), identity)
```
Now, x is a list of data frames. To access one of them, index it by the name of the corresponding level of the splitting variable.
```
x$Level1
#or
x[["Level1"]]
```
Before splitting the data into many data frames, though, I'd make sure there isn't a cleverer way to deal with it.
無奈伤痛

2020-11-22 03:23
If you want to split by values in one of the columns, you can use lapply. For instance, to split ChickWeight into a separate dataset for each chick:
```
data(ChickWeight)
lapply(unique(ChickWeight$Chick), function(x) ChickWeight[ChickWeight$Chick == x,])
```
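The list returned by lapply() here is unnamed, so the pieces can only be reached by position. A possible refinement, sketched below, names each element after its chick. The variable names chick_ids, by_chick and chick1 are illustrative only.

```r
data(ChickWeight)

# One entry per distinct chick, in order of first appearance
chick_ids <- unique(ChickWeight$Chick)
by_chick <- lapply(chick_ids, function(id) ChickWeight[ChickWeight$Chick == id, ])

# Name each element after its chick so it can be looked up directly
names(by_chick) <- as.character(chick_ids)
chick1 <- by_chick[["1"]]
```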

感动是毒

2020-11-22 03:24

You may also want to cut the data frame into an arbitrary number of smaller dataframes. Here, we cut into two dataframes.

```
x = data.frame(num = 1:26, let = letters, LET = LETTERS)
set.seed(10)
split(x, sample(rep(1:2, 13)))
```

gives

```
$`1`
   num let LET
3    3   c   C
6    6   f   F
10  10   j   J
12  12   l   L
14  14   n   N
15  15   o   O
17  17   q   Q
18  18   r   R
20  20   t   T
21  21   u   U
22  22   v   V
23  23   w   W
26  26   z   Z

$`2`
   num let LET
1    1   a   A
2    2   b   B
4    4   d   D
5    5   e   E
7    7   g   G
8    8   h   H
9    9   i   I
11  11   k   K
13  13   m   M
16  16   p   P
19  19   s   S
24  24   x   X
25  25   y   Y
```
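
The same idea extends to more than two pieces. The sketch below assumes a hypothetical piece count k = 4 and shows two variants: consecutive chunks via cut(), and random chunks via shuffled group labels.

```r
x <- data.frame(num = 1:26, let = letters, LET = LETTERS)
k <- 4  # hypothetical number of pieces

# Consecutive chunks: cut() assigns row positions to k roughly equal intervals
chunks <- split(x, cut(seq_len(nrow(x)), breaks = k, labels = FALSE))

# Random chunks: shuffle the group labels 1..k before splitting
set.seed(10)
random_chunks <- split(x, sample(rep_len(1:k, nrow(x))))
```

rep_len() is used instead of rep() so that the labels still cover every row when the row count is not a multiple of k.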

You can also split a data frame based upon an existing column. For example, to create three data frames based on the cyl column in mtcars:

```
split(mtcars, mtcars$cyl)
```
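
Since split() returns a list named after the values of the grouping column, the pieces can be pulled out by name and, if needed, put back together with unsplit(). This is a short sketch; the variable names are illustrative.

```r
by_cyl <- split(mtcars, mtcars$cyl)

names(by_cyl)                       # "4" "6" "8"
six <- by_cyl[["6"]]                # the data frame for 6-cylinder cars
rows_per_piece <- sapply(by_cyl, nrow)

# unsplit() reverses the split and restores the original row order
restored <- unsplit(by_cyl, mtcars$cyl)
```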


余生分开走

2020-11-22 03:33
Splitting the data frame seems counter-productive. Instead, use the split-apply-combine paradigm. For example, generate some data:
```
df = data.frame(grp=sample(letters, 100, TRUE), x=rnorm(100))
```
Then split only the relevant column, apply the scale() function to x in each group, and combine the results (using split<- or ave):
```
df$z = 0
split(df$z, df$grp) = lapply(split(df$x, df$grp), scale)
## alternative: df$z = ave(df$x, df$grp, FUN=scale)
```
This will be very fast compared to splitting data.frames, and the result remains usable in downstream analysis without iteration. I think the dplyr syntax is
```
library(dplyr)
df %>% group_by(grp) %>% mutate(z=scale(x))
```
In general this dplyr solution is faster than splitting data frames but not as fast as split-apply-combine.
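
To check that the base R approach standardizes within groups, here is a small deterministic sketch using ave(). The data values are made up for illustration; after scaling, each group should have mean 0 and standard deviation 1.

```r
# Small deterministic example with hypothetical values
df <- data.frame(grp = rep(c("a", "b", "c"), each = 4),
                 x = c(1, 2, 3, 4, 10, 20, 30, 40, 5, 5, 6, 8))

# Scale x within each group, without ever splitting the data frame itself
df$z <- ave(df$x, df$grp, FUN = scale)

# Per-group summaries of the scaled column
group_means <- tapply(df$z, df$grp, mean)
group_sds <- tapply(df$z, df$grp, sd)
```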
