Multiple data.frames from one with almost random selection criteria

Posted by 感情迁移 on 2019-12-25 05:15:46

Question


This is a follow-up to the question "Extract multiple data.frames from one with selection criteria".

Let's say the data is the same as in that example:

df <- data.frame(x1 = runif(1000), x2 = runif(1000), x3 = runif(1000), 
             split = sample( c('SPLITMEHERE', 'OBS'), 1000, replace=TRUE, prob=c(0.1, 0.9) ))

Basically, I need a more general solution than the one in the quoted example.

Namely, some counties in some months (every month is a .txt file) have only 1 table, and therefore only one 'SPLITMEHERE'.

For example, the fourth county consists of only one table, which means this county's group should end at the first SPLITMEHERE, not the second. The last county consists of three tables, but this doesn't matter: it is at the end, so I can easily merge the last group.

The problem is that it is not always the fourth county that consists of one table; sometimes other counties do.

Let's say we have this from the above dataset:

              x1          x2          x3       split
1    0.591940061 0.635445182 0.304498259         SPLITMEHERE
2    0.510158838 0.170956885 0.881018211         OBS
3    0.938369076 0.495642515 0.171227120         OBS
4    0.366153042 0.464698494 0.550931566         OBS
5    0.051998873 0.222881187 0.934175135         OBS
6    0.706940809 0.735885367 0.666272118         SPLITMEHERE
7    0.244219533 0.340480033 0.144009797         OBS
8    0.546891246 0.024010211 0.151338479         OBS
9    0.032659978 0.174774606 0.576820824         OBS
10   0.641988559 0.575596526 0.911188682         OBS
11   0.111024861 0.969227957 0.643551420         OBS
12   0.179469011 0.052698538 0.199299193         OBS
13   0.199203707 0.429210222 0.525920379         SPLITMEHERE
14   0.837223042 0.556442838 0.881305105         OBS
15   0.628854814 0.874139058 0.199226364         OBS
16   0.618989684 0.784011205 0.038021599         OBS
17   0.421893407 0.394786134 0.519100402         OBS
18   0.126453054 0.926114653 0.687669218         OBS
19   0.739393898 0.938428464 0.110824400         OBS
20   0.582882966 0.198520021 0.942501112         OBS
21   0.143852453 0.963329219 0.993098109         OBS
22   0.249366828 0.242881240 0.486960755         OBS
23   0.060602695 0.797436479 0.432171847         SPLITMEHERE
24   0.013947914 0.028245990 0.489656647         OBS
25   0.795170730 0.541771474 0.122952446         OBS
26   0.786673408 0.284252650 0.305914856         OBS
27   0.591369056 0.321041728 0.285482027         OBS
28   0.899577535 0.468031873 0.588038383         SPLITMEHERE
29   0.955853329 0.552076328 0.825239050         OBS
30   0.634738808 0.050917396 0.730090024         OBS
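
As a starting point, each marker row can be turned into a running table index with cumsum(), which is the usual base-R idiom for this kind of split. The sketch below is only an illustration: `demo` is a hypothetical stand-in that reproduces the marker positions of the printout above (rows 1, 6, 13, 23 and 28), not the real data.

```r
# Hypothetical stand-in for the printed data: only the marker positions matter.
split_col <- rep("OBS", 30)
split_col[c(1, 6, 13, 23, 28)] <- "SPLITMEHERE"
demo <- data.frame(row = 1:30, split = split_col, stringsAsFactors = FALSE)

# TRUE on marker rows; the cumulative sum increases by one at each marker,
# so every row gets the index of the table it belongs to.
is_marker <- demo$split == "SPLITMEHERE"
table_id  <- cumsum(is_marker)

# Drop the marker rows themselves and split the rest by table index.
tables <- split(demo[!is_marker, ], table_id[!is_marker])
table_sizes <- sapply(tables, nrow)
```

This yields one data frame per table (five here), which still have to be merged into counties.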

Let's say there are three counties in the printed output, and I want three data frames as follows:

df1

1    0.510158838 0.170956885 0.881018211         OBS
2    0.938369076 0.495642515 0.171227120         OBS
3    0.366153042 0.464698494 0.550931566         OBS
4    0.051998873 0.222881187 0.934175135         OBS
5    0.244219533 0.340480033 0.144009797         OBS
6    0.546891246 0.024010211 0.151338479         OBS
7    0.032659978 0.174774606 0.576820824         OBS
8    0.641988559 0.575596526 0.911188682         OBS
9    0.111024861 0.969227957 0.643551420         OBS
10   0.179469011 0.052698538 0.199299193         OBS

df2

1   0.837223042 0.556442838 0.881305105         OBS
2   0.628854814 0.874139058 0.199226364         OBS
3   0.618989684 0.784011205 0.038021599         OBS
4   0.421893407 0.394786134 0.519100402         OBS
5   0.126453054 0.926114653 0.687669218         OBS
6   0.739393898 0.938428464 0.110824400         OBS
7   0.582882966 0.198520021 0.942501112         OBS
8   0.143852453 0.963329219 0.993098109         OBS
9   0.249366828 0.242881240 0.486960755         OBS

df3


1   0.013947914 0.028245990 0.489656647         OBS
2   0.795170730 0.541771474 0.122952446         OBS
3   0.786673408 0.284252650 0.305914856         OBS
4   0.591369056 0.321041728 0.285482027         OBS
5   0.955853329 0.552076328 0.825239050         OBS
6   0.634738808 0.050917396 0.730090024         OBS
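
One possible way to get from tables to counties, sketched under a strong assumption: the number of tables per county is known from somewhere else (the question does not say where). The vector `tables_per_county` below is hypothetical and is chosen to match the desired output above (two tables, one table, two tables); `demo` is again a stand-in with the same marker positions as the printout. The sketch also assumes the first row is a marker row.

```r
# Hypothetical stand-in for the printed data: only the marker positions matter.
split_col <- rep("OBS", 30)
split_col[c(1, 6, 13, 23, 28)] <- "SPLITMEHERE"
demo <- data.frame(row = 1:30, split = split_col, stringsAsFactors = FALSE)

# ASSUMPTION: how many tables each county has is supplied externally.
tables_per_county <- c(2, 1, 2)

# Running table index: increases by one at every marker row.
is_marker <- demo$split == "SPLITMEHERE"
table_id  <- cumsum(is_marker)

# Lookup from table index to county index: tables 1-2 -> county 1,
# table 3 -> county 2, tables 4-5 -> county 3.
county_of_table <- rep(seq_along(tables_per_county), times = tables_per_county)
county_id <- county_of_table[table_id]

# Drop marker rows, split by county, and renumber rows within each county.
counties <- split(demo[!is_marker, ], county_id[!is_marker])
counties <- lapply(counties, function(d) { rownames(d) <- NULL; d })
county_sizes <- sapply(counties, nrow)
```

With these inputs the three list elements have 10, 9 and 6 rows, matching df1, df2 and df3. If the table counts differ per month, only `tables_per_county` needs to change.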

Any ideas?

Source: https://stackoverflow.com/questions/44021913/multiple-data-frames-from-one-with-almost-random-selection-criteria
