Question
I've been practicing wrangling R data frames with columns that contain lubridate data types, as in the example problem in my other question.
Now I am trying to join two data frames by whether a timestamp in one data frame falls within an interval in the other. For example:
This is df1:
> glimpse(df1)
Observations: 6,160
Variables: 4
$ upload_id <int> 2, 2, 2, 2, 2, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, ...
$ site_id <int> 2, 2, 2, 2, 2, 4, 4, 7, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, ...
$ segment_id <int> 1, 2, 3, 4, 5, 1, 2, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, ...
$ interval <S4: Interval> 2015-04-12 UTC--2015-04-19 UTC, 2015-04-19 UTC--201...
It contains a set of lubridate time intervals, each with a corresponding unique combination of upload_id, site_id, and segment_id.
And this is df2:
> glimpse(df2)
Observations: 32,385
Variables: 3
$ sequence_id <int> 2047, 2067, 2069, 2072, 2075, 2081, 2086, 2091, 2096, 2104,...
$ upload_id <int> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 5, 5,...
$ taken <dttm> 2015-04-11 23:09:59, 2015-04-15 19:17:10, 2015-04-16 07:42...
It contains a series of timestamps in the column taken, each with a corresponding unique combination of sequence_id and upload_id.
Essentially, I want to left_join(df2, df1) where the by argument considers two things: (1) the shared upload_id column; and (2) whether taken in df2 falls within interval in df1. This is because any given taken might fall %within% multiple intervals, and vice versa, so I want to use upload_id as an identifier for each taken so that each taken in df2 is matched to only one row in df1. After the join, I expect the new data frame to have six columns: sequence_id, taken, upload_id, site_id, segment_id, and interval. How can this be done tidily?
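To make the matching condition concrete, here is a minimal sketch of the %within% test on its own. The two timestamps and the interval bounds are taken from the glimpse() output above; the object names taken_toy and iv_toy are made up for illustration.

```r
library(lubridate)

# Two timestamps like those in df2$taken (UTC)
taken_toy <- ymd_hms(c("2015-04-11 23:09:59", "2015-04-15 19:17:10"), tz = "UTC")

# One interval like the first entry of df1$interval
iv_toy <- interval(ymd("2015-04-12", tz = "UTC"), ymd("2015-04-19", tz = "UTC"))

# TRUE where a timestamp lies inside the interval (endpoints included)
inside <- taken_toy %within% iv_toy
inside
```

The first timestamp falls just before the interval starts and the second falls inside it, so only the second would qualify for a match with that row of df1.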
EDIT: One comment suggested that uploaded .Rdata files may be untrustworthy, and another stated that sharing them is against the policy here. So I removed the .Rdata files and instead took a 300-row subset of each data frame via dput(). Here is df1:
structure(list(upload_id = c(1050L, 1582L, 2336L, 2665L, 1007L,
2148L, 275L, 2738L, 1501L, 64L, 2737L, 1547L, 2146L, 2596L, 457L,
2141L, 2790L, 362L, 2835L, 2741L, 575L, 914L, 2820L, 2572L, 2791L,
2157L, 1117L, 1535L, 2738L, 794L, 1335L, 2737L, 2570L, 1597L,
300L, 460L, 1701L, 2142L, 274L, 339L, 2109L, 500L, 2184L, 2837L,
1238L, 2837L, 2727L, 1175L, 1524L, 303L, 1714L, 1412L, 1894L,
340L, 1495L, 869L, 995L, 2438L, 1974L, 2762L, 205L, 1581L, 1527L,
2818L, 1617L, 2537L, 1956L, 638L, 1808L, 2151L, 771L, 2709L,
2185L, 2015L, 2511L, 1163L, 2557L, 1377L, 2213L, 2560L, 1417L,
1934L, 1860L, 2772L, 2614L, 2698L, 421L, 2609L, 1418L, 2355L,
463L, 2697L, 347L, 1531L, 1427L, 2548L, 2218L, 2781L, 1962L,
396L, 234L, 2846L, 4L, 2742L, 2838L, 1676L, 1635L, 2810L, 1990L,
2514L, 2809L, 1354L, 2668L, 2737L, 1606L, 764L, 1176L, 1442L,
519L, 2584L, 1021L, 352L, 2314L, 2662L, 1368L, 1043L, 2207L,
2792L, 684L, 1806L, 2743L, 2557L, 1971L, 1510L, 418L, 1866L,
1569L, 1717L, 1992L, 1629L, 2189L, 316L, 2030L, 2840L, 2307L,
1506L, 1962L, 1249L, 2791L, 670L, 592L, 236L, 2781L, 793L, 2790L,
2640L, 2517L, 855L, 626L, 1303L, 2241L, 1541L, 910L, 155L, 1617L,
29L, 916L, 732L, 2006L, 2742L, 2788L, 2830L, 2664L, 1455L, 1062L,
937L, 1543L, 781L, 737L, 901L, 2633L, 194L, 1000L, 1170L, 1567L,
2826L, 73L, 801L, 970L, 1327L, 2688L, 1538L, 2306L, 2170L, 1977L,
2367L, 186L, 1990L, 2606L, 2000L, 2818L, 396L, 696L, 630L, 2835L,
2067L, 1540L, 51L, 511L, 2587L, 2737L, 1961L, 594L, 1867L, 1042L,
116L, 1532L, 760L, 2662L, 2814L, 2585L, 2596L, 2837L, 1870L,
1971L, 73L, 2595L, 1955L, 692L, 2062L, 2742L, 2084L, 1098L, 2205L,
1404L, 2627L, 809L, 2684L, 2570L, 322L, 2605L, 2016L, 2782L,
54L, 2254L, 1165L, 655L, 532L, 732L, 534L, 2664L, 1880L, 1444L,
1920L, 477L, 2728L, 2640L, 1434L, 100L, 2587L, 1545L, 250L, 282L,
1756L, 940L, 2826L, 1005L, 2835L, 2152L, 203L, 1970L, 579L, 1234L,
2682L, 1050L, 2594L, 199L, 945L, 758L, 1262L, 796L, 2156L, 921L,
1961L, 817L, 486L, 982L, 394L, 1928L, 2237L, 2570L, 2144L, 2386L,
325L, 2729L, 2685L, 901L, 2042L, 141L, 2248L), site_id = c(184L,
278L, 73L, 364L, 231L, 244L, 72L, 364L, 74L, 52L, 350L, 248L,
223L, 306L, 117L, 223L, 350L, 115L, 357L, 295L, 113L, 74L, 350L,
348L, 364L, 267L, 74L, 248L, 364L, 198L, 73L, 350L, 347L, 260L,
103L, 134L, 271L, 223L, 72L, 120L, 73L, 145L, 214L, 350L, 74L,
350L, 361L, 227L, 160L, 73L, 73L, 237L, 292L, 110L, 267L, 205L,
230L, 74L, 306L, 295L, 47L, 261L, 44L, 357L, 280L, 355L, 199L,
119L, 160L, 73L, 186L, 348L, 214L, 295L, 348L, 160L, 306L, 74L,
191L, 350L, 73L, 191L, 191L, 364L, 306L, 364L, 74L, 73L, 74L,
74L, 155L, 350L, 54L, 248L, 260L, 114L, 241L, 360L, 292L, 31L,
36L, 73L, 7L, 360L, 364L, 74L, 262L, 361L, 292L, 350L, 360L,
256L, 73L, 350L, 280L, 184L, 44L, 258L, 146L, 347L, 217L, 44L,
113L, 357L, 191L, 233L, 245L, 360L, 156L, 293L, 360L, 306L, 292L,
226L, 74L, 36L, 73L, 73L, 199L, 244L, 241L, 110L, 295L, 361L,
248L, 251L, 292L, 113L, 364L, 74L, 160L, 105L, 360L, 202L, 350L,
306L, 351L, 201L, 160L, 247L, 320L, 248L, 213L, 54L, 280L, 41L,
198L, 187L, 74L, 360L, 357L, 287L, 350L, 44L, 234L, 105L, 248L,
200L, 174L, 198L, 73L, 54L, 217L, 236L, 277L, 361L, 63L, 194L,
160L, 73L, 361L, 248L, 320L, 74L, 293L, 73L, 68L, 292L, 350L,
199L, 357L, 31L, 166L, 165L, 357L, 312L, 248L, 42L, 148L, 350L,
350L, 147L, 116L, 248L, 174L, 47L, 226L, 74L, 357L, 73L, 348L,
306L, 350L, 293L, 292L, 63L, 348L, 298L, 174L, 316L, 360L, 312L,
227L, 319L, 237L, 350L, 160L, 348L, 347L, 108L, 306L, 293L, 361L,
54L, 74L, 74L, 73L, 56L, 187L, 74L, 350L, 199L, 74L, 271L, 56L,
360L, 306L, 226L, 72L, 350L, 248L, 90L, 91L, 74L, 44L, 361L,
217L, 357L, 73L, 55L, 191L, 73L, 226L, 347L, 184L, 357L, 95L,
218L, 196L, 249L, 197L, 74L, 74L, 147L, 199L, 145L, 217L, 136L,
295L, 73L, 347L, 223L, 113L, 47L, 350L, 350L, 198L, 310L, 23L,
74L), segment_id = c(3L, 1L, 1L, 1L, 1L, 2L, 1L, 5L, 1L, 1L,
7L, 1L, 2L, 7L, 1L, 1L, 3L, 3L, 7L, 1L, 2L, 1L, 8L, 2L, 11L,
1L, 1L, 3L, 6L, 1L, 1L, 8L, 2L, 2L, 4L, 5L, 3L, 1L, 1L, 1L, 1L,
3L, 1L, 17L, 1L, 3L, 4L, 2L, 1L, 1L, 1L, 2L, 1L, 1L, 3L, 3L,
5L, 1L, 1L, 2L, 1L, 1L, 2L, 7L, 4L, 2L, 3L, 1L, 1L, 1L, 3L, 3L,
1L, 6L, 2L, 2L, 5L, 1L, 2L, 5L, 1L, 2L, 3L, 2L, 4L, 3L, 1L, 1L,
2L, 1L, 4L, 13L, 3L, 2L, 1L, 2L, 3L, 6L, 5L, 5L, 3L, 1L, 2L,
7L, 10L, 1L, 1L, 1L, 7L, 4L, 2L, 2L, 1L, 9L, 1L, 1L, 1L, 10L,
3L, 4L, 6L, 1L, 4L, 9L, 1L, 1L, 1L, 10L, 2L, 1L, 4L, 4L, 1L,
1L, 1L, 1L, 1L, 1L, 8L, 1L, 1L, 1L, 7L, 15L, 2L, 8L, 7L, 3L,
6L, 1L, 1L, 1L, 8L, 1L, 23L, 4L, 3L, 2L, 2L, 2L, 2L, 4L, 1L,
1L, 3L, 2L, 5L, 1L, 1L, 6L, 5L, 1L, 12L, 2L, 2L, 1L, 1L, 3L,
1L, 2L, 1L, 2L, 5L, 2L, 1L, 6L, 4L, 2L, 1L, 1L, 1L, 3L, 1L, 1L,
2L, 1L, 4L, 5L, 5L, 7L, 4L, 17L, 1L, 2L, 2L, 1L, 1L, 1L, 3L,
1L, 18L, 4L, 1L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 8L, 6L, 2L, 1L,
6L, 1L, 1L, 2L, 1L, 1L, 10L, 1L, 1L, 1L, 2L, 10L, 1L, 15L, 4L,
4L, 3L, 4L, 12L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 11L, 1L, 1L, 2L,
2L, 2L, 7L, 3L, 1L, 2L, 4L, 2L, 2L, 1L, 2L, 16L, 2L, 4L, 1L,
2L, 1L, 1L, 2L, 14L, 1L, 4L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 6L,
1L, 1L, 3L, 1L, 2L, 1L, 7L, 2L, 1L, 2L, 2L, 15L, 6L, 1L, 1L,
1L), interval = new("Interval", .Data = c(604800, 86400, 86400,
259200, 604800, 604800, 604800, 604800, 86400, 86400, 604800,
604800, 518400, 604800, 86400, 604800, 604800, 604800, 604800,
518400, 604800, 86400, 604800, 604800, 259200, 604800, 86400,
604800, 604800, 518400, 172800, 604800, 604800, 604800, 172800,
432000, 604800, 604800, 259200, 432000, 86400, 604800, 432000,
604800, 86400, 604800, 604800, 604800, 604800, 86400, 86400,
604800, 604800, 604800, 604800, 172800, 604800, 345600, 518400,
604800, 345600, 604800, 86400, 86400, 604800, 604800, 604800,
604800, 604800, 86400, 86400, 604800, 518400, 604800, 604800,
86400, 604800, 86400, 86400, 604800, 604800, 432000, 604800,
604800, 604800, 604800, 86400, 86400, 259200, 86400, 604800,
604800, 259200, 604800, 604800, 604800, 259200, 604800, 604800,
604800, 604800, 86400, 604800, 604800, 604800, 172800, 604800,
604800, 604800, 432000, 604800, 604800, 86400, 604800, 604800,
518400, 518400, 604800, 604800, 604800, 172800, 604800, 604800,
86400, 604800, 604800, 604800, 604800, 86400, 518400, 604800,
604800, 604800, 518400, 518400, 604800, 86400, 86400, 172800,
604800, 604800, 259200, 604800, 604800, 604800, 604800, 432000,
604800, 604800, 86400, 604800, 432000, 604800, 604800, 604800,
604800, 604800, 86400, 518400, 604800, 604800, 604800, 604800,
518400, 604800, 604800, 604800, 604800, 172800, 604800, 86400,
604800, 604800, 604800, 345600, 604800, 604800, 604800, 604800,
604800, 86400, 86400, 345600, 172800, 172800, 604800, 604800,
518400, 604800, 86400, 604800, 604800, 604800, 172800, 604800,
86400, 86400, 604800, 604800, 604800, 604800, 432000, 604800,
604800, 604800, 172800, 604800, 345600, 604800, 604800, 604800,
604800, 604800, 604800, 172800, 604800, 172800, 86400, 604800,
86400, 604800, 604800, 604800, 604800, 604800, 604800, 604800,
604800, 604800, 86400, 518400, 259200, 604800, 604800, 604800,
604800, 432000, 604800, 604800, 86400, 604800, 604800, 604800,
259200, 86400, 86400, 86400, 518400, 86400, 86400, 604800, 604800,
259200, 345600, 604800, 604800, 604800, 604800, 172800, 604800,
604800, 259200, 604800, 86400, 86400, 604800, 604800, 604800,
86400, 172800, 604800, 86400, 604800, 604800, 604800, 172800,
432000, 604800, 518400, 345600, 518400, 86400, 86400, 604800,
604800, 604800, 604800, 172800, 604800, 86400, 604800, 518400,
86400, 604800, 604800, 518400, 172800, 259200, 86400, 86400),
start = structure(c(1463097600, 1479081600, 1499817600, 1511654400,
1464912000, 1493337600, 1440028800, 1514073600, 1478995200,
1438128000, 1507593600, 1475193600, 1491782400, 1507593600,
1445212800, 1487462400, 1505174400, 1445731200, 1519084800,
1515456000, 1449964800, 1463529600, 1508198400, 1504483200,
1517702400, 1485648000, 1468195200, 1476403200, 1514678400,
1460073600, 1472860800, 1508198400, 1504483200, 1475798400,
1444348800, 1451692800, 1481587200, 1488153600, 1439769600,
1445126400, 1492732800, 1449446400, 1494201600, 1513641600,
1470441600, 1505174400, 1510704000, 1469145600, 1478563200,
1444780800, 1483228800, 1475280000, 1485129600, 1444867200,
1477267200, 1462492800, 1464652800, 1503532800, 1488931200,
1516060800, 1441584000, 1475884800, 1479772800, 1519084800,
1478908800, 1505952000, 1486598400, 1444608000, 1485216000,
1493942400, 1459814400, 1505088000, 1494201600, 1488240000,
1504483200, 1469491200, 1506384000, 1474502400, 1495411200,
1506384000, 1475366400, 1487548800, 1485734400, 1512259200,
1505779200, 1512864000, 1448496000, 1509494400, 1475884800,
1500422400, 1448582400, 1511222400, 1444348800, 1474416000,
1475193600, 1506038400, 1495411200, 1513036800, 1487548800,
1439856000, 1441497600, 1519948800, 1428192000, 1513641600,
1517097600, 1481673600, 1475884800, 1508889600, 1488758400,
1505779200, 1510617600, 1471305600, 1511913600, 1508803200,
1477094400, 1457481600, 1469577600, 1473206400, 1449187200,
1505692800, 1465776000, 1444694400, 1497744000, 1511827200,
1473465600, 1465516800, 1494892800, 1515456000, 1454803200,
1485216000, 1511827200, 1505779200, 1485129600, 1478649600,
1447977600, 1465516800, 1479945600, 1483315200, 1489622400,
1479340800, 1494201600, 1444867200, 1488844800, 1517356800,
1495756800, 1477785600, 1488758400, 1468800000, 1514678400,
1455753600, 1452556800, 1442534400, 1514246400, 1456617600,
1517270400, 1505779200, 1505606400, 1462147200, 1453852800,
1471824000, 1495584000, 1477008000, 1462579200, 1439596800,
1478304000, 1433808000, 1462492800, 1457395200, 1489881600,
1513036800, 1517875200, 1518912000, 1510617600, 1476230400,
1466121600, 1463443200, 1475193600, 1458432000, 1457395200,
1460678400, 1510617600, 1441324800, 1465171200, 1469491200,
1477872000, 1511913600, 1439510400, 1460332800, 1464134400,
1472774400, 1508889600, 1476403200, 1494979200, 1494460800,
1485820800, 1501027200, 1441324800, 1487548800, 1506384000,
1489017600, 1517270400, 1447113600, 1455580800, 1453680000,
1516060800, 1491264000, 1475193600, 1437696000, 1449446400,
1503964800, 1514246400, 1487030400, 1452124800, 1485216000,
1464825600, 1438905600, 1479772800, 1459641600, 1506988800,
1518739200, 1508112000, 1506988800, 1504569600, 1485216000,
1488153600, 1437696000, 1503878400, 1487808000, 1455321600,
1489881600, 1515456000, 1491609600, 1466121600, 1494201600,
1471651200, 1509408000, 1460592000, 1512345600, 1505692800,
1445040000, 1505174400, 1487030400, 1515542400, 1437868800,
1496620800, 1469577600, 1455235200, 1450224000, 1.458e+09,
1450828800, 1510012800, 1485388800, 1476835200, 1487894400,
1447977600, 1510617600, 1507593600, 1474934400, 1438905600,
1504569600, 1477008000, 1443312000, 1443312000, 1484524800,
1464048000, 1517961600, 1463356800, 1517270400, 1494028800,
1441238400, 1488758400, 1452643200, 1470700800, 1511740800,
1461888000, 1508803200, 1441238400, 1463616000, 1455062400,
1471478400, 1460073600, 1494115200, 1463616000, 1488240000,
1460073600, 1448236800, 1463961600, 1447372800, 1485820800,
1496102400, 1507507200, 1489968000, 1499126400, 1444176000,
1504569600, 1512432000, 1463097600, 1490745600, 1440028800,
1496448000), class = c("POSIXct", "POSIXt"), tzone = "UTC"),
tzone = "UTC")), row.names = c(NA, -300L), class = c("tbl_df",
"tbl", "data.frame"))
And here is df2:
structure(list(sequence_id = c(10545297L, 5696697L, 26853675L,
26800598L, 5477912L, 3564676L, 11545989L, 26788357L, 26790778L,
4682984L, 12887744L, 4254651L, 6472328L, 18236650L, 26829066L,
26784117L, 26886686L, 797197L, 26820954L, 26791541L, 11657412L,
3960964L, 10189029L, 21286407L, 12914356L, 26793531L, 26802965L,
12435451L, 5484298L, 26827162L, 26853752L, 25711869L, 9030699L,
14386264L, 26802894L, 26377583L, 13291447L, 1851672L, 26790782L,
9900386L, 26797667L, 6561255L, 26818879L, 11648069L, 14259988L,
26809952L, 26809264L, 15071783L, 26791374L, 26853008L, 6762100L,
26853620L, 26880265L, 26878102L, 26809279L, 26787754L, 5502014L,
17810813L, 18236753L, 5568166L, 9252741L, 26786093L, 18418962L,
1218679L, 26801395L, 16954415L, 26853619L, 26800113L, 26817488L,
26811724L, 26809375L, 26809666L, 5869152L, 7681085L, 26894216L,
15810230L, 26829083L, 26817434L, 26789887L, 26785533L, 26796803L,
26786930L, 26825007L, 26784040L, 26810066L, 26853657L, 18236660L,
26797322L, 26825026L, 4103811L, 26878149L, 10545137L, 26784075L,
26902434L, 3948950L, 26816568L, 11453844L, 26826969L, 26813846L,
26897750L, 26802715L, 26790888L, 26815971L, 26797683L, 4726015L,
4617411L, 26797067L, 9252726L, 26797067L, 26785670L, 26789320L,
26901211L, 26894241L, 499985L, 26825082L, 21774171L, 26803324L,
26815122L, 56056L, 18236919L, 5425808L, 13209778L, 4726052L,
14386262L, 5477952L, 5564830L, 9756473L, 26894173L, 7136912L,
26792378L, 26878986L, 7726907L, 26903079L, 9517618L, 10730383L,
21774142L, 26901299L, 15071807L, 26786514L, 26901389L, 26903784L,
26802651L, 7817686L, 26805379L, 4617432L, 21624158L, 9656749L,
26789389L, 25399602L, 26901650L, 26797702L, 9900332L, 10965877L,
15268795L, 26896376L, 26787716L, 26851798L, 15810222L, 12887738L,
26827055L, 16102402L, 26796994L, 26784422L, 14725739L, 26901257L,
26853712L, 26785221L, 26793075L, 11658007L, 26823570L, 26791524L,
26797467L, 26796972L, 8501567L, 26799777L, 5572466L, 26787249L,
18385461L, 4791179L, 15810380L, 26808430L, 10239023L, 26790569L,
26805358L, 18158022L, 15810244L, 26878116L, 10623114L, 267502L,
9517623L, 16102411L, 26377567L, 8230310L, 13076594L, 26878082L,
415271L, 13833529L, 26823199L, 2410L, 26900200L), upload_id = c(851L,
592L, 2314L, 1799L, 546L, 357L, 925L, 299L, 1611L, 465L, 976L,
424L, 641L, 1249L, 2274L, 1436L, 2556L, 157L, 2166L, 1666L, 928L,
388L, 836L, 1405L, 977L, 1698L, 1928L, 961L, 547L, 2261L, 2316L,
1486L, 774L, 1038L, 1920L, 1503L, 993L, 229L, 1611L, 819L, 1767L,
651L, 2151L, 927L, 1034L, 2049L, 2028L, 1074L, 1629L, 2302L,
666L, 2314L, 2434L, 2387L, 2028L, 392L, 557L, 1217L, 1249L, 564L,
783L, 883L, 1265L, 179L, 1846L, 1159L, 2314L, 1783L, 2138L, 2079L,
2035L, 2045L, 594L, 736L, 2569L, 1102L, 2277L, 2089L, 52L, 1025L,
1746L, 669L, 2230L, 1506L, 2055L, 2314L, 1249L, 1757L, 2230L,
406L, 2387L, 851L, 1506L, 2787L, 385L, 2128L, 922L, 2251L, 2102L,
2711L, 1907L, 1605L, 2125L, 1767L, 459L, 458L, 1746L, 783L, 1746L,
1000L, 98L, 2750L, 2569L, 122L, 2230L, 1416L, 1929L, 2110L, 41L,
1249L, 542L, 985L, 459L, 1038L, 546L, 563L, 815L, 2569L, 681L,
1665L, 2419L, 738L, 2821L, 792L, 879L, 1416L, 2751L, 1074L, 779L,
2755L, 2849L, 1904L, 740L, 1951L, 458L, 1399L, 810L, 98L, 1479L,
2760L, 1767L, 819L, 891L, 1086L, 2693L, 440L, 2292L, 1102L, 976L,
2257L, 1106L, 1746L, 1442L, 1055L, 2751L, 2314L, 1400L, 1680L,
929L, 2194L, 1661L, 1765L, 1746L, 769L, 1774L, 570L, 572L, 1264L,
473L, 1102L, 2009L, 838L, 1586L, 1951L, 1235L, 1102L, 2387L,
864L, 95L, 792L, 1106L, 1503L, 762L, 984L, 2387L, 120L, 1012L,
1681L, 5L, 2722L), taken = structure(c(1461607098, 1357440699,
1497946386, 1480535568, 1450529748, 1446385695, 1463741872, 1444334424,
1479280400, 1449136788, 1462488333, 1448183687, 1454753449, 1467598406,
1497333513, 1475588136, 1507455271, 1440251873, 1494085620, 1481115392,
1463814473, 1441262063, 1461931738, 1471111946, 1462814426, 1482484495,
1488369500, 1463341759, 1451394079, 1496897690, 1499171773, 1478337380,
1459646439, 1465542945, 1487492476, 1478507314, 1465151499, 1440878596,
1479297148, 1461237979, 1484471493, 1455032917, 1493960869, 1462284996,
1465967563, 1490769440, 1490547948, 1458713033, 1480133603, 1498456304,
1454837375, 1497347897, 1502541854, 1499517904, 1490563199, 1443806209,
1451728803, 1469188230, 1468317942, 1452000085, 1459446443, 1462629579,
1469694294, 1438787731, 1486631809, 1469203046, 1497347627, 1485346076,
1493760152, 1491737060, 1490640549, 1490971607, 1452390124, 1458148243,
1506439827, 1465194751, 1497427230, 1493546423, 1437499385, 1465909309,
1479587401, 1455275863, 1494462120, 1475150180, 1486585139, 1497692625,
1467632404, 1483992126, 1494818410, 1443259589, 1499966514, 1461252282,
1476463125, 1517825105, 1439276459, 1492732155, 1463060151, 1496495881,
1492443646, 1513698078, 1487699018, 1478033857, 1493459209, 1484574255,
1445463014, 1445377602, 1482270132, 1459068085, 1482270132, 1465324190,
1437645893, 1516448011, 1506768001, 1439499230, 1495154336, 1475995917,
1487326465, 1492842646, 1437512735, 1471084135, 1451331488, 1464596049,
1445487433, 1465542768, 1450654515, 1450251138, 1458756627, 1505539318,
1456158745, 1481191991, 1502958079, 1456851898, 1519301621, 1460132323,
1462246721, 1475745018, 1516537759, 1459318655, 1460122320, 1514916703,
1520412137, 1488024066, 1458195162, 1487453288, 1445389049, 1474006970,
1459754632, 1438269539, 1477661255, 1516007192, 1484753445, 1461136855,
1463031275, 1466667291, 1509613313, 1441042946, 1497589967, 1465033581,
1462417047, 1496682390, 1467178192, 1481293492, 1469788770, 1462814225,
1516529474, 1498386350, 1470051133, 1481928052, 1463302826, 1495262048,
1480681123, 1483683739, 1481041639, 1459773430, 1484652813, 1451208417,
1451471584, 1467788032, 1445564488, 1466521584, 1490178592, 1461418924,
1478867863, 1486761277, 1470424975, 1465375208, 1499603574, 1462529520,
1438348434, 1460184847, 1467258314, 1478446800, 1457830628, 1464092571,
1499339617, 1439448916, 1465530027, 1491299676, 1431043226, 1511424274
), class = c("POSIXct", "POSIXt"), tzone = "UTC")), row.names = c(NA,
-200L), class = c("tbl_df", "tbl", "data.frame"))
The problem with these subsets is that I'm not sure how much overlap remains between them for the join, but hopefully there is some. I tried to filter() one to include only the upload_ids from the other, but I get an error saying:
Error in filter_impl(.data, quo) : Column `interval` classes Period and Interval from lubridate are currently not supported.
Sorry if this sounds complicated; please let me know if I can clarify this question further. I am truly grateful for your help!
Answer 1:
You can use the fuzzyjoin package:
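library(BiocManager) # only needed to install IRanges, which fuzzyjoin's interval joins require
library(lubridate)
library(fuzzyjoin)
# Rename taken to start so both data frames share the join column names
colnames(df2) <- c("sequence_id", "upload_id", "start")
# Split each interval into explicit start and end columns
df1$start <- int_start(df1$interval)
df1$end <- int_end(df1$interval)
# Treat each timestamp as a zero-length interval
df2$end <- df2$start
# Keep pairs of rows whose intervals overlap; note that this does not match on upload_id
df3 <- interval_inner_join(df1, df2, by=c("start", "end"))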
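The join above pairs rows on time overlap alone, so a timestamp can still be matched to an interval from a different upload_id. Below is a minimal sketch of one way to add that condition using only dplyr and lubridate: join on upload_id first, then keep the rows where taken lies between the interval's start and end. The toy data frames df1_toy and df2_toy are hypothetical stand-ins, not the question's data, and the interval is stored as plain start and end columns because the question reports that dplyr rejected Interval columns.

```r
library(dplyr)
library(lubridate)

# Hypothetical intervals; with the real data these would be df1$interval
iv <- interval(
  as.POSIXct(c("2015-04-12", "2015-04-19", "2015-04-12"), tz = "UTC"),
  as.POSIXct(c("2015-04-19", "2015-04-26", "2015-04-19"), tz = "UTC")
)

# Toy version of df1, with the interval split into POSIXct bounds
df1_toy <- data.frame(
  upload_id  = c(2L, 2L, 3L),
  site_id    = c(2L, 2L, 4L),
  segment_id = c(1L, 2L, 1L),
  start      = int_start(iv),
  end        = int_end(iv)
)

# Toy version of df2
df2_toy <- data.frame(
  sequence_id = c(2067L, 2104L, 3001L),
  upload_id   = c(2L, 2L, 3L),
  taken       = as.POSIXct(
    c("2015-04-15 19:17:10", "2015-04-20 08:00:00", "2015-04-13 12:00:00"),
    tz = "UTC"
  )
)

# Step 1: pair every timestamp with every interval of the same upload_id.
# Recent dplyr versions may warn about a many-to-many match here; that is
# expected, because step 2 removes the unwanted pairs.
# Step 2: keep pairs where taken lies inside the interval (endpoints included).
joined <- df2_toy %>%
  inner_join(df1_toy, by = "upload_id") %>%
  filter(taken >= start, taken <= end)

joined
```

In this toy data the third timestamp also lies inside the first interval of upload 2, but it is only paired with the interval of its own upload 3. Two caveats: timestamps with no matching interval are dropped rather than kept with NA as a left join would, and the intermediate table grows with the number of intervals per upload_id. Newer dplyr releases (1.1.0 and later) also provide join_by() with a between() helper, which should let the same condition be written inside the join itself.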
Source: https://stackoverflow.com/questions/51412533/joining-data-frames-by-lubridate-date-within-intervals