Question
library(tidyverse)
library(lubridate)
library(padr)
df
#> # A tibble: 828 x 5
#> Scar_Id Code Type Value YrMo
#> <chr> <chr> <chr> <date> <date>
#> 1 0070-179 AA Start_Date 2020-04-22 2020-04-01
#> 2 0070-179 AA Closure_Date 2020-05-23 2020-05-01
#> 3 1139-179 AA Start_Date 2020-04-23 2020-04-01
#> 4 1139-179 AA Closure_Date 2020-05-23 2020-05-01
#> 5 262-179 AA Start_Date 2019-08-29 2019-08-01
#> 6 262-179 AA Closure_Date 2020-05-23 2020-05-01
#> 7 270-179 AA Start_Date 2019-08-29 2019-08-01
#> 8 270-179 AA Closure_Date 2020-05-23 2020-05-01
#> 9 476-179 BB Start_Date 2019-09-04 2019-09-01
#> 10 476-179 BB Closure_Date 2019-11-04 2019-11-01
#> # ... with 818 more rows
I have an R data frame named df, shown above. I want to concentrate on rows 5 and 6. I can usually use the padr package to pad the months between those two rows. The pad() function in padr adds rows at an interval the user specifies; the added rows are marked "X" below.
#> 1 0070-179 AA Start_Date 2020-04-22 2020-04-01
#> 2 0070-179 AA Closure_Date 2020-05-23 2020-05-01
#> 3 1139-179 AA Start_Date 2020-04-23 2020-04-01
#> 4 1139-179 AA Closure_Date 2020-05-23 2020-05-01
#> 5 262-179 AA Start_Date 2019-08-29 2019-08-01
#> X 262-179 NA NA NA 2019-09-01
#> X 262-179 NA NA NA 2019-10-01
#> X 262-179 NA NA NA 2019-11-01
#> X 262-179 NA NA NA 2019-12-01
#> X 262-179 NA NA NA 2020-01-01
#> X 262-179 NA NA NA 2020-02-01
#> X 262-179 NA NA NA 2020-03-01
#> X 262-179 NA NA NA 2020-04-01
#> 6 262-179 AA Closure_Date 2020-05-23 2020-05-01
#> 7 270-179 AA Start_Date 2019-08-29 2019-08-01
#> 8 270-179 AA Closure_Date 2020-05-23 2020-05-01
#> 9 476-179 BB Start_Date 2019-09-04 2019-09-01
#> 10 476-179 BB Closure_Date 2019-11-04 2019-11-01
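A minimal sketch of that expected padding, using a made-up tibble named toy (hypothetical data, not the real df). It assumes only that pad() accepts the group, by, and interval arguments used elsewhere in this question:

```r
library(tibble)
library(padr)

# Hypothetical toy data: two groups, each with a gap-free or gapped
# sequence of month-start dates.
toy <- tibble(
  Scar_Id = c("A", "A", "B", "B"),
  YrMo    = as.Date(c("2019-08-01", "2019-11-01",
                      "2020-04-01", "2020-05-01"))
)

# Pad each Scar_Id separately so that every month between its first
# and last YrMo gets a row.
padded <- pad(toy, group = "Scar_Id", by = "YrMo", interval = "month")

print(padded)
```

Here group A should gain rows for the months between its two dates, while group B, whose dates are already consecutive months, should come back unchanged.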
To get there I usually issue a command such as the one shown below, and it works fine in padr. But it doesn't work in my specific example: the data comes back unchanged, with the warning shown below.
df %>% pad(group = "Scar_Id", by = "YrMo", interval = "month")
#> # A tibble: 828 x 5
#> Scar_Id Code Type Value YrMo
#> <chr> <chr> <chr> <date> <date>
#> 1 0070-179 AA Start_Date 2020-04-22 2020-04-01
#> 2 0070-179 AA Closure_Date 2020-05-23 2020-05-01
#> 3 1139-179 AA Start_Date 2020-04-23 2020-04-01
#> 4 1139-179 AA Closure_Date 2020-05-23 2020-05-01
#> 5 262-179 AA Start_Date 2019-08-29 2019-08-01
#> 6 262-179 AA Closure_Date 2020-05-23 2020-05-01
#> 7 270-179 AA Start_Date 2019-08-29 2019-08-01
#> 8 270-179 AA Closure_Date 2020-05-23 2020-05-01
#> 9 476-179 BB Start_Date 2019-09-04 2019-09-01
#> 10 476-179 BB Closure_Date 2019-11-04 2019-11-01
#> # ... with 818 more rows
#> Warning message:
#> datetime variable does not vary for 537 of the groups, no padding applied on this / these group(s)
Why does it claim that "the datetime variable does not vary" for rows 5 and 6, when the datetime does indeed vary? The value of YrMo is "2019-08-01" in row 5 and "2020-05-01" in row 6. To state the obvious, "2019-08-01" differs from "2020-05-01".
Any ideas what went wrong? I tried to create a reproducible example and could not. The basic examples I created all work as expected (as I describe). Hopefully these clues can help somebody determine what is going on.
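One possible way to narrow this down is to list which groups have fewer than two distinct YrMo values, since those are presumably the groups the warning is counting. The sketch below runs on a made-up tibble named toy (hypothetical, not the real df); the column names n_rows and n_months are also just illustrative:

```r
library(dplyr)
library(tibble)

# Hypothetical toy data: group A varies, group B repeats one month,
# group C has a single row.
toy <- tibble(
  Scar_Id = c("A", "A", "B", "B", "C"),
  YrMo    = as.Date(c("2019-08-01", "2020-05-01",
                      "2020-05-01", "2020-05-01",
                      "2019-09-01"))
)

# Count rows and distinct YrMo values per group, then keep the groups
# whose datetime variable does not vary.
non_varying <- toy %>%
  group_by(Scar_Id) %>%
  summarise(n_rows = n(), n_months = n_distinct(YrMo)) %>%
  filter(n_months < 2)

print(non_varying)
```

Running the same pipeline on df, and checking whether 262-179 appears in the result, would show whether that group is among the 537 mentioned in the warning.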
Source: https://stackoverflow.com/questions/61418383/rs-padr-package-claiming-the-datetime-variable-does-not-vary-when-it-does-var