问题
I have data in which subjects completed multiple ratings per day over 6-7 days. The number of ratings per day varies. The data set includes subject ID, date, and the ratings. I would like to create a new variable that recodes the dates for each subject into "study day" --- so 1 for first day of ratings, 2 for second day of ratings, etc.
For example, I would like to take this:
id Date Rating
1 10/20/2018 2
1 10/20/2018 3
1 10/20/2018 5
1 10/21/2018 1
1 10/21/2018 7
1 10/21/2018 9
1 10/22/2018 4
1 10/22/2018 5
1 10/22/2018 9
2 11/15/2018 1
2 11/15/2018 3
2 11/15/2018 4
2 11/16/2018 3
2 11/16/2018 1
2 11/17/2018 0
2 11/17/2018 2
2 11/17/2018 9
And end up with this:
id Day Date Rating
1 1 10/20/2018 2
1 1 10/20/2018 3
1 1 10/20/2018 5
1 2 10/21/2018 1
1 2 10/21/2018 7
1 2 10/21/2018 9
1 3 10/22/2018 4
1 3 10/22/2018 5
1 3 10/22/2018 9
2 1 11/15/2018 1
2 1 11/15/2018 3
2 1 11/15/2018 4
2 2 11/16/2018 3
2 2 11/16/2018 1
2 3 11/17/2018 0
2 3 11/17/2018 2
2 3 11/17/2018 9
I was going to look into setting up some kind of loop, but I thought it would be worth asking if there is a more efficient way to pull this off. Are there any functions that would allow me to automate this sort of thing? Thanks very much for any suggestions.
回答1:
Since you want to reset the count after every id
, makes this question a bit different.
Using only base R, we can split
the Date
based on id
and then create a count of each distinct group.
df$Day <- unlist(sapply(split(df$Date, df$id), function(x) match(x,unique(x))))
df
# id Date Rating Day
#1 1 10/20/2018 2 1
#2 1 10/20/2018 3 1
#3 1 10/20/2018 5 1
#4 1 10/21/2018 1 2
#5 1 10/21/2018 7 2
#6 1 10/21/2018 9 2
#7 1 10/22/2018 4 3
#8 1 10/22/2018 5 3
#9 1 10/22/2018 9 3
#10 2 11/15/2018 1 1
#11 2 11/15/2018 3 1
#12 2 11/15/2018 4 1
#13 2 11/16/2018 3 2
#14 2 11/16/2018 1 2
#15 2 11/17/2018 0 3
#16 2 11/17/2018 2 3
#17 2 11/17/2018 9 3
I don't know how I missed this but thanks to @thelatemail who reminded that this is basically the same as
library(dplyr)
df %>%
group_by(id) %>%
mutate(Day = match(Date, unique(Date)))
AND
df$Day <- as.numeric(with(df, ave(Date, id, FUN = function(x) match(x, unique(x)))))
回答2:
If you want a slightly hacky dplyr
version....you can use the date column and convert it to a numeric date then manipulate that number to give the desired result
library(tidyverse)
library(lubridate)
df <- data_frame(id=c(1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2),
Date= c('10/20/2018', '10/20/2018', '10/20/2018', '10/21/2018', '10/21/2018', '10/21/2018',
'10/22/2018', '10/22/2018', '10/22/2018','11/15/2018', '11/15/2018', '11/15/2018',
'11/16/2018', '11/16/2018', '11/17/2018', '11/17/2018', '11/17/2018'),
Rating=c(2,3,5,1,7,9,4,5,9,1,3,4,3,1,0,2,9))
df %>%
group_by(id) %>%
mutate(
Date = mdy(Date),
Day = as.numeric(Date),
Day = Day-min(Day)+1)
# A tibble: 17 x 4
# Groups: id [2]
id Date Rating Day
<dbl> <date> <dbl> <dbl>
1 1 2018-10-20 2 1
2 1 2018-10-20 3 1
3 1 2018-10-20 5 1
4 1 2018-10-21 1 2
5 1 2018-10-21 7 2
6 1 2018-10-21 9 2
7 1 2018-10-22 4 3
8 1 2018-10-22 5 3
9 1 2018-10-22 9 3
10 2 2018-11-15 1 1
11 2 2018-11-15 3 1
12 2 2018-11-15 4 1
13 2 2018-11-16 3 2
14 2 2018-11-16 1 2
15 2 2018-11-17 0 3
16 2 2018-11-17 2 3
17 2 2018-11-17 9 3
来源:https://stackoverflow.com/questions/53808031/recode-dates-to-study-day-within-subject