Question
I'm really sorry to ask this question again, because there are already many questions about this. However, none of the solutions worked for my problem.
My data looks like this:
id scale rater rating
1 A 1 5
1 B 1 7
1 A 2 3
1 B 2 6
2 A 1 4
2 B 1 3
2 A 2 2
2 B 2 1
I want to spread(rater, rating)
In the end it should look like this:
id scale 1 2
1 A 5 3
1 B 7 6
2 A 4 2
2 B 3 1
The problem obviously is that the rows in the first dataset don't have unique identifiers. Looking at answers to similar questions, none of the solutions seem to work for me. I can't just delete duplicate rows, and when I add row numbers or grouped identifiers with
group_by(id) %>%
  mutate(grouped_id = row_number())
the two raters' ratings don't end up side by side. Instead, each rating gets its own row, with NA for the rating of the other rater.
I feel like I tried everything I could find and would really appreciate some help! Thank you very much in advance!
Answer 1:
We can use the spread function without having to group_by anything (thanks @Jaap):
library(tidyr)
dat %>%
  spread(rater, rating)
# A tibble: 4 x 4
id scale `1` `2`
<int> <chr> <int> <int>
1 1 A 5 3
2 1 B 7 6
3 2 A 4 2
4 2 B 3 1
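In tidyr 1.0.0 and later, spread is superseded by pivot_wider, so the same reshaping can also be written as below. This is only a sketch, assuming a tidyr version that provides pivot_wider; the name wide is chosen here for illustration, and the data is copied from the Data section of this answer.

```r
library(tidyr)

# Same data as in the Data section of this answer
dat <- data.frame(
  id     = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
  scale  = c("A", "B", "A", "B", "A", "B", "A", "B"),
  rater  = c(1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L),
  rating = c(5L, 7L, 3L, 6L, 4L, 3L, 2L, 1L)
)

# id and scale together identify each output row, so no helper id is needed;
# every distinct value of rater becomes its own column holding the rating
wide <- pivot_wider(dat, names_from = rater, values_from = rating)
wide
```

The resulting columns are named `1` and `2`, just as with spread, so they need backticks when referenced (for example wide$`1`).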
Edit using reshape
Although I would almost never advise using the reshape function over the gather and spread functions, here's how you could do it using base R:
reshape(dat, direction = 'wide',
idvar = c('id','scale'),
v.names = 'rating',
timevar = 'rater')
id scale rating.1 rating.2
1 1 A 5 3
2 1 B 7 6
5 2 A 4 2
6 2 B 3 1
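As for why the helper-id attempt from the question gives rows padded with NA: a likely explanation is that the extra column becomes part of the row key, so no two ratings can ever share a row. The sketch below reproduces this under the assumption that dplyr is also installed; the name attempt is illustrative, and the ungroup() call is added here only to keep the result a plain tibble.

```r
library(dplyr)
library(tidyr)

# Same data as in the Data section of this answer
dat <- data.frame(
  id     = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
  scale  = c("A", "B", "A", "B", "A", "B", "A", "B"),
  rater  = c(1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L),
  rating = c(5L, 7L, 3L, 6L, 4L, 3L, 2L, 1L)
)

# The attempt from the question: a per-id row counter is added before spreading
attempt <- dat %>%
  group_by(id) %>%
  mutate(grouped_id = row_number()) %>%
  ungroup() %>%
  spread(rater, rating)

# Each (id, scale, grouped_id) combination occurs only once, so spread keeps
# all 8 rows and fills the other rater's column with NA
nrow(attempt)
sum(is.na(attempt$`1`))
```

Since id and scale already identify each output row, leaving the helper column out lets spread pair the two raters on one row.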
Data
dat <- structure(list(id = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
scale = c("A", "B", "A", "B", "A", "B", "A", "B"),
rater = c(1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L),
rating = c(5L, 7L, 3L, 6L, 4L, 3L, 2L, 1L)),
class = "data.frame", row.names = c(NA, -8L))
Source: https://stackoverflow.com/questions/51192050/spread-for-duplicate-identifiers