R: Pivoting using 'spread' function

Submitted by 牧云@^-^@ on 2019-11-27 07:30:00

Question


Continuing from my previous post, I now have one more column of ID values that I need to use when pivoting rows into columns.

    NUM <- c(1,2,3,1,2,3,1,2,3,1)
    ID <- c("DJ45","DJ45","DJ45","DJ46","DJ46","DJ46","DJ47","DJ47","DJ47","DJ48")
    Type <- c("A", "F", "C", "B", "D", "A", "E", "C", "F", "D")
    Points <- c(9.2,60.8,22.9,1012.7,18.7,11.1,67.2,63.1,16.7,58.4)

    df1 <- data.frame(ID,NUM,Type,Points)

df1:
    +------+-----+------+--------+
    | ID   | NUM | Type | Points |
    +------+-----+------+--------+
    | DJ45 |   1 | A    | 9.2    |
    | DJ45 |   2 | F    | 60.8   |
    | DJ45 |   3 | C    | 22.9   |
    | DJ46 |   1 | B    | 1012.7 |
    | DJ46 |   2 | D    | 18.7   |
    | DJ46 |   3 | A    | 11.1   |
    | DJ47 |   1 | E    | 67.2   |
    | DJ47 |   2 | C    | 63.1   |
    | DJ47 |   3 | F    | 16.7   |
    | DJ48 |   1 | D    | 58.4   |
    +------+-----+------+--------+

My desired output is

    +------+-----+------+--------+------+------+------+------+
    | ID   | NUM |  A   |   B    |  C   |  D   |  E   |  F   |
    +------+-----+------+--------+------+------+------+------+
    | DJ45 |   1 | 9.2  | N/A    | N/A  | N/A  | N/A  | N/A  |
    | DJ45 |   2 | N/A  | N/A    | N/A  | N/A  | N/A  | 60.8 |
    | DJ45 |   3 | N/A  | N/A    | 22.9 | N/A  | N/A  | N/A  |
    | DJ46 |   1 | N/A  | 1012.7 | N/A  | N/A  | N/A  | N/A  |
    | DJ46 |   2 | N/A  | N/A    | N/A  | 18.7 | N/A  | N/A  |
    | DJ46 |   3 | 11.1 | N/A    | N/A  | N/A  | N/A  | N/A  |
    | DJ47 |   1 | N/A  | N/A    | N/A  | N/A  | 67.2 | N/A  |
    | DJ47 |   2 | N/A  | N/A    | 63.1 | N/A  | N/A  | N/A  |
    | DJ47 |   3 | N/A  | N/A    | N/A  | N/A  | N/A  | 16.7 |
    | DJ48 |   1 | N/A  | N/A    | N/A  | 58.4 | N/A  | N/A  |
    +------+-----+------+--------+------+------+------+------+

I am using the spread function in R, but I get an error about duplicate identifiers. This is because I now have two identifying columns (ID and NUM) instead of the one (NUM) I had previously. How can I do this?


Answer 1:


Not knowing what you've tried, I would suggest:

    library(tidyr)

    spread(df1, Type, Points)
    #      ID NUM    A      B    C    D    E    F
    # 1  DJ45   1  9.2     NA   NA   NA   NA   NA
    # 2  DJ45   2   NA     NA   NA   NA   NA 60.8
    # 3  DJ45   3   NA     NA 22.9   NA   NA   NA
    # 4  DJ46   1   NA 1012.7   NA   NA   NA   NA
    # 5  DJ46   2   NA     NA   NA 18.7   NA   NA
    # 6  DJ46   3 11.1     NA   NA   NA   NA   NA
    # 7  DJ47   1   NA     NA   NA   NA 67.2   NA
    # 8  DJ47   2   NA     NA 63.1   NA   NA   NA
    # 9  DJ47   3   NA     NA   NA   NA   NA 16.7
    # 10 DJ48   1   NA     NA   NA 58.4   NA   NA

If you are getting an error about duplicate identifiers, it is because the combination of "ID" and "NUM" in your actual data has one or more duplicate entries (in your sample data, it doesn't).
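
One way to confirm this on the real data is to count the rows for each ID/NUM pair. The sketch below is not from the original answer; it assumes dplyr is installed and uses a hypothetical data frame, df_check, built from the sample data with its first three rows repeated so that there is something to find.

```r
library(dplyr)

# Sample data from the question
NUM <- c(1, 2, 3, 1, 2, 3, 1, 2, 3, 1)
ID <- c("DJ45", "DJ45", "DJ45", "DJ46", "DJ46", "DJ46", "DJ47", "DJ47", "DJ47", "DJ48")
Type <- c("A", "F", "C", "B", "D", "A", "E", "C", "F", "D")
Points <- c(9.2, 60.8, 22.9, 1012.7, 18.7, 11.1, 67.2, 63.1, 16.7, 58.4)
df1 <- data.frame(ID, NUM, Type, Points)

# Hypothetical test data: repeat the first three rows to create duplicate ID/NUM pairs
df_check <- rbind(df1, df1[1:3, ])

# Count rows per ID/NUM pair and keep only the pairs that occur more than once
dupes <- df_check %>%
  count(ID, NUM) %>%
  filter(n > 1)

print(dupes)
```

Any pair listed in dupes occurs more than once, which is the situation spread complains about. On the unmodified df1 the same check returns zero rows.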

If that is the case, you need to add another column to make them unique.

Adding dplyr into the chain, it might be something like:

    library(dplyr)

    df1 %>%
      group_by(ID, NUM) %>%
      mutate(id2 = sequence(n())) %>%  # number the rows within each ID/NUM group
      spread(Type, Points)

Demo of assumed error:

    df2 <- rbind(df1, df1[1:3, ]) ## Duplicate the first three rows
    spread(df2, Type, Points)
    # Error: Duplicate identifiers for rows (1, 11), (3, 13), (2, 12)

    library(dplyr)

    df2 %>%
      group_by(ID, NUM) %>%
      mutate(id2 = sequence(n())) %>%
      spread(Type, Points)
    # Source: local data frame [13 x 9]
    # 
    #      ID NUM id2    A      B    C    D    E    F
    # 1  DJ45   1   1  9.2     NA   NA   NA   NA   NA
    # 2  DJ45   1   2  9.2     NA   NA   NA   NA   NA
    # 3  DJ45   2   1   NA     NA   NA   NA   NA 60.8
    # 4  DJ45   2   2   NA     NA   NA   NA   NA 60.8
    # 5  DJ45   3   1   NA     NA 22.9   NA   NA   NA
    # 6  DJ45   3   2   NA     NA 22.9   NA   NA   NA
    # 7  DJ46   1   1   NA 1012.7   NA   NA   NA   NA
    # 8  DJ46   2   1   NA     NA   NA 18.7   NA   NA
    # 9  DJ46   3   1 11.1     NA   NA   NA   NA   NA
    # 10 DJ47   1   1   NA     NA   NA   NA 67.2   NA
    # 11 DJ47   2   1   NA     NA 63.1   NA   NA   NA
    # 12 DJ47   3   1   NA     NA   NA   NA   NA 16.7
    # 13 DJ48   1   1   NA     NA   NA 58.4   NA   NA
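
In tidyr 1.0.0 and later, spread is superseded by pivot_wider, so the same idea can be written with the newer function. The following is only a sketch and not part of the original answer: it assumes tidyr 1.1.0 or later (for the names_sort argument), rebuilds df2 so the block runs on its own, and swaps sequence(n()) for the equivalent row_number().

```r
library(dplyr)
library(tidyr)

# Sample data from the question, with the first three rows duplicated
NUM <- c(1, 2, 3, 1, 2, 3, 1, 2, 3, 1)
ID <- c("DJ45", "DJ45", "DJ45", "DJ46", "DJ46", "DJ46", "DJ47", "DJ47", "DJ47", "DJ48")
Type <- c("A", "F", "C", "B", "D", "A", "E", "C", "F", "D")
Points <- c(9.2, 60.8, 22.9, 1012.7, 18.7, 11.1, 67.2, 63.1, 16.7, 58.4)
df1 <- data.frame(ID, NUM, Type, Points)
df2 <- rbind(df1, df1[1:3, ])

wide <- df2 %>%
  group_by(ID, NUM) %>%
  mutate(id2 = row_number()) %>%   # tie-breaker within each ID/NUM group
  ungroup() %>%
  pivot_wider(
    names_from = Type,             # one new column per Type value
    values_from = Points,          # filled with the matching Points
    names_sort = TRUE              # order the new columns A..F, as spread does
  ) %>%
  arrange(ID, NUM, id2)            # sort rows explicitly by the identifier columns

print(wide)
```

Cells with no matching Type are filled with NA, as with spread. The explicit arrange is there because the row order is not otherwise guaranteed to match the spread output.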


Source: https://stackoverflow.com/questions/29952517/r-pivoting-using-spread-function
