R Reshape data frame from long to wide format?

走远了吗. Submitted on 2019-12-27 10:34:27

Question


What's the best way to convert the data frame below from long to wide format? I tried to use reshape but didn't get the desired results.

2015    PROD A  test1
2015    PROD A  blue
2015    PROD A  50
2015    PROD A  66
2015    PROD A  66
2018    PROD B  test2
2018    PROD B  yellow
2018    PROD B  70
2018    PROD B  88.8
2018    PROD B  88.8
2018    PROD A  test3
2018    PROD A  red
2018    PROD A  55
2018    PROD A  88
2018    PROD A  90


Answer 1:


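One possible solution uses the tidyverse. Note that the example data below writes the product as a single token (PRODA instead of PROD A) so that read.table can split the columns on whitespace: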

library(tidyverse)

df <- read.table(text = "
                year prod value
                2015    PRODA  test1
                2015    PRODA  blue
                2015    PRODA  50
                2015    PRODA  66
                2015    PRODA  66
                2018    PRODB  test2
                2018    PRODB  yellow
                2018    PRODB  70
                2018    PRODB  88.8
                2018    PRODB  88.8
                2018    PRODA  test3
                2018    PRODA  red
                2018    PRODA  55
                2018    PRODA  88
                2018    PRODA  90
                ", header = TRUE, stringsAsFactors = FALSE)

df %>%
  group_by(year, prod) %>%                           # for each year and prod combination
  mutate(id = paste0("new_col_", row_number())) %>%  # enumerate rows (this will be used as column names in the reshaped version)
  ungroup() %>%                                      # forget the grouping
  spread(id, value)                                  # reshape

# # A tibble: 3 x 7
#    year prod  new_col_1 new_col_2 new_col_3 new_col_4 new_col_5
#   <int> <chr> <chr>     <chr>     <chr>     <chr>     <chr>    
# 1  2015 PRODA test1     blue      50        66        66       
# 2  2018 PRODA test3     red       55        88        90       
# 3  2018 PRODB test2     yellow    70        88.8      88.8 
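
Since tidyr 1.0.0, spread() has been superseded by pivot_wider(). The sketch below shows the same idea with pivot_wider(), assuming tidyr 1.0.0 or later is installed. The four-row data frame is a shortened, made-up stand-in for the data in the question, not the full data set:

```r
library(dplyr)
library(tidyr)

# Shortened stand-in for the question's data (assumption: two values per group)
df <- data.frame(
  year  = c(2015, 2015, 2018, 2018),
  prod  = c("PRODA", "PRODA", "PRODB", "PRODB"),
  value = c("test1", "blue", "test2", "yellow"),
  stringsAsFactors = FALSE
)

wide <- df %>%
  group_by(year, prod) %>%                            # for each year and prod combination
  mutate(id = paste0("new_col_", row_number())) %>%   # number the rows within the group
  ungroup() %>%                                       # drop the grouping
  pivot_wider(names_from = id, values_from = value)   # one column per row number

print(wide)
```

The row-numbering step is unchanged; only the final reshaping call differs.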



Answer 2:


For the sake of completeness, here is a solution which uses data.table's convenient rowid() function.

The crucial point of the question is that the reshaping solely depends on the row position of value within each (year, product) group. rowid(year, product) numbers the rows within each group. So, reshaping essentially becomes a one-liner:

library(data.table)
dcast(setDT(df1), year + product ~ rowid(year, product, prefix = "col_"), value.var = "value")
   year product col_1  col_2 col_3 col_4 col_5
1: 2015  PROD A test1   blue    50    66    66
2: 2018  PROD A test3    red    55    88    90
3: 2018  PROD B test2 yellow    70  88.8  88.8

Note that rowid() takes a prefix parameter to ensure that the resulting column names are syntactically correct.
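
To see what rowid() contributes on its own, here is a minimal sketch on two short vectors. The vectors are made up for illustration and are not taken from the question's data:

```r
library(data.table)

# Hypothetical grouping vectors: three rows for (2015, PROD A), one for (2018, PROD B)
year    <- c(2015, 2015, 2018, 2015)
product <- c("PROD A", "PROD A", "PROD B", "PROD A")

# Running row number within each (year, product) combination
ids <- rowid(year, product)

# Same numbering, returned as character labels with a prefix
labels <- rowid(year, product, prefix = "col_")

print(ids)
print(labels)
```

The third element restarts at 1 because it is the first row of a new group, while the fourth element continues the count of the first group even though it is not adjacent to it.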

Caveat: This solution assumes that year and product form a unique key for each group.
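
One way to check this assumption before reshaping is to count the rows per group. The sketch below is only an illustration: the data are made up, and the expected number of fields per record (2 here, stored in the hypothetical variable expected_n) has to come from your own knowledge of the data:

```r
library(data.table)

# Made-up data: two separate records that share the same (year, product)
dt <- data.table(
  year    = c(2015, 2015, 2015, 2015),
  product = "PROD A",
  value   = c("test1", "blue", "test9", "green")
)

expected_n <- 2L  # assumption: each record consists of two rows

# Count the rows in each (year, product) group
sizes <- dt[, .N, by = .(year, product)]

# Groups whose size differs from the expected record length
suspicious <- sizes[N != expected_n]

print(suspicious)
```

A group that is larger than expected suggests that several records share the same key and would be merged into a single wide row.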

Data

The data are read as posted by the OP, without any modifications. However, this requires a few lines of post-processing:

library(data.table)    
df1 <- fread("
2015    PROD A  test1
2015    PROD A  blue
2015    PROD A  50
2015    PROD A  66
2015    PROD A  66
2018    PROD B  test2
2018    PROD B  yellow
2018    PROD B  70
2018    PROD B  88.8
2018    PROD B  88.8
2018    PROD A  test3
2018    PROD A  red
2018    PROD A  55
2018    PROD A  88
2018    PROD A  90", 
      header = FALSE, col.names = c("year", "product", "value"), drop = 2L)[
        , product := paste("PROD", product)][]



Answer 3:


You're looking for the dcast function (available in the reshape2 and data.table packages), used like this:

dcast(data, col1 + col2 ~ col3)
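
As a concrete sketch of that pattern, the example below uses data.table's dcast on made-up data that already contains a column to spread on. The column name field and its values are hypothetical. The data in the question has no such column, which is why the other answers first build a row index:

```r
library(data.table)

# Made-up long data with an explicit key column ("field") to spread on
long <- data.table(
  year    = c(2015, 2015, 2018, 2018),
  product = c("A", "A", "B", "B"),
  field   = c("name", "colour", "name", "colour"),
  value   = c("test1", "blue", "test2", "yellow")
)

# Left of ~ : columns identifying a row; right of ~ : column whose values become new columns
wide <- dcast(long, year + product ~ field, value.var = "value")

print(wide)
```

Passing value.var explicitly avoids relying on dcast to guess which column holds the values.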

This question may also be a duplicate, so it may be taken down.



Source: https://stackoverflow.com/questions/50765323/r-reshape-data-frame-from-long-to-wide-format
