Question
What's the best way to convert the data frame below from long to wide format? I tried to use reshape but didn't get the desired results.
2015 PROD A test1
2015 PROD A blue
2015 PROD A 50
2015 PROD A 66
2015 PROD A 66
2018 PROD B test2
2018 PROD B yellow
2018 PROD B 70
2018 PROD B 88.8
2018 PROD B 88.8
2018 PROD A test3
2018 PROD A red
2018 PROD A 55
2018 PROD A 88
2018 PROD A 90
Answer 1:
A possible solution is this:
library(tidyverse)
df = read.table(text = "
year prod value
2015 PRODA test1
2015 PRODA blue
2015 PRODA 50
2015 PRODA 66
2015 PRODA 66
2018 PRODB test2
2018 PRODB yellow
2018 PRODB 70
2018 PRODB 88.8
2018 PRODB 88.8
2018 PRODA test3
2018 PRODA red
2018 PRODA 55
2018 PRODA 88
2018 PRODA 90
", header = TRUE, stringsAsFactors = FALSE)
df %>%
  group_by(year, prod) %>%                            # for each year and prod combination
  mutate(id = paste0("new_col_", row_number())) %>%   # enumerate rows (used as column names in the reshaped version)
  ungroup() %>%                                       # forget the grouping
  spread(id, value)                                   # reshape
# # A tibble: 3 x 7
#    year prod  new_col_1 new_col_2 new_col_3 new_col_4 new_col_5
#   <int> <chr> <chr>     <chr>     <chr>     <chr>     <chr>
# 1  2015 PRODA test1     blue      50        66        66
# 2  2018 PRODA test3     red       55        88        90
# 3  2018 PRODB test2     yellow    70        88.8      88.8
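In tidyr 1.0.0 and later, spread() is marked as superseded in favour of pivot_wider(). The same pipeline might be written as in the sketch below. It assumes a reasonably recent tidyr, and df_demo and wide_demo are illustrative names holding only a subset of the question's rows.

```r
library(dplyr)
library(tidyr)

# Subset of the question's rows (first three values of two groups);
# df_demo is an illustrative name, not part of the original answer
df_demo <- data.frame(
  year  = c(2015, 2015, 2015, 2018, 2018, 2018),
  prod  = c("PRODA", "PRODA", "PRODA", "PRODB", "PRODB", "PRODB"),
  value = c("test1", "blue", "50", "test2", "yellow", "70"),
  stringsAsFactors = FALSE
)

wide_demo <- df_demo %>%
  group_by(year, prod) %>%                            # for each year and prod combination
  mutate(id = paste0("new_col_", row_number())) %>%   # position within the group becomes the column name
  ungroup() %>%
  pivot_wider(names_from = id, values_from = value)   # one column per position

print(wide_demo)
```

The only change from the pipeline above is the last step: names_from says which column supplies the new column names and values_from which column supplies the cell contents.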
Answer 2:
For the sake of completeness, here is a solution which uses data.table's convenient rowid() function.
The crucial point of the question is that the reshaping depends solely on the row position of value within each (year, product) group. rowid(year, product) numbers the rows within each group, so reshaping essentially becomes a one-liner:
library(data.table)
dcast(setDT(df1), year + product ~ rowid(year, product, prefix = "col_"), value.var = "value")
   year product col_1  col_2 col_3 col_4 col_5
1: 2015  PROD A test1   blue    50    66    66
2: 2018  PROD A test3    red    55    88    90
3: 2018  PROD B test2 yellow    70  88.8  88.8
Note that rowid() takes a prefix parameter to ensure that the resulting column names are syntactically correct.
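To make the role of rowid() and its prefix argument concrete, here is a minimal sketch on a made-up grouping vector. The names grp, ids_plain and ids_prefix are illustrative and not part of the answer.

```r
library(data.table)

# Illustrative group labels: three rows in group "a", two rows in group "b"
grp <- c("a", "a", "a", "b", "b")

# Without a prefix, rowid() returns an integer counter that restarts in each group
ids_plain <- rowid(grp)

# With a prefix, the same counter is returned as character labels,
# which are usable as syntactically valid column names after dcast()
ids_prefix <- rowid(grp, prefix = "col_")

print(ids_plain)
print(ids_prefix)
```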
Caveat: This solution assumes that year and product form a unique key for each group.
Data
The data are read as posted by the OP, without any modifications. However, this requires a few lines of post-processing:
library(data.table)
df1 <- fread("
2015 PROD A test1
2015 PROD A blue
2015 PROD A 50
2015 PROD A 66
2015 PROD A 66
2018 PROD B test2
2018 PROD B yellow
2018 PROD B 70
2018 PROD B 88.8
2018 PROD B 88.8
2018 PROD A test3
2018 PROD A red
2018 PROD A 55
2018 PROD A 88
2018 PROD A 90",
header = FALSE, col.names = c("year", "product", "value"), drop = 2L)[
, product := paste("PROD", product)][]
Answer 3:
You're looking for the dcast function. It is used like this:
dcast(data, col1 + col2 ~ col3)
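Applied to the question's data, the template needs a third column that says which wide column each value belongs to, and the question's data has none. One way to fill it in, sketched here with data.table's dcast() on a subset of the question's rows, is to add a within-group counter first. The column name idx and the object names dt and wide are assumptions made for illustration.

```r
library(data.table)

# Subset of the question's rows; dt is an illustrative name
dt <- data.table(
  year    = c(2015, 2015, 2015, 2018, 2018, 2018),
  product = c("PROD A", "PROD A", "PROD A", "PROD B", "PROD B", "PROD B"),
  value   = c("test1", "blue", "50", "test2", "yellow", "70")
)

# Hypothetical helper column: position of each row within its (year, product) group
dt[, idx := seq_len(.N), by = .(year, product)]

# col1 + col2 ~ col3 becomes year + product ~ idx;
# value.var names the column whose entries fill the cells
wide <- dcast(dt, year + product ~ idx, value.var = "value")

print(wide)
```

The resulting columns are named after the counter values ("1", "2", "3"), so they have to be quoted with backticks or renamed before being used in expressions.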
This question may also be a duplicate, so it may be taken down.
Source: https://stackoverflow.com/questions/50765323/r-reshape-data-frame-from-long-to-wide-format