Outer operation by group in R

Submitted by 人走茶凉 on 2021-02-08 08:13:24

Question


My problem involves calculating differences in prices across products for each period. Consider the sample data below:

product = c('A','A','A','B','B','B','C','C','C')
date = as.Date(c('2016-09-12','2016-09-19', '2016-09-26','2016-09-12','2016-09-19', '2016-09-26', '2016-09-12','2016-09-19', '2016-09-26'))
price = as.numeric(c(17, 14.7, 15, 14.69, 14.64, 14.63, 13.15, 13.15, 13.15))

df <- data.frame(product, date, price)

The challenge is in the grouping; without it, a simple call to the outer function would do the trick:

reshape2::melt(outer(df$price, df$price, "-"))

However, combining this with the transmute function in dplyr leads to a strange-looking error message: "Error: not compatible with STRSXP". Comments online suggest this might be due to a bug in the package.

So I am wondering whether anyone has a neat suggestion for an alternative approach.

Ideally, I am looking for output along the following lines:

Var1 Var2 Date          value
A    A    '2016-09-12'  0.00
A    B    '2016-09-12'  2.31
A    C    '2016-09-12'  3.85
B    A    '2016-09-12' -2.31
B    B    '2016-09-12'  0.00
B    C    '2016-09-12'  1.54
C    A    '2016-09-12' -3.85
C    B    '2016-09-12' -1.54
C    C    '2016-09-12'  0.00
A    A    '2016-09-19'  0.00
A    B    '2016-09-19'  0.06
A    C    '2016-09-19'  1.55

and so on. I appreciate this leaves some redundant pairs, but that makes life easier further down the line.

Thanks in advance for your attention.:)


Answer 1:


In general, if a data transformation doesn't work with mutate/transform, you can try do:

> library(dplyr)
> df %>% 
   group_by(date) %>% 
   do(reshape2::melt(outer(.$price, .$price, "-")))

Source: local data frame [27 x 4]
Groups: date [3]

         date  Var1  Var2 value
       (date) (int) (int) (dbl)
1  2016-09-12     1     1  0.00
2  2016-09-12     2     1 -2.31
3  2016-09-12     3     1 -3.85
4  2016-09-12     1     2  2.31
5  2016-09-12     2     2  0.00
6  2016-09-12     3     2 -1.54
7  2016-09-12     1     3  3.85
8  2016-09-12     2     3  1.54
9  2016-09-12     3     3  0.00
10 2016-09-19     1     1  0.00
..        ...   ...   ...   ...
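
If product labels are wanted in Var1 and Var2 instead of integer indices, one possibility is to name the price vector before calling outer, since outer carries the names over as dimnames. Below is a minimal base R sketch of that idea, assuming one row per product and date. The helper name pairwise_diff is made up for illustration and is not part of any package.

```r
# Sample data from the question
product <- c("A", "A", "A", "B", "B", "B", "C", "C", "C")
date <- as.Date(c("2016-09-12", "2016-09-19", "2016-09-26",
                  "2016-09-12", "2016-09-19", "2016-09-26",
                  "2016-09-12", "2016-09-19", "2016-09-26"))
price <- c(17, 14.7, 15, 14.69, 14.64, 14.63, 13.15, 13.15, 13.15)
df <- data.frame(product, date, price)

# Hypothetical helper: pairwise price differences for the rows of one date
pairwise_diff <- function(d) {
  p <- setNames(d$price, d$product)   # names become the dimnames of the matrix
  m <- outer(p, p, "-")               # m[i, j] = price of i minus price of j
  out <- as.data.frame(as.table(m), responseName = "value",
                       stringsAsFactors = FALSE)
  out$date <- d$date[1]
  out[, c("Var1", "Var2", "date", "value")]
}

# Apply the helper to each date and stack the pieces
res <- do.call(rbind, lapply(split(df, df$date), pairwise_diff))
rownames(res) <- NULL
head(res)
```

Here value is the price of Var1 minus the price of Var2, which matches the sign convention in the question. The rows come out with Var1 varying fastest, so the order differs from the requested layout but the content is the same.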



Answer 2:


We can use data.table:

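library(data.table)
# Pairwise differences within each date; reshape2::melt converts the matrix to long format
res <- setDT(df)[, reshape2::melt(outer(price, price, "-")), by = date]
# Map the integer indices back to product labels
# (assumes the products appear in the same order within every date)
res[, c("Var1", "Var2") := lapply(.SD, function(x)
                unique(df$product)[x]), .SDcols = Var1:Var2]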

head(res)
#         date Var1 Var2 value
#1: 2016-09-12    A    A  0.00
#2: 2016-09-12    B    A -2.31
#3: 2016-09-12    C    A -3.85
#4: 2016-09-12    A    B  2.31
#5: 2016-09-12    B    B  0.00
#6: 2016-09-12    C    B -1.54

An option using tidyr/dplyr (note that the result contains the paired prices rather than the product labels):

library(tidyr)
library(dplyr)
df %>%
   group_by(date) %>% 
   expand(price, price2=price) %>% 
   mutate(value = price-price2)
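
To keep the product labels alongside the differences, a self-join on date is another route. The sketch below uses base R merge rather than tidyr, which is my own substitution and not part of the original answer. The column names product1, product2, price1 and price2 come from the suffixes chosen here.

```r
# Sample data from the question
product <- c("A", "A", "A", "B", "B", "B", "C", "C", "C")
date <- as.Date(c("2016-09-12", "2016-09-19", "2016-09-26",
                  "2016-09-12", "2016-09-19", "2016-09-26",
                  "2016-09-12", "2016-09-19", "2016-09-26"))
price <- c(17, 14.7, 15, 14.69, 14.64, 14.63, 13.15, 13.15, 13.15)
df <- data.frame(product, date, price)

# Self-join on date: each product is paired with every product of the same date
pairs <- merge(df, df, by = "date", suffixes = c("1", "2"))

# Difference between the first and the second product of each pair
pairs$value <- pairs$price1 - pairs$price2
res <- pairs[, c("product1", "product2", "date", "value")]
head(res)
```

Because the join pairs rows rather than distinct price values, two products that happen to share a price on the same date still appear as separate pairs.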


Source: https://stackoverflow.com/questions/41014020/outer-operation-by-group-in-r
