Question
I'm a beginner when it comes to R, but I want to learn more. I'm trying to do a market basket analysis.
This is my raw data and I want to convert this to a transactions basket format:
This is what I am trying to achieve:
I have tried:
trans <- as(split(a[,"Game.played"],a[,"sessionid"]),"transactions")
But instead of the name of the game, only the number of the game is displayed. Could anyone tell me why this is happening? Also, I have cross-checked the result against the actual data, and the association of the sessionid with the game is wrong!
I have also tried something like
q <- read.transactions("a.csv", format = "basket", sep = ",", rm.duplicates = TRUE)
But, this is not working out either.
Answer 1:
data into basket for arules, removing duplicates?
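Here's an example of how you could remove the duplicates:
# Note: the default sampler changed in R 3.6.0, so on newer versions
# the draw below may differ from the output shown.
set.seed(1)
df <- data.frame(
  cat = rep(LETTERS[1:3], 2:4),
  val = sample(letters[1:5], 9, replace = TRUE),
  stringsAsFactors = FALSE
)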
df
#   cat val
# 1   A   b
# 2   A   b
# 3   B   c
# 4   B   e
# 5   B   b
# 6   C   e
# 7   C   e
# 8   C   d
# 9   C   d
(lst <- lapply(split(df$val, df$cat), unique))
# $A
# [1] "b"
#
# $B
# [1] "c" "e" "b"
#
# $C
# [1] "e" "d"
library(arules)
as(lst, "transactions")
# transactions in sparse format with
# 3 transactions (rows) and
# 4 items (columns)
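Applied to the data in the question, the same idea might look like the sketch below. The column names sessionid and Game.played are taken from the question, but the values are made up for illustration, and the as.character() call rests on an assumption I can't confirm from the post: that Game.played is stored as a factor, which is one possible reason for seeing numbers instead of game names.

```r
library(arules)

# Hypothetical data: column names follow the question, values are invented.
a <- data.frame(
  sessionid   = c(1, 1, 1, 2, 2, 3),
  Game.played = factor(c("Chess", "Poker", "Chess", "Poker", "Sudoku", "Chess"))
)

# Convert the item column to character so the item labels are the game names
# (assumption: the column is a factor), then split by session and drop
# games that are repeated within the same session.
baskets <- lapply(
  split(as.character(a$Game.played), a$sessionid),
  unique
)

# Coerce the named list of character vectors to a transactions object.
trans <- as(baskets, "transactions")

inspect(trans)      # one row per session, items shown by name
itemLabels(trans)   # the game names used as item labels
```

Checking inspect(trans) against a few sessions in the raw data is a quick way to confirm that each sessionid ends up with the right games.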
Source: https://stackoverflow.com/questions/42258920/data-csv-into-basket-for-arules-removing-duplicates