Question
I am using the arules package to find association rules in point-of-sale retail data. I extract transaction detail from a database and then place it in a transactions object. I'm new to arules and am trying to figure out how to populate the itemInfo data frame in the transactions object. Right now, I'm just bringing in the transaction and item IDs (both numeric), which provide little context. I would like to be able to add an item description, as well as product hierarchy levels.
Below is the process I'm using today:
Data comes through from the database in the below format:
Transaction_ID Item_ID
-------------- -------
100            1
100            2
100            3
101            2
101            3
102            1
102            2
To create the transactions object, I'm using the command below, as described in the arules documentation:
txdata <- as(split(txdata[, "Item_ID"], txdata[, "Transaction_ID"]), "transactions")
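For readers who want to reproduce this step, here is a minimal sketch that rebuilds the sample extract above as a data frame and converts it. The variable names tx_list and trans are only illustrative, and the sketch assumes no item is repeated within a transaction.

```r
library(arules)

# Sample data in the same shape as the database extract shown above
txdata <- data.frame(
  Transaction_ID = c(100, 100, 100, 101, 101, 102, 102),
  Item_ID        = c(1, 2, 3, 2, 3, 1, 2)
)

# One list element per transaction, holding that transaction's item IDs
tx_list <- split(txdata[, "Item_ID"], txdata[, "Transaction_ID"])
trans <- as(tx_list, "transactions")

length(trans)      # number of transactions
itemLabels(trans)  # the numeric item IDs, stored as character labels
itemInfo(trans)    # item metadata; only a 'labels' column at this point
```

At this point itemInfo holds nothing beyond the label column, which is why the mined rules can only show raw IDs.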
Note: I've found that I need to have a numeric value for the Item_ID; otherwise I run into major performance issues using a string (due to poor performance of split when using factored strings).
Create and view the association rules:
rules <- apriori(txdata, parameter = list(support=0.00015, confidence=0.5))
inspect(head(sort(rules, by="confidence"), n=5))
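As a rough, self-contained illustration of this step on the seven-row sample data, the sketch below mines rules with the same thresholds. The control = list(verbose = FALSE) argument is an addition that only silences the progress log, and trans is an illustrative name for the transactions object.

```r
library(arules)

# Rebuild the sample transactions from the extract shown earlier
txdata <- data.frame(
  Transaction_ID = c(100, 100, 100, 101, 101, 102, 102),
  Item_ID        = c(1, 2, 3, 2, 3, 1, 2)
)
trans <- as(split(txdata[, "Item_ID"], txdata[, "Transaction_ID"]), "transactions")

# Thresholds copied from the question; verbose = FALSE is an added convenience
rules <- apriori(
  trans,
  parameter = list(support = 0.00015, confidence = 0.5),
  control   = list(verbose = FALSE)
)

# Top five rules by confidence; both sides are printed with the raw item IDs
inspect(head(sort(rules, by = "confidence"), n = 5))
```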
When the rules come back, they are listed by Item_ID, which is not helpful to me. I want to be able to display them by the ID and/or description. I would also like to take advantage of the aggregation features built into the arules package.
Answer 1:
You can change the names of items using itemInfo. Here is an example:
R> df <- data.frame(
     TID = c(1,1,2,2,2,3),
     item = c("a","b","a","b","c","b")
   )
R> trans <- as(split(df[,"item"], df[,"TID"]), "transactions")
### this is how you replace item labels and set a hierarchy (here level1)
R> myLabels <- c("milk", "butter", "beer")
R> myLevel1 <- c("dairy", "dairy", "beverage")
R> itemInfo(trans) <- data.frame(labels = myLabels, level1 = myLevel1)
R> inspect(trans)
  items     transactionID
1 {milk,
   butter}  1
2 {milk,
   butter,
   beer}    2
3 {butter}  3
### now you can use aggregate()
R> inspect(aggregate(trans, itemInfo(trans)[["level1"]]))
  items       transactionID
1 {dairy}     1
2 {beverage,
   dairy}     2
3 {dairy}     3
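The example above assigns the new labels by position, so they have to be listed in the same order as the items inside the transactions object. With numeric Item_ID values pulled from a database, it may be safer to align a lookup table by ID first. The sketch below shows one way to do that; the items table, its descriptions, and the department column are hypothetical and not from the original data.

```r
library(arules)

# Sample transactions keyed by numeric Item_ID, as in the question
txdata <- data.frame(
  Transaction_ID = c(100, 100, 100, 101, 101, 102, 102),
  Item_ID        = c(1, 2, 3, 2, 3, 1, 2)
)
trans <- as(split(txdata[, "Item_ID"], txdata[, "Transaction_ID"]), "transactions")

# Hypothetical item master table (made-up values), deliberately out of order
items <- data.frame(
  Item_ID     = c(3, 1, 2),
  description = c("beer", "milk", "butter"),
  department  = c("beverage", "dairy", "dairy"),
  stringsAsFactors = FALSE
)

# Align the lookup rows with the item order used inside 'trans'
idx <- match(itemLabels(trans), as.character(items$Item_ID))
stopifnot(!anyNA(idx))  # every item in the data needs a lookup row

# Replace itemInfo: descriptions become the labels, the ID and level are kept
itemInfo(trans) <- data.frame(
  labels     = items$description[idx],
  item_id    = items$Item_ID[idx],
  department = items$department[idx],
  stringsAsFactors = FALSE
)
inspect(trans)

# Roll items up to the (hypothetical) department level
trans_dept <- aggregate(trans, itemInfo(trans)[["department"]])
inspect(trans_dept)
```

Rules mined from trans after this step are printed with the descriptions instead of the IDs, and the original ID stays available in the item_id column of itemInfo.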
You can find more info using class?transactions and ?aggregate.
Hope this helps, Michael
Source: https://stackoverflow.com/questions/28952011/adding-item-information-to-transaction-object-in-arules