Question
I want to find patterns and "clusters" based on which items are bought together. According to the wiki for Eclat:
The Eclat algorithm is used to perform itemset mining. Itemset mining let us find frequent patterns in data like if a consumer buys milk, he also buys bread. This type of pattern is called association rules and is used in many application domains.
However, when I use eclat in R, I get "zero frequent items", and NULL when retrieving the results through tidLists. Can anyone see what I am doing wrong?
The full dataset: https://pastebin.com/8GbjnHK2
Each row is a transaction, with its items in the columns. A quick snapshot of the data:
3060615;;;;;;;;;;;;;;;
3060612;3060616;;;;;;;;;;;;;;
3020703;;;;;;;;;;;;;;;
3002469;;;;;;;;;;;;;;;
3062800;;;;;;;;;;;;;;;
3061943;3061965;;;;;;;;;;;;;;
The code
library(arules)
trans <- read.transactions("Transactions.csv", format = "basket", sep = ";")
f <- eclat(trans, parameter = list(supp = 0.1, maxlen = 17, tidLists = TRUE))
dim(tidLists(f))
as(tidLists(f), "list")
Could it be due to the data structure? In that case, how should I change it? Furthermore, what do I do to get the suggested itemsets? I couldn't figure that out from the wiki.
EDIT: I used 0.004 for supp, as suggested by @hpesoj626, but the function seems to be grouping the orders/users rather than the items. I don't know how to export the data, so I posted a picture of the tidLists (image not reproduced here).
Answer 1:
The problem is that you have set your support too high. Try lowering supp, say to supp = .001, for which we get
dim(tidLists(f))
# [1] 928 15840
For your data set, the highest support is 0.08239, which is below 0.1. That is why you are getting no results with supp = 0.1.
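One way to avoid guessing the threshold is to look at the single-item supports before calling eclat. The sketch below is only an illustration: the baskets are made-up toy data standing in for Transactions.csv, and the two supp values are chosen to show the effect, not taken from the original data set. It uses itemFrequency from arules to find the largest item support, then shows that a threshold above it returns nothing.

```r
library(arules)

# Hypothetical toy baskets (not the data from the question)
baskets <- list(
  c("A", "B"), c("A"), c("B", "C"), c("D"), c("A", "B"),
  c("E"), c("F"), c("G"), c("H"), c("I")
)
toy <- as(baskets, "transactions")

# Relative support of every single item: share of transactions containing it
item_support <- itemFrequency(toy, type = "relative")
max_support <- max(item_support)
print(sort(item_support, decreasing = TRUE))

# A threshold above the largest item support cannot match any itemset,
# because no itemset is more frequent than its most frequent item
none <- eclat(toy, parameter = list(supp = 0.5),
              control = list(verbose = FALSE))

# A threshold below it returns the frequent itemsets
some <- eclat(toy, parameter = list(supp = 0.15),
              control = list(verbose = FALSE))

print(length(none))
print(length(some))
```

On the real data, max(itemFrequency(trans)) should give the same upper bound that the sorted output below shows in its first row.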
inspect(head(sort(f, by = "support"), 10))
#      items             support count
# [1]  {3060620}         0.08239  1305
# [2]  {3060619}         0.07260  1150
# [3]  {3061124}         0.05688   901
# [4]  {3060618}         0.05663   897
# [5]  {4027039}         0.04975   788
# [6]  {3060617}         0.04564   723
# [7]  {3061697}         0.04306   682
# [8]  {3060619,3060620} 0.03087   489
# [9]  {3039715}         0.02727   432
# [10] {3045117}         0.02708   429
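As for the tidLists appearing to group orders rather than items: the tidLists of an eclat result have one row per frequent itemset and one column per transaction, so they list which transactions support each itemset, not the itemsets themselves. The following is a minimal sketch of pulling the itemsets out instead. The baskets are hypothetical toy data and supp = 0.3 is picked only to make the toy example produce a few itemsets.

```r
library(arules)

# Hypothetical toy baskets (not the data from the question)
baskets <- list(
  c("milk", "bread"),
  c("milk", "bread", "eggs"),
  c("milk"),
  c("bread"),
  c("eggs")
)
toy <- as(baskets, "transactions")

f <- eclat(toy, parameter = list(supp = 0.3, tidLists = TRUE),
           control = list(verbose = FALSE))

# The itemsets themselves, as a list of character vectors
itemsets <- as(items(f), "list")
print(itemsets)

# Itemsets with their quality measures in a plain data frame
result <- as(sort(f, by = "support"), "data.frame")
print(result)

# tidLists: rows are itemsets, columns are transactions
tid_dim <- dim(tidLists(f))
print(tid_dim)
```

The data frame in result can be passed to write.csv if the itemsets need to be exported.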
Source: https://stackoverflow.com/questions/50552493/zero-frequent-items-when-using-the-eclat-to-mine-frequent-itemsets