Question
I have a dataset (a CSV file) in which I want to find frequent itemsets using the Apriori algorithm.
col1, col2, col3
bread, butter,?
coke, bread, butter
I am using WEKA for this purpose. The output is in the following format:
...
Large Itemsets L(2):
col1=bread col2= butter 1
col1=coke col2= bread 1
col1=coke col3= butter 1
col2= bread col3= butter 1
...
But the output that I want is:
bread, butter 2
Basically, the output above counts items regardless of the column they belong to. How can I achieve this kind of output?
Answer 1:
Format your data differently.
Weka expects one column per product, with the value t or f (true or false) saying whether the row contains that product. Then you get itemsets of the kind milk=t -> butter=t.
See the .arff examples included with Weka.
I think I saw an ELKI example using your input format.
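As a rough illustration of the reformatting suggested above, here is a minimal Python sketch that uses only the standard library, not Weka. It assumes that "?" marks a missing value, as in the sample data, and that the first row is a header. The names RAW, read_transactions, to_item_columns and pair_support are made up for this example. It turns each row into a set of items, builds the one-column-per-product t/f table, and counts pairs directly to confirm the expected support.

```python
import csv
import io
from collections import Counter
from itertools import combinations

# Sample data copied from the question; "?" is assumed to mean "missing".
RAW = """col1, col2, col3
bread, butter,?
coke, bread, butter
"""

def read_transactions(text):
    """Return one set of items per row, ignoring which column an item was in."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    next(reader)  # skip the header row (col1, col2, col3)
    transactions = []
    for row in reader:
        items = {cell.strip() for cell in row}
        items -= {"", "?"}  # drop empty cells and missing-value markers
        if items:
            transactions.append(items)
    return transactions

def to_item_columns(transactions):
    """One column per product: 't' if the row contains it, 'f' otherwise."""
    products = sorted(set().union(*transactions))
    rows = [["t" if p in t else "f" for p in products] for t in transactions]
    return products, rows

def pair_support(transactions):
    """Count how many transactions contain each pair of items."""
    counts = Counter()
    for t in transactions:
        counts.update(combinations(sorted(t), 2))
    return counts

transactions = read_transactions(RAW)
products, rows = to_item_columns(transactions)

# Print the reformatted table as CSV: a header of product names, then t/f rows.
print(",".join(products))
for row in rows:
    print(",".join(row))

# Support of {bread, butter}, independent of the original columns.
print(pair_support(transactions)[("bread", "butter")])  # prints 2
```

The printed table still has to be saved as a CSV or .arff file before Weka can load it. That step is left out here, as are any Weka-specific settings.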
Source: https://stackoverflow.com/questions/35741464/how-to-find-frequent-itemset-irrespective-of-attribute-name