How to find frequent itemsets irrespective of attribute name?

*爱你&永不变心* submitted on 2019-12-12 04:18:11

Question


I have a dataset (a CSV file) in which I want to find frequent itemsets using the Apriori algorithm.

col1, col2, col3
bread, butter,?
coke, bread, butter

I am using WEKA for this purpose. The output is in the following format:

...
Large Itemsets L(2):
col1=bread  col2= butter 1
col1=coke  col2= bread 1
col1=coke  col3= butter 1
col2= bread  col3= butter 1
...

But the output that I want is:

bread, butter 2

Basically, the desired output is independent of the column that each item belongs to. How can I achieve this kind of output?


Answer 1:


Format your data differently.

Weka expects one column per product, with the value t or f (true or false) saying whether that product is in the transaction. Then you get itemsets of the kind milk=t -> butter=t.
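
As a rough sketch of that reformatting, the Python snippet below turns rows of item names into a table with one column per product. The function name `transactions_to_binary` is hypothetical, and marking absent items with `?` (Weka's missing-value marker) instead of `f` is an assumption; with `f`, Apriori would also count "item absent" as a value.

```python
import csv
import io

def transactions_to_binary(rows, present="t", absent="?"):
    """Turn rows of item names into a one-column-per-item table.

    Hypothetical helper: the 'present'/'absent' markers are assumptions.
    """
    # Clean each row: strip spaces, drop empty cells and '?' placeholders.
    baskets = [
        {cell.strip() for cell in row if cell.strip() not in ("", "?")}
        for row in rows
    ]
    # Every distinct item becomes a column, in alphabetical order.
    items = sorted(set().union(*baskets))
    table = [
        [present if item in basket else absent for item in items]
        for basket in baskets
    ]
    return items, table

# The sample data from the question, read as CSV.
raw = "col1, col2, col3\nbread, butter,?\ncoke, bread, butter\n"
reader = csv.reader(io.StringIO(raw))
next(reader)  # skip the col1, col2, col3 header
items, table = transactions_to_binary(list(reader))

# Write the converted table back out as CSV text.
out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
writer.writerow(items)
writer.writerows(table)
converted = out.getvalue()
```

Here `converted` holds a header row `bread,butter,coke` followed by one `t`/`?` row per transaction, which could then be saved and loaded into Weka.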

See the .arff examples included with Weka.

I think I saw an ELKI example using your input format.
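
For a small file, the column-independent counts can also be checked outside Weka. The sketch below is a brute-force count of item combinations per transaction, not Apriori with candidate pruning, and `count_itemsets` is a hypothetical name; it only illustrates what "irrespective of column" means for the sample data.

```python
from collections import Counter
from itertools import combinations

def count_itemsets(baskets, size):
    """Count every item combination of the given size, ignoring columns."""
    counts = Counter()
    for basket in baskets:
        # Treat the row as a set of items: strip spaces, drop '?' and blanks.
        items = sorted({i.strip() for i in basket if i.strip() not in ("", "?")})
        counts.update(combinations(items, size))
    return counts

# The two transactions from the question, as read from the CSV.
baskets = [["bread", " butter", "?"], ["coke", " bread", " butter"]]
pair_counts = count_itemsets(baskets, 2)

# Keep only pairs that occur in at least 2 transactions.
frequent = {pair: n for pair, n in pair_counts.items() if n >= 2}
for pair, n in frequent.items():
    print(", ".join(pair), n)
```

With a minimum support count of 2, the only pair that survives is bread and butter, which matches the output asked for in the question.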



Source: https://stackoverflow.com/questions/35741464/how-to-find-frequent-itemset-irrespective-of-attribute-name
