Supermarket dataset for Apriori algorithm

白昼怎懂夜的黑 submitted on 2019-12-05 01:30:52

Question


'I have to develop software for the Business Analyst of the “Future Stores” supermarket. The software performs association rule mining on the given transactional data of supermarket sales and prepares a discounting policy by building combos. It uses a data mining algorithm, namely the Apriori algorithm. The association rules will be displayed in a user-friendly manner so that a discounting policy can be generated from the positive association rules.'

Where can I get a supermarket dataset to test the Apriori algorithm that I have coded?


Answer 1:


To get a market dataset, you can go to fimi.ua.ac.be/data/ and download the retail dataset.

It is an anonymized dataset of transactions from a Belgian store.

It is perfect for testing Apriori or other frequent itemset mining and association rule mining algorithms.
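
As a rough sketch of how such a file could be fed to your own code: the snippet below assumes the dataset is plain text with one transaction per line and items given as whitespace-separated integer IDs, and that it was saved locally as `retail.dat`. Both the file name and the helper names (`parse_transactions`, `item_supports`) are illustrative assumptions, not part of any library.

```python
from collections import Counter
from typing import Dict, FrozenSet, Iterable, List


def parse_transactions(lines: Iterable[str]) -> List[FrozenSet[int]]:
    """Parse one transaction per line (assumed format: space-separated item IDs)."""
    transactions = []
    for line in lines:
        fields = line.split()
        if fields:  # skip blank lines
            transactions.append(frozenset(int(field) for field in fields))
    return transactions


def item_supports(transactions: List[FrozenSet[int]]) -> Dict[int, float]:
    """Relative support of each single item: fraction of transactions containing it."""
    counts = Counter(item for t in transactions for item in t)
    total = len(transactions)
    return {item: count / total for item, count in counts.items()}


# Tiny in-memory sample in the assumed format (not real data from the dataset).
sample_lines = ["0 1 2", "1 2", "2 3", ""]
sample = parse_transactions(sample_lines)
supports = item_supports(sample)

# With a downloaded file (hypothetical local path):
# with open("retail.dat") as f:
#     transactions = parse_transactions(f)
```

The single-item supports are a quick sanity check before running a full Apriori pass, since they should match the first level of frequent itemsets your implementation reports.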




Answer 2:


Instead of looking for a real-world dataset, you should design a small, specific dataset for each unit test. The dataset should provide the minimal necessary precondition to verify a single feature of the system. This will make it easier to detect bugs, maintain tests over time, and demonstrate the capabilities and usage patterns of the system to other developers.

An example from a different domain would be tests for a User Subsystem that creates and validates logins to a website.

  • addsNewUser - empty dataset
  • throwsExceptionForDuplicateUsername - single-user dataset
  • correctPasswordPasses - same dataset
  • throwsExceptionForIncorrectUsername - same dataset
  • throwsExceptionForIncorrectPassword - same dataset
  • throwsExceptionWhenNewUsernameExists - two-user dataset
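
Applied to Apriori, the same idea might look like the sketch below: a four-transaction dataset whose frequent itemsets can be counted by hand, plus a brute-force counter to use as an oracle. The item names, the minimum support count of 2, and the function `brute_force_frequent` are all made up for illustration; in a real test you would call your own Apriori function and compare its output with `EXPECTED`.

```python
from itertools import combinations

# Hand-built dataset: small enough to verify every count by inspection.
TRANSACTIONS = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

# Frequent itemsets for a minimum support count of 2, counted by hand.
EXPECTED = {
    frozenset({"bread"}): 3,
    frozenset({"milk"}): 3,
    frozenset({"butter"}): 2,
    frozenset({"bread", "milk"}): 2,
    frozenset({"bread", "butter"}): 2,
}


def brute_force_frequent(transactions, min_count):
    """Enumerate every itemset and count it directly (exponential, test-only oracle)."""
    items = sorted(set().union(*transactions))
    frequent = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            count = sum(1 for t in transactions if itemset <= t)
            if count >= min_count:
                frequent[itemset] = count
    return frequent


oracle_result = brute_force_frequent(TRANSACTIONS, min_count=2)

# In a unit test, replace the oracle with your own implementation, e.g.:
# assert my_apriori(TRANSACTIONS, min_count=2) == EXPECTED
```

Here {milk, butter} appears in only one transaction, so a correct implementation must prune it and never report the three-item set.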

Update: If you need a very large dataset to perform integration or performance testing, you are probably left with writing a program to generate a random collection of purchases. I doubt any existing supermarkets are willing (or able) to part with their real datasets.
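
A minimal sketch of such a generator is shown below. The function name, its parameters, and the skewed popularity weights are assumptions chosen for illustration; the fixed seed only serves to make test runs repeatable.

```python
import random


def generate_transactions(n_transactions, n_items, max_basket_size, seed=0):
    """Generate random baskets of integer item IDs (synthetic data, not real sales)."""
    rng = random.Random(seed)  # seeded so the same arguments give the same data
    items = list(range(n_items))
    # Skewed weights: low-numbered items are drawn more often than high-numbered ones.
    weights = [1.0 / (rank + 1) for rank in range(n_items)]
    transactions = []
    for _ in range(n_transactions):
        size = rng.randint(1, max_basket_size)
        # Sampling with replacement, then de-duplicating, so a basket may end up
        # smaller than `size` but never empty.
        basket = set(rng.choices(items, weights=weights, k=size))
        transactions.append(sorted(basket))
    return transactions


baskets = generate_transactions(n_transactions=1000, n_items=50, max_basket_size=8)
```

Because items are drawn independently, data like this is useful for measuring speed and memory, but it contains no meaningful associations beyond item popularity. To check rule quality you would still need hand-built or real data.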

That being said, while working as a contractor for a health insurance provider many years ago (pre-HIPAA) I was given a sample dataset to work with. It contained real patient information including SSNs and confidential medical history. :(



Source: https://stackoverflow.com/questions/9754769/supermarket-dataset-for-apriori-algorithm
