Supermarket dataset for Apriori algorithm

守給你的承諾、 submitted on 2019-12-03 15:36:52

To get a market dataset, you can go to fimi.ua.ac.be/data/ and download the retail dataset.

It is an anonymized dataset of transactions from a Belgian store.

It is perfect for testing Apriori or other frequent itemset mining and association rule mining algorithms.
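As a rough sketch of how such a file could be read, the snippet below assumes that each line holds one transaction and that items are whitespace-separated integer IDs. That layout, the file name `retail.dat`, and the function names are assumptions for illustration, so check them against the file you actually download.

```python
def parse_transactions(lines):
    """Turn an iterable of text lines into a list of item-ID sets.

    Assumes one transaction per line, with items written as
    whitespace-separated integers. Blank lines are skipped.
    """
    transactions = []
    for line in lines:
        items = line.split()
        if items:
            transactions.append(frozenset(int(item) for item in items))
    return transactions


def load_transactions(path):
    """Read a transaction file from disk (the path is supplied by the caller)."""
    with open(path) as handle:
        return parse_transactions(handle)


# Hypothetical usage, assuming the download was saved as "retail.dat":
# transactions = load_transactions("retail.dat")
```

Keeping the parsing separate from the file access means the parser can be exercised with a few in-memory lines, without touching the real dataset.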

Instead of looking for a real-world dataset, you should design a small, specific dataset for each unit test. The dataset should provide the minimal necessary precondition to verify a single feature of the system. This will make it easier to detect bugs, maintain tests over time, and demonstrate the capabilities and usage patterns of the system to other developers.
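For a frequent itemset miner, such a test might look like the sketch below. The `support_count` function is a hypothetical stand-in for whatever piece of your Apriori implementation counts how many transactions contain an itemset; the three-transaction dataset is just large enough to check one behaviour.

```python
def support_count(transactions, itemset):
    """Count the transactions that contain every item in `itemset`.

    Hypothetical helper standing in for the code under test.
    """
    wanted = frozenset(itemset)
    return sum(1 for transaction in transactions if wanted <= set(transaction))


def test_pair_support_counts_only_transactions_with_both_items():
    # Minimal dataset: exactly two of the three baskets hold bread AND milk.
    transactions = [
        {"bread", "milk"},
        {"bread", "butter"},
        {"bread", "milk", "butter"},
    ]
    assert support_count(transactions, {"bread", "milk"}) == 2


def test_empty_dataset_has_zero_support():
    # The empty dataset is the smallest precondition for this check.
    assert support_count([], {"bread"}) == 0
```

Because the expected count can be verified by eye, a failure points directly at the counting logic rather than at the data.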

An example from a different domain would be tests for a User Subsystem that creates and validates logins to a website.

  • addsNewUser - empty dataset
  • throwsExceptionForDuplicateUsername - single-user dataset
  • correctPasswordPasses - same dataset
  • throwsExceptionForIncorrectUsername - same dataset
  • throwsExceptionForIncorrectPassword - same dataset
  • throwsExceptionWhenNewUsernameExists - two-user dataset
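A few of the tests in this list could be sketched as follows. The `UserStore` class, its method names, and the exception types are all invented for illustration and do not refer to any real library; passwords are stored in plain text only to keep the example short.

```python
class DuplicateUsername(Exception):
    pass


class UnknownUsername(Exception):
    pass


class IncorrectPassword(Exception):
    pass


class UserStore:
    """Hypothetical in-memory user subsystem (illustration only)."""

    def __init__(self, users=None):
        self._users = dict(users or {})

    def add_user(self, username, password):
        if username in self._users:
            raise DuplicateUsername(username)
        self._users[username] = password

    def check_login(self, username, password):
        if username not in self._users:
            raise UnknownUsername(username)
        if self._users[username] != password:
            raise IncorrectPassword(username)
        return True


def single_user_store():
    """The single-user dataset shared by several tests."""
    return UserStore({"alice": "s3cret"})


def raises(expected, func, *args):
    """Return True if calling func(*args) raises `expected`."""
    try:
        func(*args)
    except expected:
        return True
    return False


def test_adds_new_user():
    store = UserStore()  # empty dataset
    store.add_user("alice", "s3cret")
    assert store.check_login("alice", "s3cret")


def test_throws_exception_for_duplicate_username():
    store = single_user_store()
    assert raises(DuplicateUsername, store.add_user, "alice", "other")


def test_throws_exception_for_incorrect_password():
    store = single_user_store()
    assert raises(IncorrectPassword, store.check_login, "alice", "wrong")
```

Each test builds only the data it needs, so the dataset itself documents the precondition being checked.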

Update: If you need a very large dataset to perform integration or performance testing, you are probably left with writing a program to generate a random collection of purchases. I doubt any existing supermarkets are willing (or able) to part with their real datasets.
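A minimal sketch of such a generator is shown below. The function name, its parameters, and the choice of uniform random sampling are assumptions made for illustration. Uniformly random baskets contain no meaningful correlations between items, so data like this is suited to load and performance testing rather than to judging the quality of the rules found.

```python
import random


def generate_transactions(n_transactions, n_items, max_basket_size, seed=0):
    """Generate random baskets of distinct integer item IDs.

    Each basket holds between 1 and `max_basket_size` distinct items drawn
    uniformly from range(n_items). A fixed seed makes runs reproducible.
    """
    if not 1 <= max_basket_size <= n_items:
        raise ValueError("max_basket_size must be between 1 and n_items")
    rng = random.Random(seed)
    transactions = []
    for _ in range(n_transactions):
        size = rng.randint(1, max_basket_size)
        basket = rng.sample(range(n_items), size)  # distinct items
        transactions.append(sorted(basket))
    return transactions


# Example: 100,000 baskets over a catalogue of 5,000 items.
# big_dataset = generate_transactions(100_000, 5_000, 20, seed=42)
```

Seeding the generator keeps a performance test repeatable, so a slowdown between runs reflects a change in the code rather than a change in the data.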

That being said, while working as a contractor for a health insurance provider many years ago (pre-HIPAA) I was given a sample dataset to work with. It contained real patient information including SSNs and confidential medical history. :(
