Question
So a recent question made me aware of the rather cool apriori algorithm. I can see why it works, but what I'm not sure about is practical uses. Presumably the main reason to compute related sets of items is to be able to provide recommendations for someone based on their own purchases (or owned items, etcetera). But how do you go from a set of related sets of items to individual recommendations?
The Wikipedia article finishes:
The second problem is to generate association rules from those large itemsets with the constraint of minimal confidence. Suppose one of the large itemsets is Lk, Lk = {I1, I2, … , Ik}; association rules with this itemset are generated in the following way: the first rule is {I1, I2, … , Ik-1} ⇒ {Ik}, and by checking the confidence this rule can be determined as interesting or not. Other rules are then generated by deleting the last item in the antecedent and inserting it into the consequent, and the confidences of the new rules are checked to determine their interestingness. This process iterates until the antecedent becomes empty.
I'm not sure how the set of association rules helps in determining the best set of recommendations either, though. Perhaps I'm missing the point, and apriori is not intended for this use? In which case, what is it intended for?
Answer 1:
So the Apriori algorithm is no longer the state of the art for Market Basket Analysis (aka Association Rule Mining). The techniques have improved, though the Apriori principle (that the support of a subset upper-bounds the support of the set) is still a driving force.
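To make the principle concrete, here is a toy sketch of how it prunes the candidate search: since any superset of an infrequent itemset must itself be infrequent, we only need to build candidate 2-itemsets from frequent 1-itemsets. The transactions and threshold below are invented for illustration.

```python
from itertools import combinations

# Invented toy dataset: each transaction is a set of purchased items.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"beer", "chips"},
]
min_support = 2  # absolute count threshold, chosen arbitrarily

def count(itemset):
    """Number of transactions containing every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)

# Frequent 1-itemsets.
items = {i for t in transactions for i in t}
frequent1 = [frozenset({i}) for i in items if count(frozenset({i})) >= min_support]

# Candidate 2-itemsets: only unions of frequent 1-itemsets need checking,
# because a superset of an infrequent itemset cannot be frequent (Apriori principle).
candidates = {a | b for a, b in combinations(frequent1, 2)}
frequent2 = [c for c in candidates if count(c) >= min_support]

print(sorted(map(sorted, frequent2)))
# [['bread', 'butter'], ['bread', 'milk']]
```

Real implementations iterate this candidate-generation/pruning step for k = 3, 4, … until no frequent itemsets remain.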
In any case, the way association rules are used to generate recommendations is that, given some history itemset, we check each rule's antecedent to see if it is contained in the history. If so, we can recommend the rule's consequent (eliminating cases where the consequent is already contained in the history, of course).
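That matching step can be sketched in a few lines. This is a minimal illustration, assuming the rules have already been mined; the rules and history below are made-up examples.

```python
# Each mined rule is a (antecedent, consequent) pair of frozensets.
rules = [
    (frozenset({"bread", "butter"}), frozenset({"milk"})),
    (frozenset({"beer"}), frozenset({"chips"})),
    (frozenset({"milk"}), frozenset({"cereal"})),
]

def recommend(history, rules):
    """Recommend consequents of rules whose antecedent is contained
    in the user's history, skipping items the user already has."""
    history = set(history)
    recs = set()
    for antecedent, consequent in rules:
        if antecedent <= history:          # antecedent is a subset of the history
            recs |= consequent - history   # drop items already in the history
    return recs

print(recommend({"bread", "butter", "beer"}, rules))
```

Here the first two rules fire (the user has bread+butter and beer), so milk and chips get recommended; the third rule does not, since milk is not in the history.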
We can use various metrics to rank our recommendations, since with a multitude of rules we may get many hits when comparing them to a history, and we can only make a limited number of recommendations. Some useful metrics are the support of a rule (the support of the union of the antecedent and the consequent), the confidence of a rule (the support of the rule divided by the support of the antecedent), and the lift of a rule (the support of the rule divided by the product of the supports of the antecedent and the consequent), among others.
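These three metrics are easy to compute directly from the transaction data. A small sketch, using an invented four-transaction dataset:

```python
# Invented toy dataset: each transaction is a set of items.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"beer", "chips"},
    {"bread", "milk"},
]

def support(itemset):
    """Fraction of transactions containing the itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def confidence(antecedent, consequent):
    """support(rule) / support(antecedent)."""
    return support(antecedent | consequent) / support(antecedent)

def lift(antecedent, consequent):
    """confidence(rule) / support(consequent); > 1 means positive association."""
    return confidence(antecedent, consequent) / support(consequent)

a, c = frozenset({"bread", "butter"}), frozenset({"milk"})
print(support(a | c), confidence(a, c), lift(a, c))
# 0.25 0.5 1.0
```

A lift of 1 means the antecedent gives no information about the consequent beyond its base rate, so in practice you would rank candidate recommendations by confidence or lift and cut off at some threshold.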
Answer 2:
If you want some details about how Apriori can be used for classification, you could read the paper about the CBA algorithm:
Bing Liu, Wynne Hsu, Yiming Ma, "Integrating Classification and Association Rule Mining." Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining (KDD-98, Plenary Presentation), New York, USA, 1998
Source: https://stackoverflow.com/questions/1255663/using-the-apriori-algorithm-for-recommendations