Question
I would like to mine rules with a specific rhs. There is an example in the documentation which demonstrates that this is possible, but only for a specific case (as we see below). First, a data set to illustrate my problem:
input <- matrix( c( rep(10001,6) , rep(10002,3) , rep(10003,3), 100001,100002,100003,100004,100005,100006,100002,100003,100007,100002,100003,100008,rep('a',6),rep('b',6)), ncol=3)
colnames(input) <- c(letters[1:3])
input <- as.data.frame(input, stringsAsFactors = TRUE)
Now I can create rules:
library(arules)
r <- apriori(input)
To see the rules:
inspect(r)
I would like to only mine rules that have b=... on the rhs. For specific values this can be done by adding:
appearance = list(rhs = c("b=100001", "b=100002"),default="lhs")
to the apriori command. I will also have to adjust the confidence if I want to find them, of course. The problem lies in the number of elements in column b. I can manually type all the elements in the "b=....." format in this example, but I can't in my own data.
I tried to get the values of b using unique() and then passing them as the rhs, but that generates an error because I pass values like "100001" "100002" instead of "b=100001" "b=100002".
Is there a way to only get rules whose rhs comes from a specific column?
If not, is there an easy way to generate 'want' from 'current'?
current <- c("100001", "100002", "100003", "100004", "100005", "100006", "100007", "100008")
want <- c("b=100001", "b=100002", "b=100003", "b=100004", "b=100005", "b=100006", "b=100007", "b=100008")
Somewhat related is this question: "Creating specific rules with arules in r". But that has the same problem for me, only in a different way.
Answer 1:
You can use subset:
r <- apriori(input, parameter = list(support = 0.1, confidence = 0.1))
inspect( subset( r, subset = rhs %pin% "b=" ) )
#   lhs      rhs        support   confidence lift
# 1 {}    => {b=100002} 0.2500000 0.2500000  1.000000
# 2 {}    => {b=100003} 0.2500000 0.2500000  1.000000
# 3 {c=b} => {b=100002} 0.1666667 0.3333333  1.333333
# 4 {c=b} => {b=100003} 0.1666667 0.3333333  1.333333
For your second question, you can use paste0:
paste0( "b=", current )
# [1] "b=100001" "b=100002" "b=100003" "b=100004" "b=100005" "b=100006" "b=100007"
# [8] "b=100008"
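The two pieces can also be combined so that the labels never have to be typed by hand. The following is only a minimal sketch: it assumes the example data from the question, assumes that arules labels items from a data frame as column=value, and reuses the support and confidence thresholds from the subset example above. The name bItems is just an illustrative choice.

```r
library(arules)

# Rebuild the example data from the question; stringsAsFactors = TRUE
# makes every column a factor before it is handed to apriori().
input <- matrix(c(rep(10001, 6), rep(10002, 3), rep(10003, 3),
                  100001, 100002, 100003, 100004, 100005, 100006,
                  100002, 100003, 100007, 100002, 100003, 100008,
                  rep("a", 6), rep("b", 6)), ncol = 3)
colnames(input) <- letters[1:3]
input <- as.data.frame(input, stringsAsFactors = TRUE)

# Build the "b=<value>" labels directly from the distinct values of column b.
bItems <- paste0("b=", unique(input$b))

# Restrict the rhs to those items; every other item may only appear on the lhs.
rules <- apriori(input,
                 parameter = list(support = 0.1, confidence = 0.1),
                 appearance = list(rhs = bItems, default = "lhs"))
inspect(rules)
```

Unlike subset, this restricts the rhs during mining instead of filtering afterwards, which may matter on larger data.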
Answer 2:
The arules documentation now has an example that does exactly what you want:
trans <- as(input, "transactions")  # itemLabels() needs a transactions object, not a data.frame
bItems <- grep("^b=", itemLabels(trans), value = TRUE)
rules <- apriori(trans, parameter = list(support = 0.1, confidence = 0.1),
                 appearance = list(rhs = bItems))
I haven't actually tested this with your example code (the arules documentation example uses a transactions object, not a data.frame, hence the coercion on the first line), but grep-ing those item labels should work out.
Source: https://stackoverflow.com/questions/18314800/r-arules-mine-only-rules-from-specific-column