Question
This is my actions_slippers data set. Please note that this is just part of it:
X_id cd iios ui w
1 56548c6ab65dd425cc3dda13 2015-11-24T16:12:26.572Z 194635691 563734c3b65dd40e340eaa56 0.010
2 56548df4b84c321fe4cdfb91 2015-11-24T16:19:00.798Z 194153563 56548df4b84c321fe4cdfb8f 0.010
3 56548fc7735e782a88591662 2015-11-24T16:26:46.952Z 177382028 563e12657d4c410c5832579c 0.010
4 565494e1b84c321fe4ce2f44 2015-11-24T16:48:33.828Z 177382031 563e12657d4c410c5832579c 0.010
5 5654994a735e782a88595802 2015-11-24T17:07:18.269Z 195129144 56549946735e782a885957e6 0.080
6 56549ce2b65dd425cc3e550c 2015-11-24T17:22:42.775Z 196972549 565181854c24b410e4891e11 0.010
7 56549f9bb84c321fe4ce7a3a 2015-11-24T17:34:19.732Z 194153563 56549f9bb84c321fe4ce7a37 0.010
8 5654a35a735e782a8859a055 2015-11-24T17:50:18.068Z 196258704 5654a35a735e782a8859a053 0.010
9 5654a5bab8e3a9227cffd593 2015-11-24T18:00:26.102Z 194907960 56320e0e55e89c3e14e26d3d 0.010
10 5654a7bb735e782a8859c495 2015-11-24T18:08:59.476Z 196950156 5651b53fec231f1df8482d23 0.027
iios: unique id for the item. This field relates items in these files to the items files.
ui: unique id for the user.
w: 0.01 means the user looked at the item, 0.08 means the user added the item to the cart, and 0.027 means a purchase.
What I want to do with this data is build a function that, whenever a purchase happens (w = 0.027), returns the top 8 users who are most similar to the user who purchased the item, using the cosine similarity formula.
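For reference, cosine similarity between two users' weight vectors is usually defined as follows. The symbols $u$, $v$ and the item index $k$ are notation introduced here for illustration, not names from the data:

```latex
\[
\mathrm{sim}(u, v) = \frac{\sum_{k} u_k v_k}{\sqrt{\sum_{k} u_k^2}\,\sqrt{\sum_{k} v_k^2}}
\]
```

Here $u_k$ would be the summed w of user $u$ for item $k$, so two users who never touched the same item get a similarity of 0.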
I have tried the following code so far, but I could not get the vector of the user who purchased in order to compare it with the other users:
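library(reshape2)  # dcast()
library(lsa)       # cosine(); assumed source, the package is not named in the post

user_sim = function(actions_slippers) {
  for (i in 1:nrow(actions_slippers)) {
    if (actions_slippers$w[i] == 0.027) {   # a purchase happened at row i
      user_id = actions_slippers$ui[i]      # the purchasing user (not used below yet)
      # all actions up to and including the purchase
      mydf <- data.frame(
        ui = actions_slippers$ui[1:i],
        w = actions_slippers$w[1:i],
        iios = factor(actions_slippers$iios[1:i],
                      levels = unique(actions_slippers$iios)))
      # user x item table of summed weights
      action = dcast(mydf, formula = ui ~ iios,
                     fill = 0, value.var = "w",
                     fun.aggregate = sum, drop = FALSE)
      # similarity of every user to the user in the FIRST row,
      # not to the purchasing user (this is the part I cannot solve)
      p = as.vector(cosine(t(action[, 2:ncol(action)]))[, 1])
      wts = action[, -1]
      u_sim = wts * p
      col_sum = colSums(u_sim)
      t_sum = colSums(wts)
      res1 = col_sum / t_sum
      newdata1 = data.frame(res1)
      max1 <- newdata1[order(res1[1:8], decreasing = TRUE), ]
    }
  }
}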
I used the reshape2 package to get the action data frame, which looks like the one below. My problem is that I can't get the vector of the user who purchased, which I then need in order to calculate the cosine similarity between that user and each of the other users.
In my code I calculate cosine similarity between the first row of the action data frame below and each of the other rows, but as I mentioned above, I need to get the vector of the purchasing user and calculate cosine similarity between that vector and each of the other users.
ui 194635691 194153563 177382028 177382031 195129144 196972549 196258704 194907960 196950156 194139014 153444738 192982501 192891196
1 237 0.01 0.01 0.01 0.00 0.00 0.00 0.00 0.01 0.000 0 0 0 0
2 261 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.000 0 0 0 0
3 290 0.00 0.00 0.01 0.01 0.00 0.00 0.00 0.00 0.000 0 0 0 0
4 483 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.000 0 0 0 0
5 485 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.027 0 0 0 0
6 533 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.000 0 0 0 0
7 534 0.00 0.00 0.00 0.00 0.01 0.00 0.00 0.00 0.000 0 0 0 0
8 535 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.000 0 0 0 0
9 536 0.00 0.00 0.00 0.00 0.00 0.00 0.01 0.00 0.000 0 0 0 0
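To make the goal concrete, here is a minimal base-R sketch of the comparison described above: take one user's row from a table shaped like the one shown, compute cosine similarity against every other row, and keep the top n. The function name `top_similar_users`, its arguments, and the small toy table (a subset of the rows and columns above) are hypothetical and only for illustration.

```r
# Hypothetical helper: rank users by cosine similarity to one target user.
# `action` is assumed to be a user x item table with the user id in column "ui".
top_similar_users <- function(action, target_ui, n = 8) {
  m <- as.matrix(action[, names(action) != "ui", drop = FALSE])
  idx <- match(target_ui, action$ui)          # row of the target user
  if (is.na(idx)) stop("target user not found in action table")
  norms <- sqrt(rowSums(m^2))                 # vector length of each user
  sims <- as.vector(m %*% m[idx, ]) / (norms * norms[idx])
  sims[is.nan(sims)] <- 0                     # assumption: all-zero users count as 0
  res <- data.frame(ui = action$ui[-idx], similarity = sims[-idx])
  res <- res[order(res$similarity, decreasing = TRUE), ]
  head(res, n)
}

# Toy table built from a few rows and columns of the table above.
items <- c("194635691", "194153563", "177382028",
           "177382031", "194907960", "196950156")
m <- matrix(c(0.01, 0.01, 0.01, 0.00, 0.01, 0.000,
              0.01, 0.00, 0.00, 0.00, 0.00, 0.000,
              0.00, 0.00, 0.01, 0.01, 0.00, 0.000,
              0.00, 0.00, 0.00, 0.00, 0.00, 0.027),
            nrow = 4, byrow = TRUE, dimnames = list(NULL, items))
action <- data.frame(ui = c(237, 261, 290, 485), m, check.names = FALSE)

top_similar_users(action, target_ui = 237)
```

In the real data the target would be the ui found at the row where w is 0.027. Note that in this excerpt user 485 has touched only the purchased item, so their similarity to every other user shown is 0.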
Can anyone tell me what I should do? Many thanks.
Source: https://stackoverflow.com/questions/34407923/how-to-build-a-cosine-similarity-function-in-r