How can I compare a group of tags to another post\'s tags in my database to get related posts?
What I\'m trying to do is compare a group of tags
Put the tags into an array. Each array being respectively called Post A / Post B etc.
Then use array_diff_assoc()
, to figure out how different the arrays are.
But really, Ivars solution would work better, this is easier to understand though :)
NOTE: This solution is MySQL only, as MySQL has its own interpretation of GROUP BY
I've also used my own calculation of similarity. I've taken the number of identical tags and divided it by the average tag count in post A and post B. So if post A has 4 tags, and post B has 2 tags which are both shared with A, the similarity is 66%.
(SHARED:2 / ((A:4 + B:2)/2)
or (SHARED:2) / (AVG:3)
It should be easy to change the formula if you want/need to...
SELECT
sourcePost.id,
targetPost.id,
/* COUNT NUMBER OF IDENTICAL TAGS */
/* REF GROUPING OF sourcePost.id and targetPost.id BELOW */
COUNT(targetPost.id) /
(
(
/* TOTAL TAGS IN SOURCE POST */
(SELECT COUNT(*) FROM post_tags WHERE post_id = sourcePost.id)
+
/* TOTAL TAGS IN TARGET POST */
(SELECT COUNT(*) FROM post_tags WHERE post_id = targetPost.id)
) / 2 /* AVERAGE TAGS IN SOURCE + TARGET */
) as similarity
FROM
posts sourcePost
LEFT JOIN
post_tags sourcePostTags ON (sourcePost.id = sourcePostTags.post_id)
INNER JOIN
post_tags targetPostTags ON (sourcePostTags.tag_id = targetPostTags.tag_id
AND
sourcePostTags.post_id != targetPostTags.post_id)
LEFT JOIN
posts targetPost ON (targetPostTags.post_id = targetPost.id)
GROUP BY
sourcePost.id, targetPost.id