Question
I'm having a hard time getting to the right query (or queries) here. When you want to find related items using tags in MySQL, you can use a 'common tag count' to find the items that are most similar.
Say my schema looks like this:
- tags(tag_id, title)
- articles(article_id, some_text)
- articles_tags(tag_id, article_id)
Then you can get the other articles, sorted by the number of tags they have in common with 'article 2' for example, like this:
SELECT at1.article_id, Count(at1.tag_id) AS common_tag_count
FROM articles_tags AS at1 INNER JOIN articles_tags AS at2 ON at1.tag_id = at2.tag_id
WHERE at2.article_id = 2
GROUP BY at1.article_id
HAVING at1.article_id != 2
ORDER BY common_tag_count DESC;
But in my situation, there's a challenge. I want to find similar articles based on multiple articles instead of one (something like a "read history"). And if 2 of those articles both have tag X, I want tag X to become more important.
So basically, I'm looking for a way to do a common_tag_count match, but with a weight per tag. Does anyone have an idea how to accomplish this?
Answer 1:
To get the tags used by the read articles, including how often each tag is used among them, you can use this query:
SELECT tag_id, COUNT(article_id) as tag_weight
FROM articles_tags
WHERE article_id IN ( /* Read articles */ 1, 2 )
GROUP BY tag_id;
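As a quick check of what this query returns, here is a minimal sketch that runs the same statement from Python. It assumes SQLite (via the standard `sqlite3` module) as a stand-in for MySQL, and the sample rows and the helper name `tag_weights` are made up for illustration; they are not taken from the linked demo.

```python
import sqlite3

# Hypothetical sample data as (tag_id, article_id) pairs.
# Article 1 has tags 1 and 2, article 2 has tag 1, and so on.
ROWS = [(1, 1), (2, 1), (1, 2), (3, 3), (1, 4)]


def tag_weights(read_ids):
    """Return (tag_id, tag_weight) rows for the given read articles."""
    conn = sqlite3.connect(":memory:")  # SQLite here, MySQL in the original
    conn.execute("CREATE TABLE articles_tags (tag_id INTEGER, article_id INTEGER)")
    conn.executemany("INSERT INTO articles_tags VALUES (?, ?)", ROWS)

    # One "?" placeholder per read article, so the ids are bound, not interpolated.
    placeholders = ", ".join("?" * len(read_ids))
    sql = f"""
        SELECT tag_id, COUNT(article_id) AS tag_weight
        FROM articles_tags
        WHERE article_id IN ({placeholders})
        GROUP BY tag_id
        ORDER BY tag_id
    """
    result = conn.execute(sql, read_ids).fetchall()
    conn.close()
    return result


print(tag_weights([1, 2]))  # [(1, 2), (2, 1)]
```

With articles 1 and 2 as the read history, tag 1 appears in both and gets a weight of 2, while tag 2 appears only in article 1 and gets a weight of 1.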
To get the similar articles based on that selection, use the query above as a subquery in a join similar to the one you already have:
SELECT articles.article_id, articles.some_text, SUM(tag_weights.tag_weight)
FROM articles
JOIN articles_tags ON articles_tags.article_id = articles.article_id
JOIN (
SELECT tag_id, COUNT(article_id) as tag_weight
FROM articles_tags
WHERE article_id IN ( /* Read articles */ 1, 2 )
GROUP BY tag_id
) AS tag_weights ON articles_tags.tag_id = tag_weights.tag_id
WHERE articles.article_id NOT IN ( /* Read articles */ 1, 2 )
GROUP BY articles.article_id
ORDER BY SUM(tag_weights.tag_weight) DESC;
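We're adding an extra JOIN here on the subquery, which supplies the tag weights. Each candidate article's score is the sum of the weights of the tags it shares with the read articles, and the ORDER BY ... DESC puts the 'best' results first.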
Demo: http://www.sqlfiddle.com/#!2/b35432/2/1 (articles 1 and 2 are read, giving tag 1 a weight of 2, tag 2 a weight of 1).
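The weighted ranking can also be tried locally. The following is a rough sketch under a few assumptions: SQLite (Python's `sqlite3` module) is used in place of MySQL, the sample data and the function name `similar_articles` are hypothetical, and an extra `article_id` tie-breaker is added to the ORDER BY only to make the output deterministic.

```python
import sqlite3

# Hypothetical sample data: article_id -> list of tag_ids.
ARTICLE_TAGS = {1: [1, 2], 2: [1], 3: [1, 2], 4: [2], 5: [3], 6: [1]}


def similar_articles(read_ids):
    """Rank unread articles by the summed weight of the tags they share."""
    conn = sqlite3.connect(":memory:")  # SQLite here, MySQL in the original
    conn.execute("CREATE TABLE articles (article_id INTEGER PRIMARY KEY, some_text TEXT)")
    conn.execute("CREATE TABLE articles_tags (tag_id INTEGER, article_id INTEGER)")
    for article_id, tag_ids in ARTICLE_TAGS.items():
        conn.execute("INSERT INTO articles VALUES (?, ?)", (article_id, f"text {article_id}"))
        conn.executemany(
            "INSERT INTO articles_tags VALUES (?, ?)",
            [(tag_id, article_id) for tag_id in tag_ids],
        )

    placeholders = ", ".join("?" * len(read_ids))
    sql = f"""
        SELECT a.article_id, SUM(w.tag_weight) AS score
        FROM articles AS a
        JOIN articles_tags AS atag ON atag.article_id = a.article_id
        JOIN (
            SELECT tag_id, COUNT(article_id) AS tag_weight
            FROM articles_tags
            WHERE article_id IN ({placeholders})
            GROUP BY tag_id
        ) AS w ON atag.tag_id = w.tag_id
        WHERE a.article_id NOT IN ({placeholders})
        GROUP BY a.article_id
        ORDER BY score DESC, a.article_id  -- tie-breaker is an addition
    """
    # The read ids are bound twice: once in the subquery, once in the outer WHERE.
    result = conn.execute(sql, list(read_ids) + list(read_ids)).fetchall()
    conn.close()
    return result


print(similar_articles([1, 2]))  # [(3, 3), (6, 2), (4, 1)]
```

Here tag 1 weighs 2 and tag 2 weighs 1, so article 3 (tags 1 and 2) scores 3, article 6 (tag 1) scores 2, and article 4 (tag 2) scores 1. Article 5 shares no tag with the read articles, so the inner join leaves it out of the result entirely.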
Source: https://stackoverflow.com/questions/25188948/sorting-items-on-matching-tags-that-have-a-weight-in-mysql