SQL query to determine the most similar goods by tags

余生分开走 2021-02-11 07:48

I'm making an e-store, so I have 3 tables:

1) goods

id      | title
--------+----------- 
1       | Toy car
2       | Toy pony
3       | Do         


        
3 answers
  • 2021-02-11 08:33

    Some help:

    Assuming you are looking for the goods most similar to goods #1:

    SELECT a.*
    FROM (SELECT * FROM goods WHERE id <> 1) a
    LEFT JOIN (SELECT z.goods_id, COUNT(*) AS total
               FROM links z
               WHERE z.goods_id <> 1
                 AND z.tag_id IN (SELECT tag_id FROM links WHERE goods_id = 1)
               GROUP BY z.goods_id) b
      ON a.id = b.goods_id
    -- goods with no common tags have a NULL total; treat it as 0 so they sort last
    ORDER BY COALESCE(b.total, 0) DESC
    

    However, you could try something slightly different: instead of ordering by the number of common tags, order by the ratio of common tags. This keeps products with many tags from always appearing at the top of the ranking even when only a small share of their tags is in common.
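
The answer does not say which ratio it has in mind. One possible reading is the Jaccard index: common tags divided by the number of distinct tags across both items. The sketch below is only an illustration under that assumption. It runs the SQL through Python's `sqlite3` module (rather than MySQL) so that it is self-contained. The sample rows and the titles 'Item C' and 'Item D' are made up; only the `goods(id, title)` and `links(goods_id, tag_id)` column names come from the thread. It also assumes `links` holds no duplicate (goods_id, tag_id) pairs.

```python
import sqlite3

# Hypothetical sample data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE goods (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE links (goods_id INTEGER, tag_id INTEGER);
    INSERT INTO goods VALUES
        (1, 'Toy car'), (2, 'Toy pony'), (3, 'Item C'), (4, 'Item D');
    INSERT INTO links VALUES
        (1, 1), (1, 2),
        (2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6),
        (3, 1), (3, 2),
        (4, 1);
""")

# ratio = common / (tags of target + tags of candidate - common)
RATIO_SQL = """
    SELECT g.id, g.title,
           COUNT(t.tag_id) AS common,
           1.0 * COUNT(t.tag_id)
               / ((SELECT COUNT(*) FROM links WHERE goods_id = :item)
                  + COUNT(l.tag_id) - COUNT(t.tag_id)) AS ratio
    FROM goods g
    LEFT JOIN links l ON l.goods_id = g.id
    LEFT JOIN (SELECT tag_id FROM links WHERE goods_id = :item) t
           ON t.tag_id = l.tag_id
    WHERE g.id <> :item
    GROUP BY g.id, g.title
    ORDER BY ratio DESC, g.id
"""

rows = conn.execute(RATIO_SQL, {"item": 1}).fetchall()
ranked = [row[0] for row in rows]
print(ranked)  # [3, 4, 2]
```

With this data, goods 2 and 3 both share two tags with goods 1, so a plain count ranks them equally. Goods 2 carries four unrelated tags, so its ratio is 2/6 and it drops below goods 4, which shares its only tag (ratio 1/2).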

  • 2021-02-11 08:37

    This query will return all items that have the maximum number of tags in common:

    SET @item = 1;
    
    SELECT
      goods_id
    FROM
      links
    WHERE
      tag_id IN (SELECT tag_id FROM links WHERE goods_id=@item)
      AND goods_id!=@item
    GROUP BY
      goods_id
    HAVING
      COUNT(*) = (
        SELECT
          COUNT(*)
        FROM
          links
        WHERE
          tag_id IN (SELECT tag_id FROM links WHERE goods_id=@item)
          AND goods_id!=@item
        GROUP BY
          goods_id
        ORDER BY
          COUNT(*) DESC
        LIMIT 1
      )
    

    Please see fiddle here.

    Or this one returns all items, even those with no tags in common, ordered by the number of common tags in descending order:

    SELECT
      goods_id
    FROM
      links
    WHERE
      goods_id!=@item
    GROUP BY
      goods_id
    ORDER BY
      COUNT(CASE WHEN tag_id IN (SELECT tag_id FROM links WHERE goods_id=@item) THEN 1 END) DESC;
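
The query above reads only from `links`, so goods that have no tag rows at all never show up, and titles are not returned. As a rough sketch of one way around that (not the answerer's own query), the variant below starts from `goods` and left-joins the tags. It uses Python's `sqlite3` module with a named parameter `:item` in place of the MySQL user variable `@item`. The sample rows and the titles 'Item C' and 'Item D' are hypothetical.

```python
import sqlite3

# Hypothetical sample data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE goods (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE links (goods_id INTEGER, tag_id INTEGER);
    INSERT INTO goods VALUES
        (1, 'Toy car'), (2, 'Toy pony'), (3, 'Item C'), (4, 'Item D');
    INSERT INTO links VALUES
        (1, 10), (1, 20),
        (2, 10), (2, 20),
        (3, 20), (3, 30);
""")

# Start from goods so untagged goods (id 4 here) are kept with a count of 0.
RANK_SQL = """
    SELECT g.id, g.title, COUNT(t.tag_id) AS common
    FROM goods g
    LEFT JOIN links l ON l.goods_id = g.id
    LEFT JOIN (SELECT tag_id FROM links WHERE goods_id = :item) t
           ON t.tag_id = l.tag_id
    WHERE g.id <> :item
    GROUP BY g.id, g.title
    ORDER BY common DESC, g.id
"""

rows = conn.execute(RANK_SQL, {"item": 1}).fetchall()
for row in rows:
    print(row)
# (2, 'Toy pony', 2)
# (3, 'Item C', 1)
# (4, 'Item D', 0)
```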
    
  • 2021-02-11 08:40

    To list the goods that share at least one tag with goods id = 2 (including goods 2 itself):

    SELECT DISTINCT
      goods.*
    FROM
      goods
      LEFT JOIN links ON links.goods_id = goods.id
    WHERE links.tag_id IN (SELECT links.tag_id 
                           FROM links
                           WHERE links.goods_id = 2)
    

    To exclude goods id = 2 itself from the result:

    SELECT DISTINCT
      goods.*
    FROM
      goods
      LEFT JOIN links ON links.goods_id = goods.id
    WHERE links.goods_id != 2
      AND links.tag_id IN (SELECT links.tag_id
                           FROM links
                           WHERE links.goods_id = 2)
    

    You can see it at http://sqlfiddle.com/#!2/0fb60/38
