Mixing different categories results, ordered by score in MySQL

栀梦 2021-01-14 11:53

In my PHP application, I have a MySQL table of articles with the following columns:

article_id    articletext    category_id    score

3 answers
  • 2021-01-14 11:56

    Just for learning purposes: I tested this with 3 categories. I have no idea how this query would perform on a large record set.

    -- Number the rows of each category by descending score, then interleave
    -- the categories by sorting on that per-category row number.
    -- The large LIMIT keeps MySQL from discarding ORDER BY inside the UNION branches.
    SELECT * FROM (
      (SELECT @r := @r + 1 AS rownum, article_id, articletext, category_id, score
       FROM articles, (SELECT @r := 0) AS r
       WHERE category_id = 1
       ORDER BY score DESC LIMIT 100000000)
      UNION ALL
      (SELECT @r1 := @r1 + 1, article_id, articletext, category_id, score
       FROM articles, (SELECT @r1 := 0) AS r
       WHERE category_id = 2
       ORDER BY score DESC LIMIT 100000000)
      UNION ALL
      (SELECT @r2 := @r2 + 1, article_id, articletext, category_id, score
       FROM articles, (SELECT @r2 := 0) AS r
       WHERE category_id = 3
       ORDER BY score DESC LIMIT 100000000)
    ) AS t
    ORDER BY rownum, score DESC;
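
    The same interleaving can be sketched without user variables or one UNION branch per category. The example below is only an illustration, not part of the answer above: it assumes window functions are available (MySQL 8.0+), and it runs the query against an in-memory SQLite database from Python so it can be tried without a MySQL server. The sample rows and the function name `interleaved_ids` are hypothetical.

```python
import sqlite3

# Hypothetical sample data: (article_id, articletext, category_id, score).
ROWS = [
    (1, "a", 1, 90), (2, "b", 1, 70), (3, "c", 1, 50),
    (4, "d", 2, 80), (5, "e", 2, 60),
    (6, "f", 3, 95),
]

# ROW_NUMBER() ranks each article inside its own category; sorting by that
# rank first interleaves the categories (all rank-1 rows, then rank-2, ...).
INTERLEAVE_SQL = """
SELECT article_id, articletext, category_id, score
FROM (
    SELECT a.*,
           ROW_NUMBER() OVER (PARTITION BY category_id
                              ORDER BY score DESC, article_id) AS rn
    FROM articles AS a
) AS ranked
ORDER BY rn, score DESC, article_id
"""

def interleaved_ids(rows):
    """Load rows into a scratch table and return article_ids in mixed order."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE articles (article_id INTEGER PRIMARY KEY, "
        "articletext TEXT, category_id INTEGER, score INTEGER)"
    )
    conn.executemany("INSERT INTO articles VALUES (?, ?, ?, ?)", rows)
    result = [row[0] for row in conn.execute(INTERLEAVE_SQL)]
    conn.close()
    return result
```

    Unlike the UNION version, this form does not need to know the category ids in advance, since the partitioning handles any number of categories.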
    
  • 2021-01-14 12:11

    Your naive solution is exactly what I would do.

  • 2021-01-14 12:18

    Go get the top 20. If they don't satisfy the requirements, do an additional query to get the missing pieces. You should be able to come up with some balance between number of queries and number of rows each returns.

    If you fetched the top 100, it might satisfy the requirements 90% of the time and would be cheaper and faster than 10 separate queries.
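
    A minimal sketch of this "fetch the top rows, then fill the gaps" idea follows. The exact requirements are not spelled out here, so the sketch assumes one: every category in a required set must be represented. The helpers `make_demo_db` and `top_with_fill` and the sample rows are hypothetical, and SQLite stands in for MySQL so the example runs on its own.

```python
import sqlite3

def make_demo_db():
    """Build an in-memory table with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE articles (article_id INTEGER PRIMARY KEY, "
        "articletext TEXT, category_id INTEGER, score INTEGER)"
    )
    conn.executemany("INSERT INTO articles VALUES (?, ?, ?, ?)", [
        (1, "a", 1, 90), (2, "b", 1, 80), (3, "c", 1, 70),
        (4, "d", 2, 60), (5, "e", 3, 10),
    ])
    return conn

def top_with_fill(conn, limit, required_categories):
    """Fetch the top `limit` rows by score; if a required category is
    missing (assumed requirement), run one extra query for the gaps."""
    top = conn.execute(
        "SELECT article_id, category_id, score FROM articles "
        "ORDER BY score DESC, article_id LIMIT ?", (limit,)
    ).fetchall()
    missing = sorted(set(required_categories) - {cat for _, cat, _ in top})
    if not missing:
        return top  # the first query was enough
    marks = ", ".join("?" for _ in missing)
    # Best-scoring article of each missing category (ties return several rows).
    extra = conn.execute(
        "SELECT a.article_id, a.category_id, a.score FROM articles AS a "
        f"WHERE a.category_id IN ({marks}) "
        "AND a.score = (SELECT MAX(b.score) FROM articles AS b "
        "               WHERE b.category_id = a.category_id) "
        "ORDER BY a.score DESC, a.article_id", missing
    ).fetchall()
    # The extras are appended; trimming back to `limit` is left to the caller.
    return top + extra
```

    With these sample rows, `top_with_fill(make_demo_db(), 3, {1, 2, 3})` needs the second query, while a limit of 4 with required categories `{1, 2}` is satisfied by the first query alone.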

    If it was SQL Server I could help more...

    Actually, I have another idea. Run a process every 5 minutes that calculates the list and caches it in a table. Make DML against related tables invalidate the cache so it is not used until repopulated (perhaps an article was deleted). If the cache is invalid, you would fall back to calculating it on the fly... And could use that to repopulate the cache anyway.

    It might be possible to strategically update the cached list rather than recalculate it. But that could be a real challenge.

    This should help both with query speed and reducing load on your database. It shouldn't matter much if your article list is 5 minutes out of date. Heck, even 1 minute might work.
