MySQL select 10 random rows from 600K rows fast

粉色の甜心 2020-11-21 05:06

How can I best write a query that selects 10 rows randomly from a total of 600k?

26 answers
  •  栀梦 (OP)
    2020-11-21 05:43

    If you have just one read request

    Combine @redsio's answer with a temp table (600K rows is not that many):

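    -- A regular table rather than a TEMPORARY one: MySQL cannot refer to a
    -- TEMPORARY table more than once in one query, and the SELECT below does.
    DROP TABLE IF EXISTS tmp_randorder;
    CREATE TABLE tmp_randorder (id int(11) not null auto_increment primary key, data_id int(11));
    INSERT INTO tmp_randorder (data_id) select id from datatable;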
    

    Then use a variant of @redsio's answer:

    SELECT dt.*
    FROM (SELECT RAND() * (SELECT MAX(id) FROM tmp_randorder) AS id) AS rnd
    INNER JOIN tmp_randorder AS rndo ON rndo.id BETWEEN rnd.id - 10 AND rnd.id + 10
    INNER JOIN datatable AS dt ON dt.id = rndo.data_id
    ORDER BY ABS(rndo.id - rnd.id)
    LIMIT 1;
    

    If the table is big, you can sample while filling the helper table, keeping roughly 1% of the rows:

    INSERT INTO tmp_randorder (data_id) select id from datatable where rand() < 0.01;
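
The query above returns a single row, while the question asks for 10. One possible way to extend the gap-free helper table idea is to draw 10 distinct positions in the application and fetch them in one query. The following is only a sketch under assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere, and the function name pick_random_rows and the payload column are hypothetical.

```python
import random
import sqlite3

def pick_random_rows(conn, k):
    """Return k distinct random rows from datatable (hypothetical helper)."""
    # Rebuild the helper table; its id column runs 1..N without holes
    # even when datatable.id has gaps.
    conn.execute("DROP TABLE IF EXISTS tmp_randorder")
    conn.execute(
        "CREATE TABLE tmp_randorder (id INTEGER PRIMARY KEY, data_id INTEGER)"
    )
    conn.execute("INSERT INTO tmp_randorder (data_id) SELECT id FROM datatable")
    (max_id,) = conn.execute("SELECT MAX(id) FROM tmp_randorder").fetchone()
    # Draw k distinct positions; raises ValueError if the table has fewer than k rows.
    positions = random.sample(range(1, max_id + 1), k)
    marks = ",".join("?" * k)
    return conn.execute(
        "SELECT dt.* FROM tmp_randorder r "
        "JOIN datatable dt ON dt.id = r.data_id "
        f"WHERE r.id IN ({marks})",
        positions,
    ).fetchall()

# Demo data: ids 1..1000 with every multiple of 3 missing, to simulate gaps.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE datatable (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO datatable VALUES (?, ?)",
    [(i, f"row {i}") for i in range(1, 1001) if i % 3],
)
rows = pick_random_rows(conn, 10)
print(len(rows))  # 10
```

In a real setup the helper table would be built once and reused rather than rebuilt on every call; the rebuild is kept inside the function here only to make the sketch self-contained.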
    

    If you have many read-requests

    1. Version: You could keep the table tmp_randorder persistent and call it datatable_idlist. Recreate that table at regular intervals (daily, hourly), since it will also get holes as rows are deleted from datatable. If your table gets really big, you could refill the holes instead; this query finds them:

      select l.data_id as hole from datatable_idlist l left join datatable dt on dt.id = l.data_id where dt.id is null;
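
The answer does not spell out how holes would be refilled. One possible reading, sketched below, is to point each stale datatable_idlist entry at a datatable row that is not listed yet and to append any remaining new rows at the end. This is an assumption, not the author's stated procedure; it again uses Python's sqlite3 module in place of MySQL, and refill_holes is a hypothetical name.

```python
import sqlite3

def refill_holes(conn):
    """Reuse stale idlist entries for unlisted rows; return how many holes remain."""
    # Holes: idlist entries whose datatable row no longer exists.
    holes = [r[0] for r in conn.execute(
        "SELECT l.id FROM datatable_idlist l "
        "LEFT JOIN datatable dt ON dt.id = l.data_id "
        "WHERE dt.id IS NULL ORDER BY l.id")]
    # Unlisted: datatable rows that have no idlist entry yet.
    unlisted = [r[0] for r in conn.execute(
        "SELECT dt.id FROM datatable dt "
        "LEFT JOIN datatable_idlist l ON l.data_id = dt.id "
        "WHERE l.id IS NULL ORDER BY dt.id")]
    pairs = list(zip(unlisted, holes))
    conn.executemany("UPDATE datatable_idlist SET data_id = ? WHERE id = ?", pairs)
    # Rows left over after all holes are filled are appended at the end.
    conn.executemany(
        "INSERT INTO datatable_idlist (data_id) VALUES (?)",
        [(d,) for d in unlisted[len(pairs):]],
    )
    return len(holes) - len(pairs)

# Demo: list ids 1..10, then delete two rows and add three new ones.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE datatable (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO datatable VALUES (?)", [(i,) for i in range(1, 11)])
conn.execute("CREATE TABLE datatable_idlist (id INTEGER PRIMARY KEY, data_id INTEGER)")
conn.execute("INSERT INTO datatable_idlist (data_id) SELECT id FROM datatable")
conn.execute("DELETE FROM datatable WHERE id IN (3, 7)")
conn.executemany("INSERT INTO datatable VALUES (?)", [(11,), (12,), (13,)])

remaining = refill_holes(conn)
print(remaining)  # 0
```

If more rows were deleted than added, some holes stay open and the function reports how many; at that point a full rebuild of datatable_idlist is the simpler fix.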

    2. Version: Give your data set a random_sortorder column, either directly in datatable or in a persistent extra table datatable_sortorder. Index that column. Generate a random value in your application (I'll call it $rand) and select the row whose random_sortorder is closest to it:

      select l.*
      from datatable l
      order by abs(random_sortorder - $rand)
      limit 1;
      

    This solution treats the 'edge rows' with the highest and the lowest random_sortorder unequally, so reassign the random_sortorder values at regular intervals (for example, once a day).
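
Ordering by an abs(...) expression generally cannot use the index on random_sortorder. A common variant, which is not the query from this answer, is a range lookup that takes the first row at or above the random value and wraps around to the smallest value when nothing is found. The sketch below assumes random_sortorder holds values in [0, 1), uses Python's sqlite3 module in place of MySQL, and the name pick_by_sortorder is hypothetical.

```python
import random
import sqlite3

def pick_by_sortorder(conn, rand):
    """Return the (id, random_sortorder) row at or just above rand, wrapping around."""
    row = conn.execute(
        "SELECT id, random_sortorder FROM datatable "
        "WHERE random_sortorder >= ? ORDER BY random_sortorder LIMIT 1",
        (rand,),
    ).fetchone()
    if row is None:
        # rand is above every stored value: wrap around to the smallest one.
        row = conn.execute(
            "SELECT id, random_sortorder FROM datatable "
            "ORDER BY random_sortorder LIMIT 1"
        ).fetchone()
    return row

# Demo with fixed sort-order values so the lookups are easy to follow.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE datatable (id INTEGER PRIMARY KEY, random_sortorder REAL)")
conn.execute("CREATE INDEX idx_sortorder ON datatable (random_sortorder)")
conn.executemany(
    "INSERT INTO datatable VALUES (?, ?)", [(1, 0.1), (2, 0.5), (3, 0.9)]
)

print(pick_by_sortorder(conn, 0.3))   # (2, 0.5)
print(pick_by_sortorder(conn, 0.95))  # (1, 0.1), wrapped around
random_row = pick_by_sortorder(conn, random.random())
```

With this variant each row's chance of being picked still depends on the gap to the preceding value, so periodically reassigning random_sortorder remains advisable.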
