Best way to select random rows PostgreSQL

借酒劲吻你 2020-11-22 06:57

I want a random selection of rows in PostgreSQL. I tried this:

select * from table where random() < 0.01;

But others recommend this:

select * from table order by random() limit 1000;

12 Answers
  •  无人及你
    2020-11-22 07:46

    The one with the ORDER BY is going to be the slower one.

    select * from table where random() < 0.01; goes record by record and randomly decides whether to keep each one. This is O(N), because it only needs to check each record once.

    select * from table order by random() limit 1000; is going to sort the entire table, then pick the first 1000 rows. Aside from any voodoo magic behind the scenes, the ORDER BY is O(N * log N).

    The downside to the random() < 0.01 one is that you'll get a variable number of output records.
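
To make the trade-off concrete, here is a rough Python model of the two queries. It is not PostgreSQL itself, and the function names `bernoulli_filter` and `sort_by_random_key` are invented for this sketch. The first mimics `where random() < 0.01` with a single pass; the second mimics `order by random() limit 1000` by sorting every row on a random key.

```python
import random

def bernoulli_filter(rows, p, rng):
    """Model of `WHERE random() < p`: one pass, each row kept with probability p."""
    return [row for row in rows if rng.random() < p]

def sort_by_random_key(rows, k, rng):
    """Model of `ORDER BY random() LIMIT k`: sort all rows by a random key, take k."""
    keyed = [(rng.random(), row) for row in rows]
    keyed.sort(key=lambda pair: pair[0])   # the O(N log N) step
    return [row for _, row in keyed[:k]]

rng = random.Random(42)        # fixed seed only so the demo is repeatable
rows = list(range(100_000))    # stand-in for the table

filter_sizes = [len(bernoulli_filter(rows, 0.01, rng)) for _ in range(5)]
sorted_sizes = [len(sort_by_random_key(rows, 1000, rng)) for _ in range(5)]

print("filter sizes:", filter_sizes)   # close to 1000, but not fixed
print("sorted sizes:", sorted_sizes)   # exactly 1000 every time
```

The filter returns roughly p * N rows per call, so if an exact count matters you would have to oversample and trim, or use the sort-based query.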


    Note that there is a better way to shuffle a set of data than sorting by random: the Fisher-Yates shuffle, which runs in O(N). Implementing the shuffle in SQL sounds like quite the challenge, though.
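
Outside of SQL the shuffle itself is short. Below is a minimal Python sketch, assuming the rows fit in memory; the helper name `fisher_yates_sample` is hypothetical. Stopping after k swaps yields a uniform sample of k items without shuffling the rest of the list.

```python
import random

def fisher_yates_sample(items, k, rng):
    """Pick k items uniformly at random using a partial Fisher-Yates shuffle."""
    pool = list(items)            # O(N) copy so the caller's data is untouched
    n = len(pool)
    if not 0 <= k <= n:
        raise ValueError("k must be between 0 and len(items)")
    for i in range(k):
        j = rng.randrange(i, n)   # random index in [i, n)
        pool[i], pool[j] = pool[j], pool[i]
    return pool[:k]

rng = random.Random(7)            # fixed seed only so the demo is repeatable
sample = fisher_yates_sample(range(100_000), 1000, rng)
print(len(sample), len(set(sample)))   # prints: 1000 1000
```

Each position is swapped once with a uniformly chosen later position, so no sort is needed and the result always contains exactly k distinct rows.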
