I want a random selection of rows in PostgreSQL, I tried this:
select * from table where random() < 0.01;
But some other recommend this:
The one with the ORDER BY is going to be the slower one.
select * from table where random() < 0.01; goes record by record, and decides to randomly filter it or not. This is going to be O(N) because it only needs to check each record once.
select * from table order by random() limit 1000; is going to sort the entire table, then pick the first 1000. Aside from any voodoo magic behind the scenes, the order by is O(N * log N).
The downside to the random() < 0.01 one is that you'll get a variable number of output records.
Note, there is a better way to shuffling a set of data than sorting by random: The Fisher-Yates Shuffle, which runs in O(N). Implementing the shuffle in SQL sounds like quite the challenge, though.