How can I request a random row (or as close to truly random as is possible) in pure SQL?
Most of the solutions here aim to avoid sorting, but they still need to make a sequential scan over a table.
There is also a way to avoid the sequential scan by switching to an index scan. If you know the index value of your random row you can get the result almost instantly. The problem is how to guess an index value.
The following solution works on PostgreSQL 8.4:
explain analyze select * from cms_refs where rec_id in
(select (random()*(select last_value from cms_refs_rec_id_seq))::bigint
from generate_series(1,10))
limit 1;
In the above solution you guess 10 different random index values from the range 0 .. [last value of id].
The number 10 is arbitrary - you may use 100 or 1000 as it (amazingly) doesn't have a big impact on the response time.
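For anyone who wants to play with the guess-and-look-up idea outside PostgreSQL, here is a minimal sketch using Python's built-in sqlite3 module. It is not the query above: max(rec_id) stands in for the sequence's last_value, the guesses are drawn from 1 .. max and tried one at a time, and the title column, the demo data and the function name guess_random_row are made up for illustration.

```python
import random
import sqlite3


def guess_random_row(conn, guesses=10):
    """Try a few randomly guessed ids via the primary key; return None on a miss."""
    # Assumption: max(rec_id) plays the role of the sequence's last_value.
    max_id = conn.execute("SELECT max(rec_id) FROM cms_refs").fetchone()[0]
    if max_id is None:  # empty table, nothing to guess
        return None
    for _ in range(guesses):
        guess = random.randint(1, max_id)
        # A lookup on the primary key is an index search, not a table scan.
        row = conn.execute(
            "SELECT * FROM cms_refs WHERE rec_id = ?", (guess,)
        ).fetchone()
        if row is not None:
            return row
    return None  # every guess landed in a gap


# Demo data: an in-memory table with dense ids 1..1000 (hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cms_refs (rec_id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO cms_refs VALUES (?, ?)",
    [(i, f"row {i}") for i in range(1, 1001)],
)
print(guess_random_row(conn))
```

With dense ids every guess hits, so the function always returns a row here.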
There is also one problem - if you have sparse ids you might miss. The solution is to have a backup plan :) In this case a plain old order by random() query. When combined it looks like this:
explain analyze select * from cms_refs where rec_id in
(select (random()*(select last_value from cms_refs_rec_id_seq))::bigint
from generate_series(1,10))
union all (select * from cms_refs order by random() limit 1)
limit 1;
Note the union ALL clause. Because of the outer limit 1, if the first part returns any data the second one is NEVER executed!
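How often the backup plan actually runs depends on how sparse the ids are. A rough estimate, assuming the existing ids are spread evenly over the range and each guess is independent: if a fraction "density" of the ids exist, all k guesses miss with probability (1 - density)^k. The small Python sketch below just evaluates that formula; the function name and the density values are made up for illustration.

```python
def miss_probability(density, guesses):
    """Chance that none of `guesses` independent random ids hits an existing row.

    `density` is the fraction of ids in the guessed range that actually exist
    (an assumed, evenly spread value, not something measured from a real table).
    """
    if not 0.0 <= density <= 1.0:
        raise ValueError("density must be between 0 and 1")
    if guesses < 0:
        raise ValueError("guesses must not be negative")
    return (1.0 - density) ** guesses


# Print the estimate for a few densities and guess counts.
for density in (0.5, 0.1, 0.01):
    for guesses in (10, 100, 1000):
        p = miss_probability(density, guesses)
        print(f"density={density:<5} guesses={guesses:<5} miss={p:.6f}")
```

At 50% density, ten guesses all miss only about once in a thousand tries. At 10% density they miss roughly 35% of the time, which is where raising the number of guesses to 100 starts to pay off.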