How to request a random row in SQL?

孤城傲影 2020-11-21 06:45

How can I request a random row (or as close to truly random as is possible) in pure SQL?

29 answers
  • 2020-11-21 07:31

    SQL's random function can help here: order the rows by it. To return just one row, add a LIMIT clause at the end.

    SELECT column FROM table
    ORDER BY RAND()
    LIMIT 1
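
    The function name and the row-limiting syntax differ between databases. As a rough sketch (column and table are placeholders, as in the query above), the same idea looks like this in a few common engines:

    ```sql
    -- MySQL
    SELECT column FROM table ORDER BY RAND() LIMIT 1;

    -- PostgreSQL and SQLite
    SELECT column FROM table ORDER BY RANDOM() LIMIT 1;

    -- SQL Server
    SELECT TOP 1 column FROM table ORDER BY NEWID();
    ```

    All of these compute a random value for every row and sort on it, so the cost grows with the size of the table.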
    
  • 2020-11-21 07:32
    ORDER BY NEWID()
    

    takes 7.4 milliseconds

    WHERE num_value >= RAND() * (SELECT MAX(num_value) FROM table)
    

    takes 0.0065 milliseconds!

    I will definitely go with the latter method.

  • 2020-11-21 07:32

    For SQL Server 2005 and 2008, if we want a random sample of individual rows (from Books Online):

    SELECT * FROM Sales.SalesOrderDetail
    WHERE 0.01 >= CAST(CHECKSUM(NEWID(), SalesOrderID) & 0x7fffffff AS float)
    / CAST (0x7fffffff AS int)
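
    The query above returns roughly 1 percent of the rows rather than a single row. One way to narrow it down to one row, offered here as a sketch and not as part of the Books Online example, is to pick randomly among the sampled rows with TOP 1 and ORDER BY NEWID(). Note that it returns no row at all whenever the filter happens to keep nothing, which is likely on small tables.

    ```sql
    -- CHECKSUM(NEWID(), SalesOrderID) gives a pseudo-random int per row;
    -- & 0x7fffffff clears the sign bit, and the division scales it to 0..1.
    SELECT TOP 1 *
    FROM Sales.SalesOrderDetail
    WHERE 0.01 >= CAST(CHECKSUM(NEWID(), SalesOrderID) & 0x7fffffff AS float)
                  / CAST(0x7fffffff AS int)
    ORDER BY NEWID();  -- only the ~1% of rows that pass the filter are sorted
    ```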
    
  • 2020-11-21 07:33

    Solutions like Jeremies:

    SELECT * FROM table ORDER BY RAND() LIMIT 1
    

    work, but they require a sequential scan of the whole table (a random value has to be calculated for every row so that the smallest one can be determined), which can be quite slow even for medium-sized tables. My recommendation would be to use some kind of indexed numeric column (many tables have one as their primary key), and then write something like:

    SELECT * FROM table
    WHERE num_value >= RAND() * (SELECT MAX(num_value) FROM table)
    ORDER BY num_value
    LIMIT 1
    

    This runs in time logarithmic in the table size if num_value is indexed. One caveat: it assumes that num_value is uniformly distributed over the range 0..MAX(num_value). If your dataset deviates strongly from this assumption, you will get skewed results (some rows will appear more often than others).
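
    To make the skew caveat concrete, here is a small worked illustration in MySQL syntax. The demo table and its values are hypothetical, and the threshold is stored in a user variable so that it is computed once per query; MySQL documents that RAND() in a WHERE clause is evaluated for every row, which would change the picture.

    ```sql
    -- Hypothetical table with a large gap in the indexed column.
    CREATE TABLE demo (num_value INT PRIMARY KEY);
    INSERT INTO demo (num_value) VALUES (1), (2), (3), (1000);

    -- The threshold @t is a single random value in [0, 1000).
    SET @t = RAND() * (SELECT MAX(num_value) FROM demo);

    -- The query returns the first row with num_value >= @t:
    --   @t in [0, 1]     -> row 1     (about 0.1% of draws)
    --   @t in (1, 2]     -> row 2     (about 0.1% of draws)
    --   @t in (2, 3]     -> row 3     (about 0.1% of draws)
    --   @t in (3, 1000)  -> row 1000  (about 99.7% of draws)
    SELECT * FROM demo
    WHERE num_value >= @t
    ORDER BY num_value
    LIMIT 1;
    ```

    Each row is chosen with probability proportional to the gap below it, so the method is only fair when the values are evenly spaced.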

  • 2020-11-21 07:33

    Be careful, because TABLESAMPLE doesn't actually return a random sample of rows. It directs your query to look at a random sample of the 8 KB pages that make up your table, and your query is then executed against the data contained in those pages. Because of how data may be grouped on these pages (insertion order, etc.), the result may not be a truly random sample.

    See: http://www.mssqltips.com/tip.asp?tip=1308

    This MSDN page for TABLESAMPLE includes an example of how to generate an actually random sample of data.

    http://msdn.microsoft.com/en-us/library/ms189108.aspx
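
    For reference, a minimal sketch of the page-level sampling being warned about, in SQL Server syntax. It assumes the AdventureWorks sample table Sales.SalesOrderDetail; the percentage and the seed are arbitrary.

    ```sql
    -- Page-level sample: roughly 1 percent of the table's pages, not of its rows.
    SELECT * FROM Sales.SalesOrderDetail TABLESAMPLE (1 PERCENT);

    -- REPEATABLE with a fixed seed returns the same sample while the data is unchanged.
    SELECT * FROM Sales.SalesOrderDetail TABLESAMPLE (1 PERCENT) REPEATABLE (42);
    ```

    Because whole pages are either included or skipped, rows stored together tend to appear together in the result, and the number of rows returned varies from run to run.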
