Select a random sample of results from a query result

限于喜欢 提交于 2019-11-27 06:20:32
SELECT  *
FROM    (
        SELECT  *
        FROM    mytable
        ORDER BY
                dbms_random.value
        )
WHERE rownum <= 1000
grokster

The SAMPLE clause will give you a random sample percentage of all rows in a table.

For example, here we obtain 25% of the rows:

SELECT * FROM emp SAMPLE(25)

The following SQL (using one of the analytical functions) will give you a random sample of a specific number of each occurrence of a particular value (similar to a GROUP BY) in a table.

Here we sample 10 of each:

SELECT * FROM (
SELECT job, sal, ROW_NUMBER()
OVER (
PARTITION BY job ORDER BY job
) SampleCount FROM emp
)
WHERE SampleCount <= 10

This in not a perfect answer but will get much better performance.

SELECT  *
FROM    (
    SELECT  *
    FROM    mytable sample (0.01)
    ORDER BY
            dbms_random.value
    )
WHERE rownum <= 1000

Sample will give you a percent of your actual table, if you really wanted a 1000 rows you would need to adjust that number. More often I just need an arbitrary number of rows anyway so I don't limit my results. On my database with 2 million rows I get 2 seconds vs 60 seconds.

select * from mytable sample (0.01)
SELECT * FROM TABLE_NAME SAMPLE(1)

Will give you olny an approximate 1% share rather than exactly 1/100 of the number of observations. The likely reason is than Oracle generates a random flag for each observation as to whether include in in the sample that it generates. The argument 1 (1%) in such a generation process takes the role of probability of each observation's being selected into the sample.

If this is true, the actual distribution of sample sizes will be binomial.

Ankit Bajpai

Sample function is used for sample data in ORACLE. So you can try like this:-

SELECT * FROM TABLE_NAME SAMPLE(50);

Here 50 is the percentage of data contained by the table. So if you want 1000 rows from 100000. You can execute a query like:

SELECT * FROM TABLE_NAME SAMPLE(1);

Hope this can help you.

I know this has already been answered, but seeing so many visits here I'd like to add one version that uses the SAMPLE clause but still allows to filter the rows first:

with cte1 as (
    select *
    from t_your_table
    where your_column = 'ABC'
)
select * from cte1 sample (5)

Note however that the base select needs a ROWID column, which means it may not work for some views for example.

BASMA SHAWKY

Something like this should work:

SELECT * 
FROM table_name
WHERE primary_key IN (SELECT primary_key 
                      FROM
                      (
                        SELECT primary_key, SYS.DBMS_RANDOM.RANDOM 
                        FROM table_name 
                        ORDER BY 2
                      )
                      WHERE rownum <= 10 );
ChrisNZak

We were given and assignment to select only two records from the list of agents..i.e 2 random records for each agent over the span of a week etc.... and below is what we got and it works

with summary as (
Select Dbms_Random.Random As Ran_Number,
             colmn1,
             colm2,
             colm3
             Row_Number() Over(Partition By col2 Order By Dbms_Random.Random) As Rank
    From table1, table2
 Where Table1.Id = Table2.Id
 Order By Dbms_Random.Random Asc)
Select tab1.col2,
             tab1.col4,
             tab1.col5,
    From Summary s
 Where s.Rank <= 2;
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!