I have table which is having about 1000 rows.I have to update a column(\"X\") in the table to \'Y\' for n ramdom rows. For this i can have following query
update
I would use the ROWID:
UPDATE xyz SET x='Y' WHERE rowid IN (
SELECT r FROM (
SELECT ROWID r FROM xyz ORDER BY dbms_random.value
) RNDM WHERE rownum < n+1
)
The actual reason I would use ROWID isn't for efficiency though (it will still do a full table scan) - your SQL may not update the number of rows you want if column m
isn't unique.
With only 1000 rows, you shouldn't really be worried about efficiency (maybe with a hundred million rows). Without any index on this table, you're stuck doing a full table scan to select random records.
[EDIT:] "But what if there are 100,000 rows"
Well, that's still 3 orders of magnitude less than 100 million.
I ran the following:
create table xyz as select * from all_objects;
[created about 50,000 rows on my system - non-indexed, just like your table]
UPDATE xyz SET owner='Y' WHERE rowid IN (
SELECT r FROM (
SELECT ROWID r FROM xyz ORDER BY dbms_random.value
) RNDM WHERE rownum < 10000
);
commit;
This took approximately 1.5 seconds. Maybe it was 1 second, maybe up to 3 seconds (didn't formally time it, it just took about enough time to blink).
You can improve performance by replacing the full table scan with a sample.
The first problem you run into is that you can't use SAMPLE in a DML subquery, ORA-30560: SAMPLE clause not allowed
. But logically this is what is needed:
UPDATE xyz SET x='Y' WHERE rowid IN (
SELECT r FROM (
SELECT ROWID r FROM xyz sample(0.15) ORDER BY dbms_random.value
) RNDM WHERE rownum < 100/*n*/+1
);
You can get around this by using a collection to store the rowids, and then update the rows using the rowid collection. Normally breaking a query into separate parts and gluing them together with PL/SQL leads to horrible performance. But in this case you can still save a lot of time by significantly reducing the amount of data read.
declare
type rowid_nt is table of rowid;
rowids rowid_nt;
begin
--Get the rowids
SELECT r bulk collect into rowids
FROM (
SELECT ROWID r
FROM xyz sample(0.15)
ORDER BY dbms_random.value
) RNDM WHERE rownum < 100/*n*/+1;
--update the table
forall i in 1 .. rowids.count
update xyz set x = 'Y'
where rowid = rowids(i);
end;
/
I ran a simple test with 100,000 rows (on a table with only two columns), and N = 100. The original version took 0.85 seconds, @Gerrat's answer took 0.7 seconds, and the PL/SQL version took 0.015 seconds.
But that's only one scenario, I don't have enough information to say my answer will always be better. As N increases the sampling advantage is lost, and the writing will be more significant than the reading. If you have a very small amount of data, the PL/SQL context switching overhead in my answer may make it slower than @Gerrat's solution.
For performance issues, the size of the table in bytes is usually much more important than the size in rows. 1000 rows that use a terabyte of space is much larger than 100 million rows that only use a gigabyte.
Here are some problems to consider with my answer:
N
change, you'll need to use dynamic SQL to change the percent.The following solution works just fine. It's performant and seems to be similar to sample()
:
create table t1 as
select level id, cast ('item'||level as varchar2(32)) item
from dual connect by level<=100000;
Table T1 created.
update t1 set item='*'||item
where exists (
select rnd from (
select dbms_random.value() rnd
from t1
) t2 where t2.rowid = t1.rowid and rnd < 0.15
);
14,858 rows updated.
Elapsed: 00:00:00.717
Consider that alias rnd
must be included in select clause. Otherwise changes the omptimizer the filter predicat from RND<0.1
to DBMS_RANDOM.VALUE()<0.1
. In that case dbms_random.value
will be executed only once.
As mentioned in answer @JonHeller, the best solution remains the pl/sql code block because it allows to avoid full table scan. Here is my suggestion:
create or replace type rowidListType is table of varchar(18);
/
create or replace procedure updateRandomly (prefix varchar2 := '*') is
rowidList rowidListType;
begin
select rowidtochar (rowid) bulk collect into rowidList
from t1 sample(15)
;
update t1 set item=prefix||item
where exists (
select 1 from table (rowidList) t2
where chartorowid(t2.column_value) = t1.rowid
);
dbms_output.put_line ('updated '||sql%rowcount||' rows.');
end;
/
begin updateRandomly; end;
/
Elapsed: 00:00:00.293
updated 14892 rows.