Question
Is there a way I can improve the performance of this kind of SQL query:
INSERT
INTO ...
WHERE NOT EXISTS(Validation...)
The problem is that when the table holds a lot of data (millions of rows), the WHERE NOT EXISTS
clause executes very slowly. I have to do this check because I can't insert duplicate data.
I use SQL Server 2005.
Thanks.
Answer 1:
Make sure you are searching on indexed columns, with no manipulation of the data within those columns (such as SUBSTRING).
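A minimal sketch of that advice, assuming a hypothetical target table dbo.Customers, a hypothetical staging table dbo.NewCustomers, and a lookup column CustomerCode (none of these names appear in the question):

```sql
-- Hypothetical tables and columns; names are illustrative only.
-- Index the column that the NOT EXISTS probe looks up.
CREATE INDEX IX_Customers_CustomerCode
    ON dbo.Customers (CustomerCode);

-- Compare the bare column so the optimizer can seek on the index.
INSERT INTO dbo.Customers (CustomerCode, CustomerName)
SELECT n.CustomerCode, n.CustomerName
FROM dbo.NewCustomers AS n
WHERE NOT EXISTS (
    SELECT 1
    FROM dbo.Customers AS c
    WHERE c.CustomerCode = n.CustomerCode
);

-- By contrast, a predicate such as
--     WHERE SUBSTRING(c.CustomerCode, 1, 5) = n.CustomerCode
-- wraps the indexed column in a function, which generally
-- prevents an index seek and forces a scan.
```

Comparing the actual execution plans of the two forms is the simplest way to confirm whether the index is being used.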
Answer 2:
Off the top of my head, you could try something like:
TRUNCATE TABLE temptable
INSERT INTO temptable ...
INSERT INTO temptable ...
...
-- [key] is bracketed because KEY is a reserved word in T-SQL
INSERT INTO realtable
SELECT temptable.* FROM temptable
LEFT JOIN realtable ON realtable.[key] = temptable.[key]
WHERE realtable.[key] IS NULL
Answer 3:
Try replacing the NOT EXISTS with a LEFT OUTER JOIN; it sometimes performs better on large data sets.
Answer 4:
Pay attention to the other answer regarding indexing. NOT EXISTS is typically quite fast if you have good indexes.
But I have had performance issues with statements like you describe. One method I've used to get around that is to use a temp table for the candidate values, perform a DELETE FROM ... WHERE EXISTS (...), and then blindly INSERT the remainder. Inside a transaction, of course, to avoid race conditions. Splitting up the queries sometimes allows the optimizer to do its job without getting confused.
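The delete-then-insert approach described above might look roughly like this. The temp table #Candidates, the target dbo.Customers, and the column names are hypothetical, and the step that loads the candidates is left out:

```sql
-- Hypothetical names: #Candidates (staging), dbo.Customers (target).
CREATE TABLE #Candidates (
    CustomerCode varchar(20)  NOT NULL PRIMARY KEY,
    CustomerName varchar(100) NOT NULL
);

-- Load the candidate rows here (source omitted):
-- INSERT INTO #Candidates (CustomerCode, CustomerName) ...

BEGIN TRANSACTION;

-- Remove candidates that already exist in the target.
DELETE c
FROM #Candidates AS c
WHERE EXISTS (
    SELECT 1
    FROM dbo.Customers AS t
    WHERE t.CustomerCode = c.CustomerCode
);

-- Everything left is new, so insert without further checks.
INSERT INTO dbo.Customers (CustomerCode, CustomerName)
SELECT CustomerCode, CustomerName
FROM #Candidates;

COMMIT TRANSACTION;

DROP TABLE #Candidates;
```

Whether a plain transaction is enough to rule out concurrent inserts depends on the isolation level and locking in use; this sketch does not address that.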
Answer 5:
If you can at all reduce your problem space, then you'll gain heaps of performance. Are you absolutely sure that every one of those rows in that table needs to be checked?
The other thing you might want to try, before your insert, is:
DELETE InsertTable FROM InsertTable INNER JOIN ExistingTable ON <Validation criteria>
However, your mileage may vary.
Answer 6:
insert into customers
select *
from newcustomers
where customerid not in (select customerid
                         from customers)
...may be more efficient. As others have said, make sure you've got indexes on any lookup fields.
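One thing to check before switching to NOT IN: if the subquery can return a NULL, the predicate never evaluates to true and nothing is inserted. The sketch below uses hypothetical table variables to show the difference from NOT EXISTS:

```sql
-- Hypothetical table variables standing in for customers / newcustomers.
DECLARE @existing TABLE (customerid int NULL);
DECLARE @incoming TABLE (customerid int NOT NULL);

INSERT INTO @existing (customerid) VALUES (1);
INSERT INTO @existing (customerid) VALUES (NULL);
INSERT INTO @incoming (customerid) VALUES (1);
INSERT INTO @incoming (customerid) VALUES (2);

-- Returns no rows: 2 NOT IN (1, NULL) evaluates to UNKNOWN.
SELECT customerid
FROM @incoming
WHERE customerid NOT IN (SELECT customerid FROM @existing);

-- Returns the row with customerid = 2: NOT EXISTS is unaffected by the NULL.
SELECT i.customerid
FROM @incoming AS i
WHERE NOT EXISTS (
    SELECT 1 FROM @existing AS e WHERE e.customerid = i.customerid
);
```

If customerid is declared NOT NULL in the real table, the two forms return the same rows.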
Answer 7:
OUTER APPLY tends to work for me.
Instead of:
from t1
where not exists (select 1 from t2 where t1.something = t2.something)
I'll use:
from t1
outer apply (
    select top 1 1 as found from t2 where t1.something = t2.something
) t2f
where t2f.found is null
Source: https://stackoverflow.com/questions/554509/sql-improve-not-exists-query-performance