问题
I set out to rewrite a complex SQL statement with joins and sub-statements and obtained a more simple looking statement. I tested it by running both on the same data set and getting the same result set. In general, how can I (conceptually) prove that the 2 statements are the same in any given data set?
回答1:
I would suggest studying relational algebra (as pointed out by Mchl). It is the most essential concept you need if you want to get serious about optimizing queries and designing databases properly.
However I will suggest an ugly brute force approach that helps you to ensure correct results if you have sufficient data to test with: create views of both versions (for making the comparisons easier to manage) and compare the results. I mean something like
create view original as select xxx yyy zzz;
create view new as select xxx yyy zzz;
-- If the amount differs something is quite obviously very wrong
select count(*) from original;
select count(*) from new;
-- What is missing from the new one?
select *
from original o
where not exists (
select *
from new n
where o.col1=n.col2 and o.col2=n.col2 --and so on
);
-- Did something extra appear?
select *
from new o
where not exists (
select *
from old n
where o.col1=n.col2 and o.col2=n.col2 --and so on
)
Also as pointed out by others in comments you might feed both the queries to the optimizers of the product you are working with. Most of the time you get something that can be parsed with humans, complete drawings of the execution paths with the subqueries' impact on the performance etc. It is most often done with something like
explain plan for
select *
from ...
where ...
etc
来源:https://stackoverflow.com/questions/4868364/how-to-prove-2-sql-statements-are-equivalent