Question
This question comes after a friend's comment. He said that when a query has a lot of subqueries, it's a sign that the database has design flaws, and that subqueries must be avoided. He also said that many books suggest the same.
I agree in part, but I think there are queries whose logic is complex enough that a lot of subqueries are needed; the only ways to avoid the subqueries would be a materialized view of the query or a lot of data redundancy.
So, what is the truth about subqueries? Must they always be avoided? Are there no problems with them? Do they indicate database design flaws? Is it possible to have a database design that allows complex queries without having data redundancy?
Answer 1:
No, the presence of subqueries does not necessarily mean a database schema is poorly designed.
Correlated subqueries (i.e. those in which an inner condition refers to a column of the outer query) should be used sparingly.
Other than that, subqueries are often a useful and natural way of solving a problem, although I tend to use joins rather than subqueries where possible.
Many query optimisers will transform certain types of subqueries into joins.
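As a rough illustration of the kind of equivalence an optimiser can exploit, the sketch below runs an `IN` subquery and the corresponding join over the same data, using Python's built-in `sqlite3` module. The `customers` and `orders` tables and their columns are made up for the example; whether a given DBMS actually performs this rewrite depends on the product and the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER NOT NULL);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO orders VALUES (10, 1), (11, 1), (12, 3);
""")

# Subquery form: customers that have at least one order.
subquery_rows = conn.execute("""
    SELECT name FROM customers
    WHERE id IN (SELECT customer_id FROM orders)
    ORDER BY name
""").fetchall()

# Join form: DISTINCT is needed because Ann has two orders
# and would otherwise appear twice.
join_rows = conn.execute("""
    SELECT DISTINCT c.name
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    ORDER BY c.name
""").fetchall()

print(subquery_rows == join_rows)
```

Note that the join form needs `DISTINCT` to match the subquery form, which is one reason the two are not always interchangeable without care.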
Answer 2:
Your friend's logic is flawed.
Although SQL and its various implementations are based, somewhat loosely, on the relational model, the language lacks keywords or shorthands for many basic relational operators, notably semi join, semi difference (a.k.a. anti join) and divide. I often write semi join and semi difference in SQL code using subqueries; as for divide, I'm not sure it is possible to perform in a single query without using subqueries!
So my use of subqueries is determined by the questionable design of the SQL language, rather than the design of the database I'm using.
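For illustration, here is one common way to spell those three operators with subqueries, run through Python's built-in `sqlite3` module. This is a sketch under assumptions: the `suppliers`, `parts` and `supplies` tables are hypothetical, and the `EXISTS` / `NOT EXISTS` formulations are just one of several possible spellings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE suppliers (sno TEXT PRIMARY KEY);
    CREATE TABLE parts (pno TEXT PRIMARY KEY);
    CREATE TABLE supplies (
        sno TEXT NOT NULL,
        pno TEXT NOT NULL,
        PRIMARY KEY (sno, pno)
    );
    INSERT INTO suppliers VALUES ('S1'), ('S2'), ('S3');
    INSERT INTO parts VALUES ('P1'), ('P2');
    INSERT INTO supplies VALUES ('S1', 'P1'), ('S1', 'P2'), ('S2', 'P1');
""")


def column(sql):
    """Run a single-column query and return its values as a list."""
    return [row[0] for row in conn.execute(sql)]


# Semi join: suppliers that supply at least one part.
semi_join = column("""
    SELECT s.sno FROM suppliers AS s
    WHERE EXISTS (SELECT 1 FROM supplies AS sp WHERE sp.sno = s.sno)
    ORDER BY s.sno
""")

# Semi difference (anti join): suppliers that supply nothing.
anti_join = column("""
    SELECT s.sno FROM suppliers AS s
    WHERE NOT EXISTS (SELECT 1 FROM supplies AS sp WHERE sp.sno = s.sno)
    ORDER BY s.sno
""")

# Divide: suppliers for which there is no part that they do not supply.
divide = column("""
    SELECT s.sno FROM suppliers AS s
    WHERE NOT EXISTS (
        SELECT 1 FROM parts AS p
        WHERE NOT EXISTS (
            SELECT 1 FROM supplies AS sp
            WHERE sp.sno = s.sno AND sp.pno = p.pno
        )
    )
    ORDER BY s.sno
""")

print(semi_join, anti_join, divide)
```

None of these queries says anything about how well the three tables are designed; the subqueries are there only because SQL has no direct keyword for the operator.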
P.S. I wonder if you and/or your friend are using the term "database" interchangeably to mean both the database (the collection of data) and the DBMS (the software system managing the data). If so, and in context you mean the DBMS, then the statement "when a query has a lot of subqueries, it's a 'smell' that the DBMS has design flaws" may indeed be true.
Answer 3:
I tend to agree with your friend: if you have a frequent need for subqueries, it is a sign that the database hasn't been organized in a way that makes it easy to query. It may be perfect with respect to normalization rules but inconvenient with respect to common questions about the data. If so, the solution is often to create a view or intermediate table that brings the data together in a more searchable way.
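A minimal sketch of that idea, using Python's built-in `sqlite3` module: a view packages a per-customer aggregate once, so later queries can read it like a table instead of repeating the aggregation as a subquery. The `orders` table, the `customer_totals` view and the threshold of 100 are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer TEXT NOT NULL,
        amount INTEGER NOT NULL
    );
    INSERT INTO orders VALUES (1, 'Ann', 40), (2, 'Ann', 70), (3, 'Bob', 20);

    -- The view hides the aggregation that would otherwise be repeated
    -- as a subquery in every query that needs per-customer totals.
    CREATE VIEW customer_totals AS
        SELECT customer, SUM(amount) AS total
        FROM orders
        GROUP BY customer;
""")

# Callers query the view as if it were an ordinary table.
big_spenders = conn.execute(
    "SELECT customer, total FROM customer_totals "
    "WHERE total > 100 ORDER BY customer"
).fetchall()

print(big_spenders)
```

A plain view like this only simplifies the query text; the DBMS still evaluates the underlying query each time unless it offers materialized views or you maintain an intermediate table yourself.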
I also agree with Mitch Wheat that subqueries are frequently useful. The question of their usefulness is orthogonal to the question of how best to organize data to make it easily queryable.
Answer 4:
A "correlated subquery" (i.e., one in which the WHERE condition depends on values obtained from the rows of the containing query) is, at least conceptually, evaluated once for each row of the containing query. A non-correlated subquery (one in which the WHERE condition is independent of the containing query) can be evaluated once, at the beginning. The SQL engine makes this distinction automatically.
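To make the distinction concrete, the sketch below runs one query of each kind through Python's built-in `sqlite3` module. The `employees` table and its data are invented for the example, and the comments describe the logical behaviour only, not what any particular optimizer does internally.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (
        name TEXT PRIMARY KEY,
        dept TEXT NOT NULL,
        salary INTEGER NOT NULL
    );
    INSERT INTO employees VALUES
        ('Ann', 'eng', 120), ('Bob', 'eng', 100),
        ('Cy', 'ops', 70), ('Di', 'ops', 90);
""")

# Non-correlated: the inner query does not mention the outer row,
# so its result (the company-wide average, 95) can be computed once.
above_company_avg = [row[0] for row in conn.execute("""
    SELECT e.name FROM employees AS e
    WHERE e.salary > (SELECT AVG(salary) FROM employees)
    ORDER BY e.name
""")]

# Correlated: the inner query refers to e.dept from the outer row,
# so its value depends on which outer row is being tested.
above_dept_avg = [row[0] for row in conn.execute("""
    SELECT e.name FROM employees AS e
    WHERE e.salary > (
        SELECT AVG(d.salary) FROM employees AS d WHERE d.dept = e.dept
    )
    ORDER BY e.name
""")]

print(above_company_avg, above_dept_avg)
```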
The subquery was probably executing a "full table scan". In other words, it was not using an index and was returning far too many rows, which the WHERE clause of the main query then had to filter out.
Usually it's the result of the optimizer not being able to figure out that the subquery can be executed as a join, in which case it executes the subquery for each record in the table rather than joining the table in the subquery against the table you are querying. Some of the more "enterprisey" databases are better at this, but they still miss it sometimes.
So prefer joins over subqueries to get accurate results faster.
Source: https://stackoverflow.com/questions/8276178/are-subqueries-evil