Question
The following image can be found by searching for SQL join on the internet:
[Image: two overlapping circles, labelled Table A and Table B]
Can we interpret this as a Venn diagram, in the following way?
- The circle labelled Table A is the set of records in Table A
- The circle labelled Table B is the set of records in Table B
- The intersection of the sets represents inner join (aka just join in SQL)
Motivation
The diagram shown, which appears in various forms widely on the net, looks awfully like a Venn diagram. So the first intuition is to think of it as a Venn diagram. However, one might get stuck trying to define the contents of the sets "Table A" and "Table B".
There is also information on the internet that appears to conflict. This question hopes to help resolve some of the conflicts that arise. Some sites argue that we shouldn't use Venn diagrams for explaining joins:
- https://towardsdatascience.com/can-we-stop-with-the-sql-joins-venn-diagrams-insanity-16791d9250c3
- https://blog.jooq.org/2016/07/05/say-no-to-venn-diagrams-when-explaining-joins/
Other sources, however, do use Venn diagrams of tables A and B to explain joins on those tables, e.g.:
- What is the difference between "INNER JOIN" and "OUTER JOIN"?
- version of an answer to that question with Venn diagrams
Answer 1:
No. The diagram cannot be interpreted as a Venn diagram if it is to represent joins. Let's show why with a minimal counterexample, to avoid clutter. Assume both tables have only one column: ID. Furthermore, assume that each table has just two records, and that both tables hold the same two records: the IDs 1 and 2. So both tables look like this:
ID
1
2
Now, suppose we run the following SQL:
select * from A join B
Since there is no ON clause, we will get back the table of all combinations of records in A with records in B. (Some dialects, such as PostgreSQL, reject a join without ON; there the same query is written select * from A cross join B.) Since there are 2 records in each table, we will get back 4 = 2 * 2 records. The resulting table will look like this, with A's and B's ID columns fully qualified by the table name:
A.ID  B.ID
1     1
1     2
2     1
2     2
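The row count above can be checked with a small script. This is only a sketch, assuming an in-memory SQLite database driven from Python's standard `sqlite3` module; the names `conn` and `join_rows` are made up for illustration, and `CROSS JOIN` is used as the portable spelling of a join with no ON clause.

```python
import sqlite3

# Hypothetical in-memory database mirroring the example: tables A and B,
# each with a single ID column holding the values 1 and 2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (ID INTEGER);
    CREATE TABLE B (ID INTEGER);
    INSERT INTO A VALUES (1), (2);
    INSERT INTO B VALUES (1), (2);
""")

# A join with no ON condition pairs every row of A with every row of B.
join_rows = conn.execute(
    "SELECT A.ID, B.ID FROM A CROSS JOIN B ORDER BY A.ID, B.ID"
).fetchall()
conn.close()

print(join_rows)       # [(1, 1), (1, 2), (2, 1), (2, 2)]
print(len(join_rows))  # 4
```

Each result row has two columns, one from each table, which already hints that the result cannot be a subset of either one-column table.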
However, the intersection of the two sets of records, as depicted in the Venn diagram, is not equal to this, no matter what interpretation we choose for records.
Interpretation #1: Each record is just a tuple of values. In this case, since we only have one column in each table, our tuples boil down to single values. Since our ID column is numeric, a record under this interpretation is just a numeric ID. The intersection of the sets is then the intersection of the set {1, 2} with itself, which is just the original set. As a table, the intersection would be:
ID
1
2
This has two records, not the four records returned by the join.
Interpretation #2: Each record from Table A is a thing completely distinct from any record in Table B, or indeed from any other record in Table A itself. In this case, since no two records are the same thing, the intersection of the two sets is empty, i.e. the empty set. As a table it has no records at all, so it certainly is not the same as the result of the join.
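The two interpretations can be compared side by side using plain Python sets. This is a minimal sketch under stated assumptions: records are modelled as tuples, and the table-name and row-number tags used for the second interpretation are hypothetical labels chosen only to make every record distinct.

```python
# Interpretation #1: a record is just its tuple of values.
a_values = {(1,), (2,)}
b_values = {(1,), (2,)}
intersection_1 = a_values & b_values  # same as the original set

# Interpretation #2: every record is its own distinct thing, modelled here
# by tagging each one with a (hypothetical) table name and row number.
a_records = {("A", 0, 1), ("A", 1, 2)}
b_records = {("B", 0, 1), ("B", 1, 2)}
intersection_2 = a_records & b_records  # nothing in common

# The join from the example: every pairing of an A value with a B value.
join_result = {(a, b) for (a,) in a_values for (b,) in b_values}

print(len(intersection_1), len(intersection_2), len(join_result))  # 2 0 4
```

Neither intersection has four elements, so neither can equal the join result.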
This concludes the proof that this exact interpretation of a Venn diagram cannot be considered to represent a SQL join.
Source: https://stackoverflow.com/questions/61685512/is-it-appropriate-to-use-venn-diagrams-to-depict-sql-joins-where-the-tables-are