Question
I was reading about relational algebra in a textbook and came across the DIVIDE operation. From Wikipedia:
The division is a binary operation that is written as R ÷ S. The result consists of the restrictions of tuples in R to the attribute names unique to R, i.e., in the header of R but not in the header of S, for which it holds that all their combinations with tuples in S are present in R.
Thus if R is:
+----+----+
| A | B |
+----+----+
| a1 | b1 |
| a2 | b1 |
| a3 | b1 |
| a4 | b1 |
| a1 | b2 |
| a3 | b2 |
| a2 | b3 |
| a3 | b3 |
| a4 | b3 |
| a1 | b4 |
| a2 | b4 |
| a3 | b4 |
+----+----+
and S is:
+----+
| A |
+----+
| a1 |
| a2 |
| a3 |
+----+
then the output should be:
+----+
| B |
+----+
| b1 |
| b4 |
+----+
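The definition can be checked directly against the sample data. The following is a minimal Python sketch, not SQL Server code: relations are modeled as plain sets, and the helper name `divide` is hypothetical, introduced only for illustration.

```python
# R has attributes (A, B); S has the single attribute A.
R = {
    ("a1", "b1"), ("a2", "b1"), ("a3", "b1"), ("a4", "b1"),
    ("a1", "b2"), ("a3", "b2"),
    ("a2", "b3"), ("a3", "b3"), ("a4", "b3"),
    ("a1", "b4"), ("a2", "b4"), ("a3", "b4"),
}
S = {"a1", "a2", "a3"}


def divide(r, s):
    """Return every B value that is paired in r with all A values in s."""
    candidates = {b for (_, b) in r}
    return {b for b in candidates if all((a, b) in r for a in s)}


quotient = divide(R, S)
print(sorted(quotient))  # ['b1', 'b4']
```

Here b2 is rejected because (a2, b2) is missing from R, and b3 is rejected because (a1, b3) is missing.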
The book gives the following relational algebra steps for computing the DIVIDE operation. I am trying to imitate them exactly in SQL Server, but I cannot work out how:
T1 ← πBR // that is, SELECT A FROM R
T2 ← πB((S x T1) - R) // S x T1 can simply be done as SELECT * FROM S, T1
T ← T1 - T2
Answer 1:
Your formula contains an inaccuracy. In your particular case you need a projection on B instead of a projection on A; otherwise, how would the B attribute appear in the result?
T1 ← πBR
T2 ← πB((S x T1) - R)
T ← T1 - T2
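Before translating to SQL, it may help to trace the three steps on the sample data. This is a rough Python sketch using sets of tuples as relations; the helper `project_b` and the variable names are assumptions made for illustration, not part of the book's notation.

```python
from itertools import product

# R has attributes (A, B); S has the single attribute A (stored as 1-tuples).
R = {
    ("a1", "b1"), ("a2", "b1"), ("a3", "b1"), ("a4", "b1"),
    ("a1", "b2"), ("a3", "b2"),
    ("a2", "b3"), ("a3", "b3"), ("a4", "b3"),
    ("a1", "b4"), ("a2", "b4"), ("a3", "b4"),
}
S = {("a1",), ("a2",), ("a3",)}


def project_b(pairs):
    """Projection on B: keep the second attribute, duplicates collapse in the set."""
    return {(b,) for (_, b) in pairs}


T1 = project_b(R)                                   # all B values in R
S_x_T1 = {(a, b) for (a,), (b,) in product(S, T1)}  # every pairing that division requires
T2 = project_b(S_x_T1 - R)                          # B values with at least one pairing missing
T = T1 - T2                                         # B values with no pairing missing

print(sorted(T2))  # [('b2',), ('b3',)]
print(sorted(T))   # [('b1',), ('b4',)]
```

The intermediate relation T2 holds the disqualified values: (a2, b2) and (a1, b3) are the only required pairs absent from R.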
The query:
-- T ← T1 - T2 -------------------------------------
SELECT DISTINCT B FROM R -- T1 ← πBR --
WHERE B NOT IN ( --
-- T2 ← πB((S x T1) - R) ------------------- --
SELECT CR.B -- --
FROM ( -- --
-- (S x T1) -------------------- -- --
SELECT S.A, R2.B -- -- --
FROM S -- -- --
CROSS JOIN ( -- -- --
-- T1 ← πBR -- -- --
SELECT DISTINCT B FROM R -- -- --
) AS R2 -- -- --
-------------------------------- -- --
) AS CR -- --
WHERE NOT EXISTS ( -- --
SELECT * -- --
FROM R -- --
WHERE R.A = CR.A AND R.B = CR.B -- --
)------------------------------------------- --
)---------------------------------------------------
The result:
B
--
b1
b4
Answer 2:
There is a misprint in the relational algebra formulas. Instead of πA there should be πB.
Create sample tables:
DECLARE @R TABLE (A varchar(10), B varchar(10));
INSERT INTO @R (A, B) VALUES ('a1','b1');
INSERT INTO @R (A, B) VALUES ('a2','b1');
INSERT INTO @R (A, B) VALUES ('a3','b1');
INSERT INTO @R (A, B) VALUES ('a4','b1');
INSERT INTO @R (A, B) VALUES ('a1','b2');
INSERT INTO @R (A, B) VALUES ('a3','b2');
INSERT INTO @R (A, B) VALUES ('a2','b3');
INSERT INTO @R (A, B) VALUES ('a3','b3');
INSERT INTO @R (A, B) VALUES ('a4','b3');
INSERT INTO @R (A, B) VALUES ('a1','b4');
INSERT INTO @R (A, B) VALUES ('a2','b4');
INSERT INTO @R (A, B) VALUES ('a3','b4');
DECLARE @S TABLE (A varchar(10));
INSERT INTO @S (A) VALUES ('a1');
INSERT INTO @S (A) VALUES ('a2');
INSERT INTO @S (A) VALUES ('a3');
A literal translation of the given relational algebra formulas would look like this (tried in SQL Server 2008). The x operator is CROSS JOIN, and the - operator is EXCEPT.
WITH
T1
AS
(
SELECT B
FROM @R
)
,T2
AS
(
SELECT SS.A, T1.B
FROM
@S AS SS CROSS JOIN T1
EXCEPT
SELECT RR.A, RR.B
FROM @R AS RR
)
SELECT T1.B
FROM T1
EXCEPT
SELECT T2.B
FROM T2
;
Result set:
B
b1
b4
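For readers without a SQL Server instance at hand, the same EXCEPT formulation can be cross-checked elsewhere. The sketch below is an assumption-laden stand-in: it uses SQLite through Python's `sqlite3` module, and ordinary tables named R and S replace the table variables @R and @S, since SQLite has no table variables.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE R (A TEXT, B TEXT);
    CREATE TABLE S (A TEXT);
    INSERT INTO R (A, B) VALUES
        ('a1','b1'), ('a2','b1'), ('a3','b1'), ('a4','b1'),
        ('a1','b2'), ('a3','b2'),
        ('a2','b3'), ('a3','b3'), ('a4','b3'),
        ('a1','b4'), ('a2','b4'), ('a3','b4');
    INSERT INTO S (A) VALUES ('a1'), ('a2'), ('a3');
""")

# Same shape as the query above: T1 is the projection on B,
# T2 is (S x T1) - R, and the final SELECT computes T1 - T2.
DIVIDE_SQL = """
WITH T1 AS (
    SELECT B FROM R
),
T2 AS (
    SELECT SS.A, T1.B FROM S AS SS CROSS JOIN T1
    EXCEPT
    SELECT RR.A, RR.B FROM R AS RR
)
SELECT T1.B FROM T1
EXCEPT
SELECT T2.B FROM T2
"""

result = sorted(row[0] for row in conn.execute(DIVIDE_SQL))
print(result)  # ['b1', 'b4']
```

T1 needs no DISTINCT here because EXCEPT removes duplicates from its result.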
Interestingly, the execution plans for the variant that uses EXCEPT and for the variant by Andrey Morozov are almost identical. I like the SQL Server query optimizer! The top plan uses EXCEPT (see the query above); the bottom plan is without EXCEPT (see the query in the answer by @Andrey Morozov, which I adapted here for my table variables):
SELECT DISTINCT B FROM @R -- T1 ← πBR --
WHERE B NOT IN ( --
-- T2 ← πB((S x T1) - R) ------------------- --
SELECT CR.B -- --
FROM ( -- --
-- (S x T1) -------------------- -- --
SELECT S.A, R2.B -- -- --
FROM @S AS S -- -- --
CROSS JOIN ( -- -- --
-- T1 ← πBR -- -- --
SELECT DISTINCT B FROM @R -- -- --
) AS R2 -- -- --
-------------------------------- -- --
) AS CR -- --
WHERE NOT EXISTS ( -- --
SELECT * -- --
FROM @R AS R -- --
WHERE R.A = CR.A AND R.B = CR.B -- --
)------------------------------------------- --
)-------------
Source: https://stackoverflow.com/questions/27884554/how-to-implement-relational-equivalent-of-the-divide-operation-in-sql-server