Given the following tables:
 ------------     ------------
|     BA     |   |     CA     |
 -----+------     -----+------
| BId | AId  |   | CId | AId  |
One way to do this is to use JOIN and COUNT. First, count how many rows each CId has in CA. Then do the same for each BId in BA. Finally, JOIN the two results on AId and on the counts. The goal is to find the groups whose counts and AIds both match.
Example:
DECLARE @a TABLE(id INT)
INSERT INTO @a VALUES (1), (2), (3), (4), (5), (6)

DECLARE @b TABLE(id CHAR(2))
INSERT INTO @b VALUES ('B1'), ('B2'), ('B3')

DECLARE @c TABLE(id CHAR(2))
INSERT INTO @c VALUES ('C1'), ('C2'), ('C3')

DECLARE @ba TABLE(BId CHAR(2), AId INT)
INSERT INTO @ba VALUES
    ('B1', 2),
    ('B1', 3),
    ('B2', 2),
    ('B2', 4),
    ('B2', 5)

DECLARE @ca TABLE(CId CHAR(2), AId INT)
INSERT INTO @ca VALUES
    ('C1', 2),
    ('C2', 2),
    ('C2', 3),
    ('C3', 4),
    ('C3', 5)

SELECT DISTINCT CId
FROM (
    SELECT *
         , COUNT(*) OVER(PARTITION BY CId) cnt
    FROM @ca ca
) c
LEFT JOIN (
    SELECT *
         , COUNT(*) OVER(PARTITION BY BId) cnt
    FROM @ba ba
) b ON b.AId = c.AId AND b.cnt = c.cnt
WHERE
    b.cnt IS NOT NULL
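In the example, C2 appears 2 times in CA, and B1 also appears 2 times in BA. That is the first condition. The second is that their AIds match; if both hold, you have a group match. Note that this query compares row by row, so a CId is returned as soon as one of its AIds matches a B group of the same size. It gives the expected answer (C2) for this data, but it does not by itself verify that every AId in the group matches.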
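The two conditions can also be checked on whole groups rather than row by row. The following is a minimal sketch in plain Python, not the answer's T-SQL; the helper `group_aids` and the variable names are made up for illustration, and the rows are copied from `@ba` and `@ca` above.

```python
from collections import defaultdict

# Sample rows copied from the @ba and @ca table variables above.
ba = [("B1", 2), ("B1", 3), ("B2", 2), ("B2", 4), ("B2", 5)]
ca = [("C1", 2), ("C2", 2), ("C2", 3), ("C3", 4), ("C3", 5)]


def group_aids(rows):
    """Collect the set of AIds that belongs to each group id (hypothetical helper)."""
    groups = defaultdict(set)
    for group_id, aid in rows:
        groups[group_id].add(aid)
    return groups


b_groups = group_aids(ba)
c_groups = group_aids(ca)

# A C group matches a B group when the sizes agree (condition 1)
# and the AIds themselves agree (condition 2).
matches = sorted(
    (cid, bid)
    for cid, c_aids in c_groups.items()
    for bid, b_aids in b_groups.items()
    if len(c_aids) == len(b_aids) and c_aids == b_aids
)
print(matches)  # [('C2', 'B1')]
```

The size check is redundant once the sets are compared for equality; it is kept only to mirror the two conditions described above.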
I think the simplest solution uses window functions:
select ca.cid, ba.bid
from (select ca.*, count(*) over (partition by cid) as cnt
      from ca
     ) ca join
     (select ba.*, count(*) over (partition by bid) as cnt
      from ba
     ) ba
     on ca.aid = ba.aid and ca.cnt = ba.cnt
group by ca.cid, ba.bid, ca.cnt
having ca.cnt = count(*)  -- all match
The result set is all matching cid/bid pairs.
The logic here is pretty simple. For each cid and bid, the subqueries calculate the count of aids. This number has to match.
Then the join is on aid. This is an inner join, so only matching pairs are produced. The final group by is used to generate the count of matches to see if this tallies up with all the aids.
This particular version assumes that the rows are unique in each table, although the query can easily be adjusted if this is not the case.
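One possible adjustment for non-unique rows is to de-duplicate each table before counting. The sketch below is an assumption about what such an adjustment could look like, not the answer's own code. It uses Python's `sqlite3` module as a stand-in database, replaces the window functions with `group by` counts so it also runs on older SQLite builds, and the CTE names (`ba_u`, `ca_u`, `ba_n`, `ca_n`) are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ba (bid TEXT, aid INTEGER);
CREATE TABLE ca (cid TEXT, aid INTEGER);
-- ('B1', 3) is inserted twice on purpose to simulate a duplicate row.
INSERT INTO ba VALUES ('B1',2),('B1',3),('B1',3),('B2',2),('B2',4),('B2',5);
INSERT INTO ca VALUES ('C1',2),('C2',2),('C2',3),('C3',4),('C3',5);
""")

query = """
with ba_u as (select distinct bid, aid from ba),   -- unique rows only
     ca_u as (select distinct cid, aid from ca),
     ba_n as (select bid, count(*) as cnt from ba_u group by bid),
     ca_n as (select cid, count(*) as cnt from ca_u group by cid)
select ca_u.cid, ba_u.bid
from ca_u
join ba_u on ba_u.aid = ca_u.aid
join ca_n on ca_n.cid = ca_u.cid
join ba_n on ba_n.bid = ba_u.bid and ba_n.cnt = ca_n.cnt
group by ca_u.cid, ba_u.bid, ca_n.cnt
having count(*) = ca_n.cnt          -- every aid of the group matched
order by ca_u.cid, ba_u.bid
"""
pairs = conn.execute(query).fetchall()
print(pairs)  # [('C2', 'B1')]
```

Without the `select distinct` step, the duplicate row would inflate B1's count to 3 and the C2/B1 match would be lost.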
The number of elements can be calculated using a CTE for each of CA and BA. Then you can get at the full rows via:
with ca_info as (
    select
        cid
      , count(*) as ccount
    from ca
    group by cid
),
ba_info as (
    select
        bid
      , count(*) as bcount
    from ba
    group by bid
)
select
    *
from
    ba
    join ca on (ba.aid = ca.aid)
    join ba_info on ba.bid = ba_info.bid
    join ca_info on ca.cid = ca_info.cid
where ccount = bcount
Results:
bid aid cid aid bid bcount cid ccount
B1 2 C2 2 B1 2 C2 2
B1 3 C2 3 B1 2 C2 2
If you are just interested in C2 itself, you can restrict the result set more:
with ca_info as (
    select
        cid
      , count(*) as ccount
    from ca
    group by cid
),
ba_info as (
    select
        bid
      , count(*) as bcount
    from ba
    group by bid
)
select
    distinct ca.cid
from
    ba
    join ca on (ba.aid = ca.aid)
    join ba_info on ba.bid = ba_info.bid
    join ca_info on ca.cid = ca_info.cid
where ccount = bcount
To also get sub/supersets, the condition enforcing the set-equality can be changed:
where ccount <= bcount
This returns all rows where Bx has at least as many elements as Cy:
bid aid cid aid bid bcount cid ccount
B1 2 C1 2 B1 2 C1 1
B1 2 C2 2 B1 2 C2 2
B1 3 C2 3 B1 2 C2 2
B2 2 C1 2 B2 3 C1 1
B2 2 C2 2 B2 3 C2 2
B2 4 C3 4 B2 3 C3 2
B2 5 C3 5 B2 3 C3 2
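Note that `ccount <= bcount` only compares sizes, so a pair such as C2/B2 appears above even though only one aid is shared. If a strict subset test is wanted, one option is to also require that the number of joined rows equals ccount. This is a minimal sketch under two assumptions: rows are unique in each table, and Python's `sqlite3` module stands in for the database. The variable names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ba (bid TEXT, aid INTEGER);
CREATE TABLE ca (cid TEXT, aid INTEGER);
INSERT INTO ba VALUES ('B1',2),('B1',3),('B2',2),('B2',4),('B2',5);
INSERT INTO ca VALUES ('C1',2),('C2',2),('C2',3),('C3',4),('C3',5);
""")

subset_query = """
with ca_info as (
    select cid, count(*) as ccount
    from ca
    group by cid
)
select ca.cid, ba.bid
from ca
join ba on ba.aid = ca.aid
join ca_info on ca_info.cid = ca.cid
group by ca.cid, ba.bid, ca_info.ccount
having count(*) = ca_info.ccount   -- every aid of Cy was found in Bx
order by ca.cid, ba.bid
"""
subsets = conn.execute(subset_query).fetchall()
print(subsets)  # [('C1', 'B1'), ('C1', 'B2'), ('C2', 'B1'), ('C3', 'B2')]
```

Here C2/B2 is excluded because only aid 2 is shared, while C2 has two elements.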