Suppose I have a table of election data, call it ELECTIONS, with one row per voter per election, like so:
VoterID ElectionID
A 1
A 2
B
I think this should work, but I'm not positive (I can't remember whether COUNT
can be used in a join like this). Let me know?
SELECT COUNT(*)
FROM ELECTIONS e1, ELECTIONS e2
WHERE e1.VoterID = e2.VoterID
AND e1.ElectionID = 1
AND e2.ElectionID = 2;
This is as simple as the following:
SELECT voterid, COUNT(DISTINCT electionid) AS electioncount
FROM ELECTIONS
WHERE electionid IN (1, 2) /* substitute elections you are interested in here */
GROUP BY voterid
HAVING electioncount = 2 /* substitute the number of elections listed in the WHERE condition above */
The size of the result set gives the number of voters that meet your criteria, so there is no reason to aggregate further (for example, with a subselect) to get that figure.
I would recommend something more like this:
SELECT COUNT(*) AS NumVoters
FROM ELECTIONS e1
WHERE e1.ElectionID = 1
AND e1.VoterID in (
SELECT e2.VoterID
FROM ELECTIONS e2
WHERE e2.ElectionID = 2
);
That way you solve the problem, and have only 1 subquery.
Yes, what you have should work. (You will need to add an alias on the derived table; the error message you get should be self-explanatory. It's easy to fix: just add a space and the letter c (or whatever name you want) at the end of your query.)
There's one caveat regarding the potential for duplicate (VoterID, ElectionID) tuples.
If you have a unique constraint on (VoterID, ElectionID), then your query will work fine.
If you don't have a unique constraint (which disallows duplicate (VoterID, ElectionID) tuples), then a voter with two rows for ElectionID 1 and no rows for ElectionID 2 would be included in the count, and a voter with two rows for ElectionID 1 and only one row for ElectionID 2 would be excluded from it.
Including the DISTINCT keyword inside the COUNT would fix that problem, e.g.
HAVING COUNT(DISTINCT ElectionID) = 2
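To make the caveat concrete, here is a minimal sketch that runs both HAVING variants against a table with duplicate rows. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the sample rows and the helper name `voters_matching` are made up for illustration, not taken from the question.

```python
import sqlite3

# In-memory table with NO unique constraint, so duplicate rows are allowed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE elections (VoterID TEXT, ElectionID INTEGER)")
conn.executemany(
    "INSERT INTO elections VALUES (?, ?)",
    [
        ("A", 1), ("A", 2),            # one row per election
        ("B", 1), ("B", 1),            # duplicate row for election 1, none for 2
        ("C", 1), ("C", 1), ("C", 2),  # duplicate row for election 1, one for 2
    ],
)


def voters_matching(having_clause):
    """Return the voters kept by the given HAVING condition (hypothetical helper)."""
    sql = f"""
        SELECT VoterID
        FROM elections
        WHERE ElectionID IN (1, 2)
        GROUP BY VoterID
        HAVING {having_clause}
        ORDER BY VoterID
    """
    return [row[0] for row in conn.execute(sql)]


naive = voters_matching("COUNT(*) = 2")
robust = voters_matching("COUNT(DISTINCT ElectionID) = 2")

print(naive)   # ['A', 'B']: B is wrongly included, C is wrongly excluded
print(robust)  # ['A', 'C']: only voters present in both elections
```

With a unique constraint on (VoterID, ElectionID) the two conditions would return the same voters, since no group could contain a repeated ElectionID.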
I'd write the query differently, but what you have will work.
To get the count of VoterID values that participated in both ElectionID 1 and ElectionID 2, I'd avoid using an inline view (MySQL calls it a derived table) for improved performance, and have the query use a JOIN operation instead. Something like this:
SELECT COUNT(DISTINCT e1.voterID) AS NumVoters
FROM elections e1
JOIN elections e2
ON e2.voterID = e1.voterID
WHERE e1.electionID = 1
AND e2.electionID = 2
If you are guaranteed that (voterID, electionID) is unique, then the SELECT could be simpler:
SELECT COUNT(1) AS NumVoters
FROM elections e1
JOIN elections e2
ON e2.voterID = e1.voterID
WHERE e1.electionID = 1
AND e2.electionID = 2
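The difference between the two JOIN queries above only shows up when duplicates exist. The following sketch, again using Python's sqlite3 module in place of MySQL and invented sample rows, runs the same join with both count expressions; `JOIN_TEMPLATE` and `num_voters` are names introduced here for illustration.

```python
import sqlite3

# Table without a unique constraint; voter C has a duplicate row for election 1.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE elections (voterID TEXT, electionID INTEGER)")
conn.executemany(
    "INSERT INTO elections VALUES (?, ?)",
    [
        ("A", 1), ("A", 2),
        ("B", 1), ("B", 1),
        ("C", 1), ("C", 1), ("C", 2),
    ],
)

# Same self-join as in the answer; only the count expression varies.
JOIN_TEMPLATE = """
    SELECT {count_expr} AS NumVoters
    FROM elections e1
    JOIN elections e2
      ON e2.voterID = e1.voterID
    WHERE e1.electionID = 1
      AND e2.electionID = 2
"""


def num_voters(count_expr):
    """Run the self-join with the given count expression (hypothetical helper)."""
    sql = JOIN_TEMPLATE.format(count_expr=count_expr)
    return conn.execute(sql).fetchone()[0]


distinct_count = num_voters("COUNT(DISTINCT e1.voterID)")
plain_count = num_voters("COUNT(1)")

print(distinct_count)  # 2: voters A and C
print(plain_count)     # 3: C's duplicate row produces an extra joined row
```

The join pairs every election-1 row with every election-2 row for the same voter, so each duplicate multiplies the joined rows. That is why the simpler COUNT(1) form is only safe when uniqueness is guaranteed.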
SELECT COUNT(*)
FROM
( SELECT voterid
FROM votes
WHERE electionid IN(1,2)
GROUP BY voterid
HAVING COUNT(*) = 2
) x;
This assumes that you have a UNIQUE or PRIMARY KEY constraint on (voterid, electionid).
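As a rough illustration of that assumption, the sketch below declares the composite key, shows that a duplicate row is rejected, and then runs the query above. It uses Python's sqlite3 module rather than MySQL, and the column types and sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Composite primary key: at most one row per (voterid, electionid) pair.
conn.execute("""
    CREATE TABLE votes (
        voterid    TEXT    NOT NULL,
        electionid INTEGER NOT NULL,
        PRIMARY KEY (voterid, electionid)
    )
""")
conn.executemany(
    "INSERT INTO votes VALUES (?, ?)",
    [("A", 1), ("A", 2), ("B", 1)],
)

# A second row for ('B', 1) violates the key and is refused.
try:
    conn.execute("INSERT INTO votes VALUES ('B', 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# With duplicates ruled out, COUNT(*) = 2 means "one row in each election".
num_voters = conn.execute("""
    SELECT COUNT(*)
    FROM (
        SELECT voterid
        FROM votes
        WHERE electionid IN (1, 2)
        GROUP BY voterid
        HAVING COUNT(*) = 2
    ) x
""").fetchone()[0]

print(duplicate_rejected)  # True
print(num_voters)          # 1: only voter A appears in both elections
```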