SQL query to count number of times certain values occur in multiple rows

慢半拍i 2021-01-23 09:32

Suppose I have a table of election data, call it ELECTIONS, with one row per voter per election, like so:

VoterID ElectionID
A           1
A           2
B                


        
5 Answers
  • 2021-01-23 10:02

    I think this should work, but I'm not positive (I can't remember whether COUNT can be used with a join like this). Let me know?

    SELECT COUNT(*)
    FROM ELECTIONS e1, ELECTIONS e2
    WHERE e1.VoterID = e2.VoterID
        AND e1.ElectionID = 1
        AND e2.ElectionID = 2;
    
  • 2021-01-23 10:02

    This is as simple as the following:

    SELECT voterid, COUNT(DISTINCT electionid) AS electioncount
    FROM ELECTIONS
    WHERE electionid IN (1, 2) /* substitute the elections you are interested in here */
    GROUP BY voterid
    HAVING electioncount = 2 /* substitute the number of elections listed in the WHERE condition above */
    

    The size of the result set is the number of voters who meet your criteria, so there is no need to aggregate further (for example, with a subselect) to get that number.
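
For anyone who wants to try the GROUP BY / HAVING idea without a database server, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows and the lowercase table name are made up for illustration, and the HAVING clause repeats the aggregate rather than using the alias, which should behave the same here.

```python
import sqlite3

# Hypothetical sample data: voter A voted in elections 1 and 2,
# B only in election 1, and C in elections 1, 2 and 3.
rows = [("A", 1), ("A", 2), ("B", 1), ("C", 1), ("C", 2), ("C", 3)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE elections (voterid TEXT, electionid INTEGER)")
conn.executemany("INSERT INTO elections VALUES (?, ?)", rows)

# One result row per voter who appears in both election 1 and election 2.
both = conn.execute(
    """
    SELECT voterid
    FROM elections
    WHERE electionid IN (1, 2)
    GROUP BY voterid
    HAVING COUNT(DISTINCT electionid) = 2
    ORDER BY voterid
    """
).fetchall()

voters_in_both = [v for (v,) in both]
num_voters = len(voters_in_both)  # the result set size is the answer
print(voters_in_both, num_voters)
```

With this sample data, only A and C have rows for both elections, so the result set has two rows.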

  • 2021-01-23 10:03

    I would recommend something more like this:

    SELECT COUNT(*) AS NumVoters
    FROM ELECTIONS e1
    WHERE e1.ElectionID = 1
    AND e1.VoterID in (
        SELECT e2.VoterID
        FROM ELECTIONS e2
        WHERE e2.ElectionID = 2
    );
    

    That way you solve the problem with only one subquery.

  • 2021-01-23 10:06

    Yes, what you have should work. (You will need to add an alias on the derived table; the error message you get should be self-explanatory. It's easy to fix: just add a space and the letter c, or whatever name you want, at the end of your query.)

    There's one caveat regarding the potential for duplicate (VoterID, ElectionID) tuples.

    If you have a unique constraint on (VoterID, ElectionID), then your query will work fine.

    If you don't have a unique constraint (one that disallows duplicate (VoterID, ElectionID) tuples), then a voter with two (2) rows for ElectionID 1 and no rows for ElectionID 2 could get included in the count. Conversely, a voter with two rows for ElectionID 1 and only one row for ElectionID 2 would be excluded from the count.

    Including the DISTINCT keyword inside a COUNT would fix that problem, e.g.

    HAVING COUNT(DISTINCT ElectionID) = 2
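
The duplicate-row caveat can be checked with a small sketch like the one below, again using Python's built-in sqlite3 module. The table contents and the helper name `voters_matching` are hypothetical and exist only for this illustration; the table deliberately has no unique constraint.

```python
import sqlite3

# Hypothetical data with no unique constraint on (voterid, electionid):
# A has two rows for election 1 and none for election 2;
# B has two rows for election 1 and one row for election 2;
# C has exactly one row for each election.
rows = [("A", 1), ("A", 1), ("B", 1), ("B", 1), ("B", 2), ("C", 1), ("C", 2)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE elections (voterid TEXT, electionid INTEGER)")
conn.executemany("INSERT INTO elections VALUES (?, ?)", rows)


def voters_matching(having_clause):
    """Return the sorted voter IDs kept by the given HAVING clause."""
    sql = f"""
        SELECT voterid
        FROM elections
        WHERE electionid IN (1, 2)
        GROUP BY voterid
        HAVING {having_clause}
        ORDER BY voterid
    """
    return [v for (v,) in conn.execute(sql)]


# COUNT(*) counts rows, so A is wrongly kept and B is wrongly dropped.
with_count_star = voters_matching("COUNT(*) = 2")
# COUNT(DISTINCT electionid) counts elections, so only B and C qualify.
with_count_distinct = voters_matching("COUNT(DISTINCT electionid) = 2")
print(with_count_star, with_count_distinct)
```

Here only B and C actually have rows for both elections, which is what the DISTINCT version reports.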
    

    I'd write the query differently, but what you have will work.

    To get the count of VoterID values that participated in both ElectionID 1 and ElectionID 2, I'd avoid using an inline view (MySQL calls it a derived table) for improved performance, and have the query use a JOIN operation instead. Something like this:

    SELECT COUNT(DISTINCT e1.voterID) AS NumVoters
      FROM elections e1
      JOIN elections e2
        ON e2.voterID = e1.voterID
     WHERE e1.electionID = 1
       AND e2.electionID = 2
    

    If you are guaranteed that (voterID, ElectionID) is unique, then the select could be simpler:

    SELECT COUNT(1) AS NumVoters
      FROM elections e1
      JOIN elections e2
        ON e2.voterID = e1.voterID
     WHERE e1.electionID = 1
       AND e2.electionID = 2
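
To see why the simpler COUNT(1) form depends on the uniqueness guarantee, here is a rough sketch that runs both aggregates over the same self-join with Python's built-in sqlite3 module. The sample rows are invented, and voter A is given a duplicate row on purpose.

```python
import sqlite3

# Hypothetical data: A has a duplicate row for election 1 plus one row for
# election 2, B has one row for each election, and C voted only in election 1.
rows = [("A", 1), ("A", 1), ("A", 2), ("B", 1), ("B", 2), ("C", 1)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE elections (voterid TEXT, electionid INTEGER)")
conn.executemany("INSERT INTO elections VALUES (?, ?)", rows)

# The self-join produces one row per (election 1 row, election 2 row) pair
# for the same voter, so A's duplicate row yields two joined rows.
count_rows, count_distinct = conn.execute(
    """
    SELECT COUNT(1), COUNT(DISTINCT e1.voterid)
    FROM elections e1
    JOIN elections e2
      ON e2.voterid = e1.voterid
    WHERE e1.electionid = 1
      AND e2.electionid = 2
    """
).fetchone()
print(count_rows, count_distinct)
```

With the duplicate present, COUNT(1) counts joined rows (three) while COUNT(DISTINCT e1.voterid) counts voters (two); without duplicates the two numbers would agree.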
    
  • 2021-01-23 10:12
    SELECT COUNT(*)
      FROM
         ( SELECT voterid
             FROM votes
            WHERE electionid IN (1, 2)
            GROUP BY voterid
           HAVING COUNT(*) = 2
         ) x;
    

    This assumes that you have a UNIQUE or PRIMARY KEY constraint on (voterid, electionid).
