mysql - need two limits?

前端 未结 3 762
北海茫月
北海茫月 2020-12-11 08:25

Given a table containing awards earned over several years by members of an organization that consists of multiple geographic districts, what mysql query would show the top a

相关标签:
3条回答
  • 2020-12-11 08:45

    There's a fairly common way to do this, using self-joins. The trick is to replace a search for 'largest' with a search for 'those items with nothing bigger'. As you have already found out

    SELECT Membername,  District, count(Award) as Number FROM awards
    GROUP BY Membername, District
    

    returns you a nice result of award counts. Let's write ... to save a bit of space as shorthand for that.

    Now consider

    SELECT a.Membername, a.District, a.Number FROM (...) a LEFT JOIN (...) b
    ON a.District=b.District
    AND a.Number<b.Number
    WHERE b.Membername IS NULL
    

    where the ... is that stuff written above. It's basically saying, for every entry in the award counts (a), find me all the entries (b) in the same district with more awards, and only return (a) if there aren't any (b)'s... in other words, a is the champ.

    You will need to finesse this a bit if there's more than one member in the same district with the same winning count... this query will return all the tied members. You'll have to decide how to handle that. And watch out for those districts that don't have any awards at all... they won't even appear in your table.

    0 讨论(0)
  • 2020-12-11 08:56

    You'll have to do it by applying a "rank" per district, then only grab per rank = 1... The @LastDistrict at the join location is defaulted to zero, in case the district is based on an ID. If district is char based, you can just change it to = "" instead to match the data type.

    To clarify what is happening. The "AwardCounts" pre query does the entire query per district and member with however many award counts. Then, ordered by district and member award count (descending), thus putting the highest award count at the first position per district.

    That is joined to another bogus alias "SQLVars" which just creates inline variables to the query called @RankSeq and @LastDistrict. So, the first time in, the "DistRankSeq" will become a 1 for the first district, then prime the "@LastDistrict" with the value of the district. The next entry for the same district (since it will be in proper sequence order) will be assigned the rank of 2, then 3, etc... When there is a change from whatever the "LAST" District was to the new record being tested, the rank gets set back to 1 and starts over again. So you could have one district with 100 members, another with 5, another with 17...

    So, your final query has all of them with their respective ranks... Now, apply the HAVING the final district rank = 1... Doing this, you could also adjust the having to get the top 3 members per district (for example)...

    select
          AwardCounts.District,
          AwardCounts.MemberName,
          AwardCounts.memberAwards,
          @RankSeq := if( @LastDistrict = AwardCounts.District, @RankSeq +1, 1 ) DistRankSeq,
          @LastDistrict := AwardCounts.District as ignoreIt
       from
          ( select 
                  a.district,
                  a.membername,
                  count(*) as memberAwards
               from
                  Awards a
               group by
                  a.district,
                  a.membername
               order by
                  a.district,
                  memberAwards desc ) AwardCounts
    
          JOIN (select @RankSeq := 0, @LastDistrict = 0 ) SQLVars
       HAVING
          DistRankSeq = 1
    

    EDIT PER FEEDBACK If its the aggregation thats taking the time, then I would do the following. Create a new table with nothing but the aggregations per district, name and initial rank for the district. As any new record is added to this table, the trigger then adds one to the aggregate table count, then checks where that person is within their district and re-updates its new rank position. You could take it a step further and have another table of just "TOP" member per district table that is one per district with the person's name. When a new person hits the top position, their name is put in the table, overwriting whoever was there last.

    0 讨论(0)
  • 2020-12-11 09:09

    There's a page specifically dedicated to the problem - and if you look at the older manuals you'll see the max-concat trick - which is often still more efficient.

    0 讨论(0)
提交回复
热议问题