Can I use non-aggregate columns with group by?

不思量自难忘° 2020-12-09 09:24

You cannot (should not) put non-aggregates in the SELECT line of a GROUP BY query.

I would, however, like to access one of the non-aggregate columns

6 Answers
  • 2020-12-09 09:53

    You cannot (should not) put non-aggregates in the SELECT line of a GROUP BY query.

    You can, and have to, define what you are grouping by for the aggregate function to return the correct result.

    MySQL (and SQLite) decided in their infinite wisdom that they would go against spec, and allow queries to accept GROUP BY clauses missing columns quoted in the SELECT - it effectively makes these queries not portable.

    It really seems like there should be a way to get this information without needing to join.

    Without access to the analytic/ranking/windowing functions, which MySQL did not support before version 8.0, the self join to a derived table/inline view is the most portable means of getting the result you desire.
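
    A minimal sketch of that derived-table join, run through Python's built-in sqlite3 module so it is self-contained. The stuff table with id, kind and age columns is borrowed from the other answers in this thread; the sample rows are made up for illustration.

```python
import sqlite3

# Hypothetical sample data: a `stuff` table with id, kind and age columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stuff (id INTEGER PRIMARY KEY, kind TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO stuff (id, kind, age) VALUES (?, ?, ?)",
    [(1, "cat", 3), (2, "cat", 7), (3, "dog", 5), (4, "dog", 2)],
)

# The derived table computes MAX(age) per kind; joining it back to `stuff`
# recovers the non-aggregate id of the row(s) holding that maximum.
query = """
SELECT s.id, s.kind, s.age
FROM stuff AS s
JOIN (SELECT kind, MAX(age) AS max_age
      FROM stuff
      GROUP BY kind) AS m
  ON m.kind = s.kind AND m.max_age = s.age
ORDER BY s.kind
"""
oldest_per_kind = conn.execute(query).fetchall()
print(oldest_per_kind)
```

    Every selected column is either grouped or aggregated inside the derived table, so the query does not rely on the MySQL/SQLite extension described above.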

  • 2020-12-09 10:06

    PostgreSQL's DISTINCT ON will be useful here.

    SELECT DISTINCT ON (kind) kind, id, age 
    FROM stuff
    ORDER BY kind, age DESC;
    

    This groups by kind and returns the first row of each group in the given order. Since we order by age descending, we get the row with the maximum age for each kind.

    P.S. The columns in DISTINCT ON must appear first in the ORDER BY clause.

  • 2020-12-09 10:10

    In recent databases you can use max() over (partition by ...) to solve this problem:

    select id, kind, age as max_age from (
      select id, kind, age, max(age) over (partition by kind) as mage
        from stuff) as ranked
    where age = mage;
    

    This can then be done in a single pass.

  • 2020-12-09 10:12

    You can't get the Id of the row that MAX found, because there might not be only one id with the maximum age.
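
    A small sketch of that tie case, using Python's built-in sqlite3 module. The stuff table and its rows are hypothetical sample data, chosen so that two ids share the maximum age.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stuff (id INTEGER PRIMARY KEY, kind TEXT, age INTEGER)")
# Hypothetical rows: ids 1 and 2 tie for the maximum age within 'cat'.
conn.executemany(
    "INSERT INTO stuff (id, kind, age) VALUES (?, ?, ?)",
    [(1, "cat", 7), (2, "cat", 7), (3, "cat", 4)],
)

# MAX(age) is a single value per group...
max_age = conn.execute(
    "SELECT MAX(age) FROM stuff WHERE kind = 'cat'"
).fetchone()[0]

# ...but more than one id can carry that value.
ids_at_max = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM stuff WHERE kind = 'cat' AND age = ? ORDER BY id",
        (max_age,),
    )
]
print(max_age, ids_at_max)
```

    With ties like this, any query that returns a single id per kind has to pick one by some extra rule.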

  • 2020-12-09 10:13

    I think it's tempting indeed to ask the system to solve the problem in one pass rather than having to do the job twice (find the max, and then find the corresponding id). You can do it using CONCAT (as suggested in the article Naktibalda referred to), though I'm not sure that would be more efficient:

    SELECT MAX( CONCAT( LPAD(age, 10, '0'), '-', id) )
    FROM STUFF1
    GROUP BY kind;
    

    This should work, but you have to split the result to get the age and the id. (That's really ugly, though.)
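
    A rough sketch of the packing and the splitting step, run on SQLite through Python's built-in sqlite3 module. Note the substitutions: SQLite has no LPAD, so printf('%010d', age) and the || operator stand in for the MySQL LPAD/CONCAT calls above. The stuff table and its rows are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stuff (id INTEGER PRIMARY KEY, kind TEXT, age INTEGER)")
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO stuff (id, kind, age) VALUES (?, ?, ?)",
    [(1, "cat", 3), (2, "cat", 7), (3, "dog", 5), (4, "dog", 2)],
)

# Pack the zero-padded age and the id into one string so that MAX()
# compares on age first; printf/|| replace MySQL's LPAD/CONCAT here.
packed_rows = conn.execute("""
    SELECT kind, MAX(printf('%010d', age) || '-' || id) AS packed
    FROM stuff
    GROUP BY kind
    ORDER BY kind
""").fetchall()

# Split each packed value back into its age and id parts.
unpacked = []
for kind, packed in packed_rows:
    age_text, id_text = packed.split("-", 1)
    unpacked.append((kind, int(age_text), int(id_text)))
print(unpacked)
```

    When several rows tie on age, the comparison falls through to the id as text, so which id wins depends on string ordering rather than numeric ordering.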

  • 2020-12-09 10:13

    You have to have a join because the aggregate function max retrieves many rows and chooses the max. So you need a join to choose the one that the aggregate function has found.

    To put it a different way: how would you expect the query to behave if you replaced max with sum?

    An inner join might be more efficient than your subquery, though.
