SQL sorting does not follow group by statement, always uses primary key

清酒与你 2021-01-15 06:40

I have a SQL database with a table called staff, with the following columns:

workerID (Prim.key), name, department, salary

I am

4 answers
  • 2021-01-15 06:51

    My favorite solution to this problem uses LEFT JOIN:

    SELECT m.workerID, m.name, m.department, m.salary
    FROM staff m             # 'm' from 'maximum'
        LEFT JOIN staff o    # 'o' from 'other'
            ON m.department = o.department    # match rows by department
            AND m.salary < o.salary           # match each row in `m` with the rows from `o` having bigger salary
    WHERE o.salary IS NULL       # no bigger salary exists in `o`, i.e. `m`.`salary` is the maximum of its dept.
    ;
    

    This query selects all the workers that have the biggest salary in their department; i.e. if two or more workers share the same salary and it is the biggest in their department, then all of these workers are selected.
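
    The tie behaviour described above can be checked with a small sketch. It runs the same LEFT JOIN query through Python's built-in sqlite3 module, which is only a stand-in for MySQL here; the four rows and their names are made up for illustration, and an ORDER BY is added so the output order is fixed.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staff ("
    "workerID INTEGER PRIMARY KEY, name TEXT, department TEXT, salary INTEGER)"
)
# Hypothetical rows: two workers tie for the top salary in 'computer'.
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?, ?)",
    [
        (1, "ann", "computer", 700),
        (2, "bob", "computer", 700),
        (3, "cat", "computer", 400),
        (4, "dan", "physics", 300),
    ],
)

# Keep a row from m only if no row o in the same department has a bigger salary.
top_paid = conn.execute(
    """
    SELECT m.workerID, m.name, m.department, m.salary
    FROM staff m
        LEFT JOIN staff o
            ON m.department = o.department
            AND m.salary < o.salary
    WHERE o.salary IS NULL
    ORDER BY m.workerID
    """
).fetchall()

for row in top_paid:
    print(row)
```

    Both tied workers from 'computer' come back, together with the single 'physics' worker.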

  • 2021-01-15 06:56

    Try this:

    SELECT s.workerID, s.name, s.department, s.salary
    FROM staff s 
    INNER JOIN (SELECT s.department, MAX(s.salary) AS biggest 
                FROM staff s GROUP BY s.department
              ) AS B ON s.department = B.department AND s.salary = B.biggest;
    

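    OR (MySQL only: this relies on MySQL's non-standard GROUP BY handling, and the server is free to choose any row from each group, so the ORDER BY in the subquery does not guarantee that the top-paid row is returned):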

    SELECT s.workerID, s.name, s.department, s.salary
    FROM (SELECT s.workerID, s.name, s.department, s.salary 
          FROM staff s 
          ORDER BY s.department, s.salary DESC
        ) AS s 
    GROUP BY s.department;
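
    If exactly one row per department is wanted, one possible sketch is a correlated subquery with an explicit tie-break. This is a swapped-in technique, not the query above: breaking ties by the lowest workerID is an assumption, the rows are made up, and the query is run through Python's sqlite3 module as a stand-in for MySQL.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staff ("
    "workerID INTEGER PRIMARY KEY, name TEXT, department TEXT, salary INTEGER)"
)
# Hypothetical rows: workers 3 and 5 tie for the top salary in 'computer'.
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?, ?)",
    [
        (1, "ann", "computer", 400),
        (3, "bob", "computer", 700),
        (5, "cat", "computer", 700),
        (2, "dan", "electronics", 200),
        (9, "eve", "electronics", 800),
    ],
)

# For each row, look up the single "winner" of its department:
# highest salary first, lowest workerID as the tie-break (assumed rule).
one_per_department = conn.execute(
    """
    SELECT s.workerID, s.name, s.department, s.salary
    FROM staff s
    WHERE s.workerID = (
        SELECT o.workerID
        FROM staff o
        WHERE o.department = s.department
        ORDER BY o.salary DESC, o.workerID ASC
        LIMIT 1
    )
    ORDER BY s.department
    """
).fetchall()

for row in one_per_department:
    print(row)
```

    Because the tie-break is written into the query, the chosen row does not depend on how the server happens to pick a row for each group.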
    
  • 2021-01-15 06:59

    This is the usual case: GROUP BY with an aggregate function does not guarantee that the other selected columns come from the row that produced the aggregate value. There are many ways to handle this, and the usual practice is a sub-query and a join. But if the table is big, that can hurt performance, so another approach is to use a LEFT JOIN.

    So let's say we have this table:

    +----------+------+-------------+--------+
    | workerid | name | department  | salary |
    +----------+------+-------------+--------+
    |        1 | abc  | computer    |    400 |
    |        2 | cdf  | electronics |    200 |
    |        3 | gfd  | computer    |    400 |
    |        4 | wer  | physics     |    300 |
    |        5 | hgt  | computer    |    700 |
    |        6 | juy  | electronics |    100 |
    |        7 | wer  | physics     |    400 |
    |        8 | qwe  | computer    |    200 |
    |        9 | iop  | electronics |    800 |
    |       10 | kli  | physics     |    800 |
    |       11 | qsq  | computer    |    600 |
    |       12 | asd  | electronics |    300 |
    +----------+------+-------------+--------+
    

    So we can get the data with:

    select st.* from staff st
    left join staff st1 on st1.department = st.department
    and st.salary < st1.salary
    where 
    st1.workerid is null
    

    The above returns:

    +----------+------+-------------+--------+
    | workerid | name | department  | salary |
    +----------+------+-------------+--------+
    |        5 | hgt  | computer    |    700 |
    |        9 | iop  | electronics |    800 |
    |       10 | kli  | physics     |    800 |
    +----------+------+-------------+--------+
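
    As a quick cross-check, the sketch below loads the 12 sample rows into an in-memory SQLite database (a stand-in for MySQL, used here only so the example is self-contained) and compares the LEFT JOIN query with the sub-query-and-join approach. An ORDER BY is added to both queries so the row order is fixed.

```python
import sqlite3

# The sample table from this answer.
rows = [
    (1, "abc", "computer", 400),
    (2, "cdf", "electronics", 200),
    (3, "gfd", "computer", 400),
    (4, "wer", "physics", 300),
    (5, "hgt", "computer", 700),
    (6, "juy", "electronics", 100),
    (7, "wer", "physics", 400),
    (8, "qwe", "computer", 200),
    (9, "iop", "electronics", 800),
    (10, "kli", "physics", 800),
    (11, "qsq", "computer", 600),
    (12, "asd", "electronics", 300),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staff ("
    "workerid INTEGER PRIMARY KEY, name TEXT, department TEXT, salary INTEGER)"
)
conn.executemany("INSERT INTO staff VALUES (?, ?, ?, ?)", rows)

# Approach 1: LEFT JOIN, keep rows that have no better-paid colleague.
via_left_join = conn.execute(
    """
    SELECT st.* FROM staff st
    LEFT JOIN staff st1 ON st1.department = st.department
        AND st.salary < st1.salary
    WHERE st1.workerid IS NULL
    ORDER BY st.workerid
    """
).fetchall()

# Approach 2: join against the per-department maximum salary.
via_subquery = conn.execute(
    """
    SELECT st.* FROM staff st
    JOIN (
        SELECT department, MAX(salary) AS biggest
        FROM staff
        GROUP BY department
    ) t ON t.department = st.department AND t.biggest = st.salary
    ORDER BY st.workerid
    """
).fetchall()

for row in via_left_join:
    print(row)
print(via_left_join == via_subquery)
```

    On this data both queries return the same three rows shown in the result table above.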
    
  • 2021-01-15 07:10

    Explanation for what is going on:

    You are performing a GROUP BY on staff.department, but your SELECT list contains 2 non-grouping columns, staff.workerID and staff.name. In standard SQL this is an error; MySQL allows it, so query writers have to make sure that they handle such situations themselves.

    Reference: http://dev.mysql.com/doc/refman/5.0/en/group-by-handling.html

    In standard SQL, a query that includes a GROUP BY clause cannot refer to nonaggregated columns in the select list that are not named in the GROUP BY clause.

    MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause.

    The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate.

    Starting with MySQL 5.1 the non-standard feature can be disabled by setting the ONLY_FULL_GROUP_BY flag in sql_mode: http://dev.mysql.com/doc/refman/5.6/en/sql-mode.html#sqlmode_only_full_group_by

    How to fix:

    select staff.workerID, staff.name, staff.department, staff.salary
    from staff
    join (
      select staff.department, max(staff.salary) AS biggest
      from staff
      group by staff.department
    ) t
    on t.department = staff.department and t.biggest = staff.salary
    

    In the inner query, fetch each department and its highest salary using GROUP BY. Then, in the outer query, join those results with the main table on both department and salary, which gives you the desired rows.
