Oracle Analytic function for min value in grouping

遥遥无期 2021-01-12 08:06

I'm new to working with analytic functions.

DEPT EMP   SALARY
---- ----- ------
  10 MARY  100000
  10 JOHN  200000
  10 SCOTT 300000
  20 BOB   100000
  20 BET         


        
4 Answers
  • 2021-01-12 08:31

    You can use the RANK() analytic function. For example, this query tells you where each employee ranks within their department by salary. Because the ordering is ascending, rank 1 is the lowest salary:

    SELECT
      dept,
      emp,
      salary,
      (RANK() OVER (PARTITION BY dept ORDER BY salary)) salary_rank_within_dept
    FROM EMPLOYEES
    

    You could then query from this where salary_rank_within_dept = 1:

    SELECT * FROM
      (
        SELECT
          dept,
          emp,
          salary,
          (RANK() OVER (PARTITION BY dept ORDER BY salary)) salary_rank_within_dept
        FROM EMPLOYEES
      )
    WHERE salary_rank_within_dept = 1
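
To make the ranking rule concrete, here is a minimal Python sketch of what RANK() OVER (PARTITION BY dept ORDER BY salary) computes. It is an illustration only, not how Oracle evaluates the query. It uses the four complete rows from the question's sample data and assumes no NULL salaries; the function names are made up for this example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Row = Tuple[int, str, int]  # (dept, emp, salary)

# The four complete rows from the question's sample data.
ROWS: List[Row] = [
    (10, "MARY", 100000),
    (10, "JOHN", 200000),
    (10, "SCOTT", 300000),
    (20, "BOB", 100000),
]


def rank_within_dept(rows: List[Row]) -> List[Tuple[int, str, int, int]]:
    """Mimic RANK() OVER (PARTITION BY dept ORDER BY salary)."""
    # Partition: collect every row of each department.
    by_dept: Dict[int, List[Row]] = defaultdict(list)
    for row in rows:
        by_dept[row[0]].append(row)

    ranked = []
    for dept in sorted(by_dept):
        # Order each partition by salary, ascending.
        ordered = sorted(by_dept[dept], key=lambda r: r[2])
        rank = 0
        prev_salary = None
        for position, (d, emp, salary) in enumerate(ordered, start=1):
            if salary != prev_salary:
                # A new salary value takes its position as rank;
                # tied rows keep the earlier rank, which leaves gaps.
                rank = position
                prev_salary = salary
            ranked.append((d, emp, salary, rank))
    return ranked


def rank_one(rows: List[Row]) -> List[Row]:
    """Equivalent of the outer WHERE salary_rank_within_dept = 1."""
    return [(d, e, s) for d, e, s, rank in rank_within_dept(rows) if rank == 1]
```

Calling rank_one(ROWS) keeps MARY for department 10 and BOB for department 20. If two employees shared the lowest salary, both would get rank 1 and both would be returned.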
    
  • 2021-01-12 08:31
    select e2.dept, e2.emp, e2.salary
    from employee e2
    -- correlate on dept so the minimum is taken per department
    where e2.salary = (select min(e1.salary)
                       from employee e1
                       where e1.dept = e2.dept)
    
  • 2021-01-12 08:33

    I think that the Rank() function is not the way to go with this, for two reasons.

    Firstly, it is probably less efficient than a Min()-based method.

    To compute a rank, the query has to maintain an ordered list of all salaries per department as it scans the data, and the rank is then assigned by re-reading this list. Unless an index can supply the rows already in order, no rank can be assigned until the last row has been read, and maintaining the list is expensive.

    So the performance of the Rank() function depends on the total number of rows scanned, and if there are enough of them that the sort spills to disk, performance will collapse.

    The following query is probably more efficient:

    select dept,
           emp,
           salary
    from
           (
           SELECT dept, 
                  emp,
                  salary,
                  Min(salary) Over (Partition By dept) min_salary
           FROM   mytable
           )
    where salary = min_salary
    /
    

    This method only requires that the query maintain a single value per department of the minimum value encountered so far. If a new minimum is encountered then the existing value is modified, otherwise the new value is discarded. The total number of elements that have to be held in memory is related to the number of departments, not the number of rows scanned.
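
The running-minimum idea described above can be sketched in a few lines of Python. This illustrates the bookkeeping only and makes no claim about Oracle's actual execution plan. The data is the four complete rows from the question; the function names are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Row = Tuple[int, str, int]  # (dept, emp, salary)

# The four complete rows from the question's sample data.
ROWS: List[Row] = [
    (10, "MARY", 100000),
    (10, "JOHN", 200000),
    (10, "SCOTT", 300000),
    (20, "BOB", 100000),
]


def min_salary_by_dept(rows: Iterable[Row]) -> Dict[int, int]:
    """Single pass; the only state kept is one number per department."""
    lowest: Dict[int, int] = {}
    for dept, _emp, salary in rows:
        # Replace the stored value only when a new minimum shows up;
        # otherwise the incoming salary is simply discarded.
        if dept not in lowest or salary < lowest[dept]:
            lowest[dept] = salary
    return lowest


def lowest_paid(rows: List[Row]) -> List[Row]:
    """Equivalent of the outer WHERE salary = min_salary filter."""
    lowest = min_salary_by_dept(rows)
    return [row for row in rows if row[2] == lowest[row[0]]]
```

Note that the dictionary grows with the number of departments, not the number of rows, although returning the matching rows still needs a second look at the data, which is what lowest_paid does.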

    It could be that Oracle has a code path to recognise that the Rank does not really need to be computed in this case, but I wouldn't bet on it.

    The second reason for disliking Rank() is that it answers the wrong question. The question is not "Which records have the salary that ranks first when the salaries in each department are ordered ascending?" but "Which records have the minimum salary in their department?" That makes a big difference to me, at least.

  • 2021-01-12 08:48

    I think you were pretty close with your original query. The following runs and matches your test case:

    SELECT dept, 
      MIN(emp) KEEP (DENSE_RANK FIRST ORDER BY salary, ROWID) AS emp,
      MIN(salary) KEEP (DENSE_RANK FIRST ORDER BY salary, ROWID) AS salary
    FROM mytable
    GROUP BY dept
    

    In contrast to the RANK() solutions, this one guarantees at most one row per department. But that hints at a problem: what happens in a department where there are two employees on the lowest salary? The RANK() solutions will return both employees -- more than one row for the department. This answer will pick one arbitrarily and make sure there's only one for the department.
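
The difference in tie handling can be seen in a small Python sketch. The data is hypothetical: ALICE is invented here to create a tie in department 10, and the list index stands in for ROWID as the tie-breaker. Neither function is Oracle's implementation; they only mimic the two result shapes.

```python
from typing import Dict, List, Tuple

Row = Tuple[int, str, int]  # (dept, emp, salary)

# Hypothetical data: ALICE is invented to tie with MARY in department 10.
ROWS: List[Row] = [
    (10, "MARY", 100000),
    (10, "ALICE", 100000),
    (10, "JOHN", 200000),
    (20, "BOB", 100000),
]


def all_lowest(rows: List[Row]) -> List[Row]:
    """RANK() = 1 behaviour: every row tied on the minimum is returned."""
    lowest: Dict[int, int] = {}
    for dept, _emp, salary in rows:
        lowest[dept] = min(salary, lowest.get(dept, salary))
    return [row for row in rows if row[2] == lowest[row[0]]]


def one_lowest(rows: List[Row]) -> List[Row]:
    """KEEP (DENSE_RANK FIRST ORDER BY salary, ROWID) behaviour:
    exactly one row per department, ties broken by position."""
    best: Dict[int, Tuple[int, int]] = {}  # dept -> (salary, index)
    for index, (dept, _emp, salary) in enumerate(rows):
        key = (salary, index)  # index plays the role of ROWID
        if dept not in best or key < best[dept]:
            best[dept] = key
    return [rows[best[dept][1]] for dept in sorted(best)]
```

With this data, all_lowest(ROWS) returns both MARY and ALICE for department 10, while one_lowest(ROWS) returns only MARY because she comes first in the list.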
