Trying to understand over() and partition by

前端 未结 2 771
被撕碎了的回忆
被撕碎了的回忆 2021-02-01 05:16

I am trying to get the over and partition by functionality wrapped around my head. Here is an example that I just do not understand.

Here is the data I have:

<         


        
相关标签:
2条回答
  • 2021-02-01 05:50

    Try ordering by order date, you'll see the results more easily

    select Row_Number() over(partition by orderdate order by orderdate asc) 
        as Rownumber, salesorderid, orderdate
    from test2
    order by orderdate;
    

    should give (i've added blank lines for clarity)

    ROWNUMBER     SALESORDERID       ORDERDATE
    1             43664              07/01/2001
    
    1             43911              08/01/2001
    2             43867              08/01/2001
    3             43895              08/01/2001
    4             43894              08/01/2001
    5             43877              08/01/2001
    
    1             44109              09/01/2001
    
    1             44285              10/01/2001
    
    1             44483              11/01/2001
    2             44501              11/01/2001
    

    You'll notice that the result is divided into 'partitions', each partition being the set of rows with identical orderdates. That is what 'partition by orderdate' means.

    Within a partition, the rows are ordered by orderdate, as per the second clause of '(partition by orderdate order by orderdate asc)'. That isn't very useful, as all rows within a partition are going to have the same orderdate. Because of that, the ordering of the rows within a partition is random. Try ordering by salesorderid within the partition by clause to have a more reproducable result.

    row_number() just returns the row's ordering within each partition

    0 讨论(0)
  • 2021-02-01 06:01

    The partition by orderdate means that you're only comparing records to other records with the same orderdate. For example, of the five records with orderdate = '08/01/2001', one will have row_number() = 1, one will have row_number() = 2, and so on.

    The order by orderdate asc means that, within a partition, row-numbers are to be assigned in order of orderdate. In your example that has no effect, because you're already partitioning by orderdate, so all records within a partition will have the same orderdate. (It would be like writing SELECT ... FROM t WHERE c = 6 ORDER BY c: all selected records have the same value of c, so the ORDER BY c does nothing.) So, within a partition, the assignment of row_number() is arbitrary: each row will have a different number, but there are no guarantees about which row will have which number.

    0 讨论(0)
提交回复
热议问题