Question
I have a panel data set: that is, times, ids, and values. I would like to rank the rows by value within each date. I can achieve the sort very simply by running:
select * from tbl order by date, value
My problem is that, once the table is sorted this way, I don't know how to retrieve the row number within each group: for each date, I would like a column called ranking that runs from 1 to N.
Example:
Input:
Date, ID, Value
d1, id1, 2
d1, id2, 1
d2, id1, 10
d2, id2, 11
Output:
Date, ID, Value, Rank
d1, id2, 1, 1
d1, id1, 2, 2
d2, id1, 10, 1
d2, id2, 11, 2
Answer 1:
Absent window functions (MySQL has them only from version 8.0 onward), you can order tbl and use user variables to compute the rank over your partitions (date values) yourself:
-- Identifiers are backtick-quoted: in MySQL's default SQL mode "date" is a
-- string literal, and rank / partition are reserved words in recent versions.
SELECT `date`,                                   -- D) Desired columns
       id,
       value,
       `rank`
FROM (SELECT `date`,                             -- C) Rank by date
             id,
             value,
             CASE COALESCE(@partition, `date`)
               WHEN `date` THEN @rank := @rank + 1
               ELSE @rank := 1
             END AS `rank`,
             @partition := `date` AS dummy
      FROM (SELECT @rank := 0 AS `rank`,         -- A) User var init
                   @partition := NULL AS `partition`) dummy
           STRAIGHT_JOIN
           (SELECT `date`,                       -- B) Ordering query
                   id,
                   value
            FROM tbl
            ORDER BY `date`, value) tbl_ordered
     ) ranked;
Update
So, what is that query doing?
We are using user variables to "loop" through a sorted result set, incrementing or resetting a counter (@rank) depending upon which contiguous segment of the result set (tracked in @partition) we're in.
In query A we initialize two user variables. In query B we get the records of your table in the order we need: first by date and then by value. Joining A to B (the latter aliased tbl_ordered) yields a row set that looks something like this:
rank | partition | date | id  | value
---- + --------- + ---- + --- + -----
   0 | NULL      | d1   | id2 |     1
   0 | NULL      | d1   | id1 |     2
   0 | NULL      | d2   | id1 |    10
   0 | NULL      | d2   | id2 |    11
Remember, we don't really care about the columns dummy.rank and dummy.partition; they're just accidents of how we initialize the variables @rank and @partition.
In query C we loop through the derived table's records. What we're doing is more-or-less what the following pseudocode does:
rank      = 0
partition = nil
foreach row in fetch_rows(sorted_query):
    (date, id, value) = row
    if partition is nil or partition == date:
        rank += 1
    else:
        rank = 1
    partition = date
    stdout.write(date, id, value, rank, partition)
Finally, query D projects all columns from C except for the column holding @partition (which we named dummy and do not need to display).
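For anyone who wants to step through the logic outside the database, the pseudocode above can be sketched as runnable Python. This is only an illustration of the counter-and-reset idea, not a replacement for the query; the function name rank_within_groups and the in-memory rows list are hypothetical, with the data taken from the example in the question.

```python
def rank_within_groups(sorted_rows):
    """Yield (date, id, value, rank) for rows already sorted by (date, value).

    Mirrors the user-variable technique: `rank` plays the role of @rank and
    `partition` the role of @partition.
    """
    rank = 0
    partition = None
    for date, id_, value in sorted_rows:
        if partition is None or partition == date:
            rank += 1          # still inside the same date segment
        else:
            rank = 1           # a new date segment starts: reset the counter
        partition = date
        yield (date, id_, value, rank)


# Sample data from the question (hypothetical in-memory stand-in for tbl).
rows = [
    ("d1", "id1", 2),
    ("d1", "id2", 1),
    ("d2", "id1", 10),
    ("d2", "id2", 11),
]

# Equivalent of query B: ORDER BY date, value.
sorted_rows = sorted(rows, key=lambda r: (r[0], r[2]))

result = list(rank_within_groups(sorted_rows))
for row in result:
    print(row)
```

As in the SQL version, the result is only meaningful if the rows reach the loop already sorted by date and then by value.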
Answer 2:
I know this is an old question but here is a shorter answer:
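SELECT w.*,
       IF(@preDate = w.date,
          @rank := @rank + 1,                     -- same date as previous row: increment
          @rank := (@preDate := w.date) = w.date  -- new date: remember it; the comparison is true, so rank restarts at 1
       ) AS `rank`
FROM tbl w
JOIN (SELECT @preDate := '') a
ORDER BY w.date, w.value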
Answer 3:
Would this do the trick?
SELECT [Date], ID, Value,
       DENSE_RANK() OVER (PARTITION BY ID ORDER BY [Date]) AS [DenseRank],
       ROW_NUMBER() OVER (PARTITION BY ID ORDER BY [Date] DESC) AS RN
FROM SomeTable
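Note that the query above partitions by ID and orders by date, whereas the question asks for a rank within each date, ordered by value. A minimal sketch of that variant follows, assuming a database with window-function support (MySQL 8.0 or later). It runs through Python's sqlite3 module (SQLite 3.25 or later) purely so the example is self-contained; the table and column names come from the question, and the alias ranking is taken from the question's wording.

```python
import sqlite3

# Sample data from the question, loaded into an in-memory table named tbl.
rows = [
    ("d1", "id1", 2),
    ("d1", "id2", 1),
    ("d2", "id1", 10),
    ("d2", "id2", 11),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (date TEXT, id TEXT, value INTEGER)")
conn.executemany("INSERT INTO tbl VALUES (?, ?, ?)", rows)

# One partition per date; rows inside each partition are numbered by value.
ranked = conn.execute(
    """
    SELECT date, id, value,
           ROW_NUMBER() OVER (PARTITION BY date ORDER BY value) AS ranking
    FROM tbl
    ORDER BY date, value
    """
).fetchall()

for row in ranked:
    print(row)

conn.close()
```

ROW_NUMBER() numbers tied values arbitrarily; RANK() or DENSE_RANK() could be swapped in if ties should share a rank.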
Source: https://stackoverflow.com/questions/8394295/within-group-sorts-in-mysql