Get top n records for each group with Django queryset

ぃ、小莉子 提交于 2021-01-07 01:24:11

问题


I have a model like the following Table,

create table `mytable`
(
  `person` varchar(10),
  `groupname` int,
  `age` int
);

And I want to get the 2 oldest people from each group. The original SQL question and answers are here StackOverflow and One of the solutions that work is

SELECT
    person,
    groupname,
    age
FROM
(
    SELECT
        person,
        groupname,
        age,
        @rn := IF(@prev = groupname, @rn + 1, 1) AS rn,
        @prev := groupname
    FROM mytable
    JOIN (SELECT @prev := NULL, @rn := 0) AS vars
    ORDER BY groupname, age DESC, person
) AS T1
WHERE rn <= 2

You can check the SQL output here as well SQLFIDLE

I just want to know how can I implement this query in Django's views as queryset.


回答1:


Another SQL with similar output would have window function that annotates each row with row number within particular group name and then you would filter row numbers lower or equal 2 in HAVING clause.

At the moment of writing django does not support filtering based on window function result so you need to calculate row in the first query and filter People in the second query.

Following code is based on similar question but it implements limiting number of rows to be returned per group_name.

from django.db.models import F, When, Window
from django.db.models.functions import RowNumber

person_ids = {
    pk
    for pk, row_no_in_group in Person.objects.annotate(
        row_no_in_group=Window(
            expression=RowNumber(), 
            partition_by=[F('group_name')],
            order_by=['group_name', F('age').desc(), 'person']
        )
    ).values_list('id', 'row_no_in_group')
    if row_no_in_group <= 2
}
filtered_persons = Person.objects.filter(id__in=person_ids)

For following state of Person table

>>> Person.objects.order_by('group_name', '-age', 'person').values_list('group_name', 'age', 'person')
<QuerySet [(1, 19, 'Brian'), (1, 17, 'Brett'), (1, 14, 'Teresa'), (1, 13, 'Sydney'), (2, 20, 'Daniel'), (2, 18, 'Maureen'), (2, 14, 'Vincent'), (2, 12, 'Carlos'), (2, 11, 'Kathleen'), (2, 11, 'Sandra')]>

queries above return

>>> filtered_persons.order_by('group_name', '-age', 'person').values_list('group_name', 'age', 'person')
<QuerySet [(1, 19, 'Brian'), (1, 17, 'Brett'), (2, 20, 'Daniel'), (2, 18, 'Maureen')]>


来源:https://stackoverflow.com/questions/65324003/get-top-n-records-for-each-group-with-django-queryset

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!