Order queryset by alternating value

余生分开走 2020-12-09 05:27

I have the following model:

class Entry(models.Model):
    name = models.CharField(max_length=255)
    client = models.CharField(max_length=255)
3 Answers
  • 2020-12-09 06:06

    That's not the most performant way, but it works:

    from itertools import chain, zip_longest
    from django.db.models import Case, When

    # Collect the primary keys of each client's entries (one query per client).
    grouped_pks = []
    for client in Entry.objects.values_list('client', flat=True).distinct():
        grouped_pks.append(
            Entry.objects.filter(client=client).values_list('pk', flat=True)
        )

    # Interleave the groups round-robin; zip_longest pads shorter groups with None.
    alternated_pks = [
        pk for pk in chain.from_iterable(zip_longest(*grouped_pks))
        if pk is not None
    ]
    # Map each pk to its position so the database can order by it.
    alternated_pks_order = Case(
        *[
            When(pk=pk, then=position)
            for position, pk in enumerate(alternated_pks)
        ]
    )

    entries = Entry.objects.filter(pk__in=alternated_pks).order_by(alternated_pks_order)
    for entry in entries:
        print('id: {} - client: {}'.format(entry.id, entry.client))
    

    Expected output:

    id: 8901 - client: google
    id: 8890 - client: facebook
    id: 8884 - client: google
    id: 8894 - client: facebook
    id: 8748 - client: google
    id: 8891 - client: facebook
    id: 8906 - client: google
    id: 8909 - client: facebook
    id: 8888 - client: google
    id: 8895 - client: facebook
    id: 8919 - client: google
    id: 8910 - client: facebook
    id: 8878 - client: google
    id: 8896 - client: facebook
    id: 8916 - client: google
    id: 8902 - client: facebook
    id: 8917 - client: google
    id: 8885 - client: facebook
    id: 8918 - client: google
    id: 8903 - client: facebook
    id: 8920 - client: google
    id: 8886 - client: facebook
    id: 8904 - client: facebook
    id: 8905 - client: facebook
    id: 8887 - client: facebook
    id: 8911 - client: facebook
    id: 8897 - client: facebook
    id: 8912 - client: facebook
    id: 8898 - client: facebook
    id: 8899 - client: facebook
    id: 8914 - client: facebook
    id: 8900 - client: facebook
    id: 8915 - client: facebook
    

    This is Python 3 code. On Python 2, use itertools.izip_longest instead of zip_longest.

    A nice property of this code is that we still work with a QuerySet, so managers, pagination, and other QuerySet features still work.
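
    The interleaving step can be tried without a database. The following is only a sketch of the round-robin idea; the sample primary keys and the helper name interleave are made up for illustration and are not part of the answer's code.

```python
from itertools import chain, zip_longest

# Hypothetical sample data: primary keys grouped per client.
grouped_pks = {
    "google": [1, 2, 3],
    "facebook": [10, 11, 12, 13, 14],
}

_MISSING = object()  # sentinel, so that a falsy pk such as 0 is not dropped


def interleave(groups):
    """Round-robin merge: take one item from each group in turn."""
    rounds = zip_longest(*groups, fillvalue=_MISSING)
    return [pk for pk in chain.from_iterable(rounds) if pk is not _MISSING]


alternated_pks = interleave(grouped_pks.values())
print(alternated_pks)
# [1, 10, 2, 11, 3, 12, 13, 14]
```

    Once the shorter group runs out, the remaining items of the longer group simply follow one another, which matches the tail of the output above.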

  • 2020-12-09 06:14

    Maybe the following will work, using the numpy module:

    import numpy

    queryset = Entry.objects.all()

    # One Entry per distinct client (distinct() with field names works on PostgreSQL only)
    queryset_client_entries = Entry.objects.distinct('client')

    # Repeat those entries cyclically until the list is as long as the full queryset
    new_list = list(numpy.resize(queryset_client_entries, queryset.count()))
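
    To see what the resize step produces, here is a standard-library sketch of the same cyclic repetition. The client names and the count are placeholder values, and itertools is swapped in for numpy only so the example runs without extra packages.

```python
from itertools import cycle, islice

# Hypothetical stand-ins for the two querysets in the snippet above.
distinct_clients = ["facebook", "google"]  # one item per distinct client
total_entries = 5                          # what queryset.count() would return

# Repeat the short sequence until it has total_entries items, which is
# what numpy.resize does when the target is larger than the input.
pattern = list(islice(cycle(distinct_clients), total_entries))
print(pattern)
# ['facebook', 'google', 'facebook', 'google', 'facebook']
```

    Note that this yields a repeating pattern built from one representative per client. It does not reorder the full set of entries, so it may not be what the question asks for.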
    
  • 2020-12-09 06:29

    Since you use Postgres, you can use its window functions, which perform a calculation across a set of table rows that are related to the current row. You are also on Django 2.x, which supports window functions (see the Django docs) and so lets you add an OVER clause to querysets.

    Your use case can be solved with a single ORM query:

    from django.db.models.expressions import Window
    from django.db.models.functions import RowNumber
    from django.db.models import F
    
    results = Entry.objects.annotate(row_number=Window(
        expression=RowNumber(),
        partition_by=[F('client')],
        order_by=F('created').desc())
    ).order_by('row_number', 'client')
    
    for result in results:
        print('Id: {} - client: {} - row_number {}'.format(result.id, result.client, result.row_number))
    

    Output:

    Id: 12 - client: facebook - row_number 1
    Id: 13 - client: google - row_number 1
    Id: 11 - client: facebook - row_number 2
    Id: 8 - client: google - row_number 2
    Id: 10 - client: facebook - row_number 3
    Id: 5 - client: google - row_number 3
    Id: 9 - client: facebook - row_number 4
    Id: 3 - client: google - row_number 4
    Id: 7 - client: facebook - row_number 5
    Id: 2 - client: google - row_number 5
    Id: 6 - client: facebook - row_number 6
    Id: 1 - client: google - row_number 6
    Id: 4 - client: facebook - row_number 7
    

    The raw SQL looks like:

    SELECT 
    "orm_entry"."id",
    "orm_entry"."name",
    "orm_entry"."client",
    "orm_entry"."created",
    ROW_NUMBER() OVER (PARTITION BY "orm_entry"."client" ORDER BY "orm_entry"."created" DESC) AS "row_number" 
    FROM "orm_entry" 
    ORDER BY "row_number" ASC, "orm_entry"."client" ASC
    

    A window function is declared like an aggregate function followed by an OVER clause, which states exactly how the rows are grouped. The group of rows to which the window function is applied is called a "partition".

    We partitioned the rows by the 'client' field, so in our example there are two partitions: the first contains all the 'facebook' entries and the second all the 'google' entries. A partition resembles a normal aggregate group in that it is simply a set of rows considered "equal" by some criterion. Unlike GROUP BY, though, the rows are not collapsed: the window function is evaluated over the partition and returns a value for every row.

    Here we use the row_number window function, which simply returns the index of the current row within its partition, starting from 1. Ordering by that index first and by client second, order_by('row_number', 'client'), produces the alternating output.
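
    If it helps to see the mechanics outside SQL, the same partition, number, and sort steps can be emulated in plain Python. This is only an illustration of the logic: the rows below are invented sample data, and the real work is done by the database.

```python
from collections import defaultdict
from datetime import date

# Hypothetical rows: (id, client, created). The values are made up.
rows = [
    (1, "google", date(2020, 1, 1)),
    (2, "facebook", date(2020, 1, 2)),
    (3, "google", date(2020, 1, 3)),
    (4, "facebook", date(2020, 1, 4)),
    (5, "facebook", date(2020, 1, 5)),
]

# PARTITION BY client
partitions = defaultdict(list)
for row in rows:
    partitions[row[1]].append(row)

# ROW_NUMBER() OVER (... ORDER BY created DESC): number rows from 1 per partition
numbered = []
for client, members in partitions.items():
    members.sort(key=lambda r: r[2], reverse=True)
    for row_number, (pk, _, _) in enumerate(members, start=1):
        numbered.append((row_number, client, pk))

# ORDER BY row_number ASC, client ASC
numbered.sort(key=lambda item: (item[0], item[1]))
for row_number, client, pk in numbered:
    print('Id: {} - client: {} - row_number {}'.format(pk, client, row_number))
```

    Every row keeps its identity and only gains a row_number, which is why sorting by (row_number, client) alternates the clients.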

    Additional information:

    If you want to achieve an order like this:

    'facebook','facebook', 'google','google','facebook','facebook','google','google'

    or

    'facebook','facebook','facebook','google','google','google','facebook', 'facebook','facebook'

    you only need a small arithmetic change to the previous query:

    GROUP_SIZE = 2
    results = Entry.objects.annotate(row_number=Window(
        expression=RowNumber(),
        partition_by=[F('client')],
        order_by=F('created').desc())
    ).annotate(row_number=(F('row_number') - 1)/GROUP_SIZE + 1).order_by('row_number', 'client')
    
    for result in results:
        print('Id: {} - client: {} - row_number {}'.format(result.id, result.client, result.row_number))
    

    Output:

    Id: 12 - client: facebook - row_number 1
    Id: 11 - client: facebook - row_number 1
    Id: 8 - client: google - row_number 1
    Id: 13 - client: google - row_number 1
    Id: 10 - client: facebook - row_number 2
    Id: 9 - client: facebook - row_number 2
    Id: 3 - client: google - row_number 2
    Id: 5 - client: google - row_number 2
    Id: 7 - client: facebook - row_number 3
    Id: 6 - client: facebook - row_number 3
    Id: 1 - client: google - row_number 3
    Id: 2 - client: google - row_number 3
    Id: 4 - client: facebook - row_number 4
    

    The GROUP_SIZE constant defines how many items each alternating group contains. The expression relies on integer division, which is what PostgreSQL does when both operands are integers.
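
    The arithmetic can be checked in isolation. In this sketch the helper name group_number is hypothetical, and Python's // operator stands in for the integer division that the database performs.

```python
GROUP_SIZE = 2


def group_number(row_number, group_size=GROUP_SIZE):
    """Map a 1-based row number to a 1-based group number."""
    return (row_number - 1) // group_size + 1


print([group_number(n) for n in range(1, 8)])
# [1, 1, 2, 2, 3, 3, 4]
```

    Seven rows with GROUP_SIZE = 2 give the groups 1, 1, 2, 2, 3, 3, 4, the same sequence as the 'facebook' rows in the output above.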

    P.S.

    Thank you for asking this question because it helped me to better understand the Window Functions.

    Happy coding :)
