Django: Distinct foreign keys

名媛妹妹 2021-01-17 10:02
from django.db import models

class Log(models.Model):
    project = models.ForeignKey(Project, on_delete=models.CASCADE)
    msg = models.CharField(...)
    date = models.DateField(...)

I want to select the four most recent Log entries where each

4 answers
  • 2021-01-17 10:27

    Actually, you can get the project_ids in SQL. Assuming that you want the unique project ids for the four projects with the latest log entries, the SQL would look like this:

    SELECT project_id, max(logs.date) as max_date
    FROM logs
    GROUP BY project_id
    ORDER BY max_date DESC LIMIT 4;
    

    Now, you actually want all of the log information. In PostgreSQL 8.4 and later you can use window functions, but they are not available in older versions or in every database, so I'll do it the more complex way:

    SELECT logs.*
    FROM logs JOIN (
        SELECT project_id, max(logs.date) as max_date
        FROM logs
        GROUP BY project_id
        ORDER BY max_date DESC LIMIT 4 ) as latest
    ON logs.project_id = latest.project_id
       AND logs.date = latest.max_date;
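
To try the join approach without a database server, the sketch below runs it against an in-memory SQLite database through Python's `sqlite3` module. The table layout (`logs` with `project_id`, `msg`, `date`), the sample rows, and the helper name `latest_log_per_project` are assumptions made up for illustration; only the shape of the query comes from the answer.

```python
import sqlite3

# Hypothetical sample data: (project_id, msg, date). Not from the original post.
ROWS = [
    (1, "p1 old", "2021-01-01"),
    (1, "p1 new", "2021-01-10"),
    (2, "p2 only", "2021-01-05"),
    (3, "p3 old", "2021-01-02"),
    (3, "p3 new", "2021-01-08"),
    (4, "p4 only", "2021-01-03"),
    (5, "p5 only", "2020-12-25"),
]


def latest_log_per_project(conn, limit=4):
    """Return the newest log row for each of the `limit` most recently active projects."""
    query = """
        SELECT logs.project_id, logs.msg, logs.date
        FROM logs
        JOIN (
            SELECT project_id, MAX(date) AS max_date
            FROM logs
            GROUP BY project_id
            ORDER BY max_date DESC
            LIMIT ?
        ) AS latest
          ON logs.project_id = latest.project_id
         AND logs.date = latest.max_date
        -- outer ORDER BY added in this sketch so the output order is deterministic
        ORDER BY logs.date DESC
    """
    return conn.execute(query, (limit,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (project_id INTEGER, msg TEXT, date TEXT)")
conn.executemany("INSERT INTO logs VALUES (?, ?, ?)", ROWS)
result = latest_log_per_project(conn)
```

With this data, `result` holds one row each for projects 1, 3, 2 and 4, and project 5 is cut off by the limit. If a project has two entries sharing its latest date, the join returns both of them.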
    

    Now, if you have access to windowing functions, it's a bit neater (I think anyway), and certainly faster to execute:

    SELECT * FROM (
       SELECT logs.field1, logs.field2, logs.field3, logs.date,
           rank() over ( partition by project_id
                         order by "date" DESC ) as dateorder
       FROM logs ) as logsort
    WHERE dateorder = 1
    ORDER BY logsort.date DESC LIMIT 4;
    

    OK, maybe it's not easier to understand, but take my word for it, it runs worlds faster on a large database.

    I'm not entirely sure how that translates to object syntax, though, or even if it does. Also, if you wanted to get other project data, you'd need to join against the projects table.
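
The window-function variant can be tried the same way. This is a minimal sketch that assumes the SQLite library behind Python's `sqlite3` module is version 3.25 or newer, which is when SQLite gained window functions. The table and rows are invented for the example, and the limit is 2 rather than 4 only to keep the data small.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (project_id INTEGER, msg TEXT, date TEXT)")
# Hypothetical sample rows, invented for this sketch.
conn.executemany(
    "INSERT INTO logs VALUES (?, ?, ?)",
    [
        (1, "a1", "2021-01-01"),
        (1, "a2", "2021-01-09"),
        (2, "b1", "2021-01-07"),
        (2, "b2", "2021-01-04"),
        (3, "c1", "2021-01-02"),
    ],
)

# rank() numbers each project's rows from newest (1) to oldest;
# the outer query keeps rank 1 and then takes the most recent projects.
WINDOW_QUERY = """
    SELECT project_id, msg, date
    FROM (
        SELECT project_id, msg, date,
               rank() OVER (PARTITION BY project_id ORDER BY date DESC) AS dateorder
        FROM logs
    ) AS logsort
    WHERE dateorder = 1
    ORDER BY date DESC
    LIMIT 2
"""
newest = conn.execute(WINDOW_QUERY).fetchall()
```

Here `newest` contains the latest entry of project 1 followed by the latest entry of project 2. Because `rank()` gives tied rows the same number, a project with two entries on its latest date would appear twice; `row_number()` would pick one of them instead.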

  • 2021-01-17 10:35

    You need two querysets. The good thing is it still results in a single trip to the database (though there is a subquery involved).

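    from django.db.models import Max

    # Project ids, ordered by each project's most recent log date.
    latest_project_ids = Log.objects.values_list(
        'project').annotate(latest=Max('date')).order_by(
        '-latest').values_list('project')

    # Every log entry of the four most recently active projects, newest first.
    # The subquery yields project ids, so filter on project, not on the log id.
    log_objects = Log.objects.filter(
         project__in=latest_project_ids[:4]).order_by('-date')


    This looks a bit convoluted, but it results in a compact query, roughly along these lines:

    SELECT "log"."id",
           "log"."project_id",
           "log"."msg",
           "log"."date"
    FROM "log"
    WHERE "log"."project_id" IN
        (SELECT U0."project_id"
         FROM "log" U0
         GROUP BY U0."project_id"
         ORDER BY MAX(U0."date") DESC
         LIMIT 4)
    ORDER BY "log"."date" DESC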
    
  • 2021-01-17 10:36

    I know this is an old post, but in Django 2.0, I think you could just use:

    Log.objects.values('project').distinct().order_by('project')[:4]
    
  • 2021-01-17 10:47

    Queries don't work like that - either in Django's ORM or in the underlying SQL. If you want to get unique IDs, you can only query for the ID. So you'll need to do two queries to get the actual Log entries. Something like:

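    # These are project ids, so the second query filters on project_id, not on the log id.
    project_ids = Log.objects.order_by('-date').values_list('project_id', flat=True).distinct()[:4]
    entries = Log.objects.filter(project_id__in=project_ids)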
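
One caveat when `order_by('-date')` is combined with `distinct()`: Django adds the fields named in `order_by()` to the selected columns, so `DISTINCT` applies to (project_id, date) pairs rather than to project ids alone, and the same project can show up more than once. The sketch below reproduces the effect in plain SQL with Python's `sqlite3` module. The table and rows are assumptions for illustration, and the exact SQL Django emits may differ by version and backend.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (project_id INTEGER, date TEXT)")
# Hypothetical rows: project 1 has the three newest entries.
conn.executemany(
    "INSERT INTO logs VALUES (?, ?)",
    [
        (1, "2021-01-09"),
        (1, "2021-01-08"),
        (1, "2021-01-07"),
        (2, "2021-01-06"),
        (3, "2021-01-05"),
    ],
)

# DISTINCT over (project_id, date): every pair is unique, so project 1 repeats.
with_date = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT project_id, date FROM logs ORDER BY date DESC LIMIT 3"
    )
]

# Grouping by project and ordering by each project's newest date avoids duplicates.
grouped = [
    row[0]
    for row in conn.execute(
        "SELECT project_id FROM logs GROUP BY project_id "
        "ORDER BY MAX(date) DESC LIMIT 3"
    )
]
```

With this data `with_date` lists project 1 three times, while `grouped` lists projects 1, 2 and 3. In ORM terms the grouped form corresponds to annotating each project with `Max('date')` and ordering by that annotation.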
    