I have the following models in Django (simplified for brevity):
class DistinctWord(models.Model):
    ...

class Word(models.Model):
    distinct_word = models.F
How about (given your queryset of uw): [obj.words.first() for obj in uw]
Let:
uw # be a given queryset of UserWord's
dw # be a queryset of DistinctWords (will be derived from `uw`)
w # be a queryset of Words needed (will be derived from `dw`)
Each UserWord has a DistinctWord, and each DistinctWord has many Words (loosely denoted as uw>dw<w).
Here is my answer:
from django.db.models import Min

dw_id = uw.values_list('distinct_word_id', flat=True)  # 1: get dw ids from uw
dw = DistinctWord.objects.filter(id__in=dw_id)  # 2: get dw's
w_first_id = dw.annotate(first_word=Min('words')).values_list('first_word', flat=True)  # 3: find id of first word
w = Word.objects.filter(id__in=w_first_id)  # 4: get first words
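In summary: lines 1 and 2 get dw and should be just 1 trip to the database. Line 3 uses annotate followed by values_list to find the id of the first related Word (the one with the lowest id). Line 4 brings the actual Word objects from the ids generated in the previous step. Lines 3 and 4 should be another trip to the database, since annotate is not a terminal statement.
Thus at most 2 trips to the database (not tested). Because querysets are lazy and a queryset passed to an __in lookup is compiled into a subquery, the whole chain may well collapse into a single query that runs only when w is evaluated.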
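To make the four steps concrete, here is a minimal sketch that uses Python's built-in sqlite3 module instead of the Django ORM. The table and column names (user_word, distinct_word, word, text) and the sample rows are assumptions made for illustration, and the SQL is only an approximation of what the ORM might generate for w, not its actual output.

```python
import sqlite3

# In-memory database with a hypothetical schema mirroring uw > dw < w.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE distinct_word (id INTEGER PRIMARY KEY);
    CREATE TABLE word (
        id INTEGER PRIMARY KEY,
        distinct_word_id INTEGER,
        text TEXT
    );
    CREATE TABLE user_word (
        id INTEGER PRIMARY KEY,
        distinct_word_id INTEGER
    );
    INSERT INTO distinct_word (id) VALUES (1), (2), (3);
    INSERT INTO word (id, distinct_word_id, text) VALUES
        (10, 1, 'run'), (11, 1, 'runs'),
        (12, 2, 'walk'), (13, 2, 'walked'),
        (14, 3, 'jump');
    -- Only distinct words 1 and 2 are referenced by a user word.
    INSERT INTO user_word (id, distinct_word_id) VALUES (100, 1), (101, 2);
""")

# Steps 1-4 expressed as one statement with nested subqueries:
# innermost = dw ids from uw, middle = MIN(word id) per dw, outer = the words.
first_words = conn.execute("""
    SELECT id, distinct_word_id, text
    FROM word
    WHERE id IN (
        SELECT MIN(w.id)
        FROM distinct_word dw
        LEFT JOIN word w ON w.distinct_word_id = dw.id
        WHERE dw.id IN (SELECT distinct_word_id FROM user_word)
        GROUP BY dw.id
    )
    ORDER BY id
""").fetchall()

print(first_words)
```

With this sample data the query returns one row per distinct word referenced by a user word, namely the word with the lowest id in each group.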
You can do this using the Subquery API:
from django.db.models.expressions import Subquery, OuterRef

first_word = Word.objects.filter(
    distinct_word=OuterRef('distinct_word')
).order_by('pk').values('pk')[:1]

UserWord.objects.filter(
    # whatever filters...
).annotate(
    first_word=Subquery(first_word)
)
This will result in SQL that looks something like:
SELECT user_word.*,
       (SELECT word.id
        FROM word
        WHERE word.distinct_word_id = user_word.distinct_word_id
        ORDER BY word.id
        LIMIT 1
       ) AS first_word
FROM user_word
WHERE ...
This will probably not perform as well as a JOIN with a DISTINCT ON in PostgreSQL, and may not perform as well as a JOIN with a GROUP BY, since the subquery may need to be executed once for each row.
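As a rough illustration of the two query shapes being compared, the sketch below runs a correlated subquery and a JOIN with a GROUP BY against an in-memory SQLite database and checks that they agree. The schema and sample rows are assumptions for illustration only; it says nothing about relative performance, which depends on the database and its planner.

```python
import sqlite3

# Hypothetical tables named after the SQL above; sample data is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE word (id INTEGER PRIMARY KEY, distinct_word_id INTEGER);
    CREATE TABLE user_word (id INTEGER PRIMARY KEY, distinct_word_id INTEGER);
    INSERT INTO word (id, distinct_word_id) VALUES
        (10, 1), (11, 1), (12, 2), (13, 2);
    -- Distinct word 3 has no words, so its first_word should be NULL.
    INSERT INTO user_word (id, distinct_word_id) VALUES
        (100, 1), (101, 2), (102, 3);
""")

# Shape 1: correlated subquery, evaluated per user_word row.
correlated = conn.execute("""
    SELECT uw.id,
           (SELECT w.id
            FROM word w
            WHERE w.distinct_word_id = uw.distinct_word_id
            ORDER BY w.id
            LIMIT 1) AS first_word
    FROM user_word uw
    ORDER BY uw.id
""").fetchall()

# Shape 2: JOIN with GROUP BY, taking the lowest word id per user_word.
grouped = conn.execute("""
    SELECT uw.id, MIN(w.id) AS first_word
    FROM user_word uw
    LEFT JOIN word w ON w.distinct_word_id = uw.distinct_word_id
    GROUP BY uw.id
    ORDER BY uw.id
""").fetchall()

print(correlated)
print(grouped)
```

Both queries yield the same rows here, including a NULL first_word for the user word whose distinct word has no related words.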