Question
In my Django application, I'm trying to get a count of the Papers submitted by each Student, including students who have submitted no papers (represented as count=0).
models.py
from django.db import models


class Student(models.Model):
    idstudent = models.AutoField(primary_key=True)
    student_name = models.CharField(max_length=250, null=False, blank=False, verbose_name='Student Name')


class Paper(models.Model):
    idpaper = models.AutoField(primary_key=True)
    student = models.ForeignKey(Student, on_delete=models.PROTECT, null=False, blank=False)
Query Attempt 1: Returns only Students who have submitted Papers
papers = Paper.objects.order_by('submission_date')
result = papers.values('student', student_name=F('student__student_name')).annotate(count=Count('student')).distinct().order_by('-count')
print(result)
<QuerySet [{'idstudent': 1, 'student_name': '\nMichael Jordan\n', 'count': 4}, {'idstudent': 2, 'student_name': '\nSteve White\n', 'count': 2}, {'idstudent': 3, 'student_name': '\nHillary Clinton\n', 'count': 1}]>
Query Attempt 2: Returns Students who have submitted 0 Papers, but the Count for every other Student is 1
result = (
    Student.objects.values('pk', student_name=F('student_name'))
    .annotate(
        count=Count(
            'pk',
            filter=Q(pk__in=Paper.objects.values('student')),
        )
    )
    .order_by('-count')
)
print(result)
<QuerySet [{'idstudent': 1, 'student_name': '\nMichael Jordan\n', 'count': 1}, {'idstudent': 2, 'student_name': '\nSteve White\n', 'count': 1}, {'idstudent': 3, 'student_name': '\nHillary Clinton\n', 'count': 1}, {'idstudent': 4, 'student_name': '\nDoug Funny\n', 'count': 0}, {'idstudent': 5, 'student_name': '\nSkeeter Valentine\n', 'count': 0}]>
Along the same lines as Attempt 2, I also tried the following using Sum(Case(...)), which yielded the same result. I noticed that the raw SQL for Attempt 2 actually uses Case(When(...)), but it seems to count only whether Student.pk is present in the Paper.objects.values "list", without accounting for how many times it is present.
result = Student.objects.values('pk', student_name=F('student_name')).annotate(
    count=Sum(
        Case(
            When(pk__in=Paper.objects.values('student'), then=1),
            default=0, output_field=IntegerField()
        )
    )
)
<QuerySet [{'idstudent': 1, 'student_name': '\nMichael Jordan\n', 'count': 1}, {'idstudent': 2, 'student_name': '\nSteve White\n', 'count': 1}, {'idstudent': 3, 'student_name': '\nHillary Clinton\n', 'count': 1}, {'idstudent': 4, 'student_name': '\nDoug Funny\n', 'count': 0}, {'idstudent': 5, 'student_name': '\nSkeeter Valentine\n', 'count': 0}]>
How might I adjust my query to include students who have submitted 0 papers while also maintaining the correct counts for students who have?
Answer 1:
> Along the same lines as Attempt 2, I also tried the following using Sum(Case( which yielded the same result, as I recognized that the Attempt 2 raw SQL actually utilizes Case(When, but seems to only count when Student.pk is present in the Paper.objects.values "list" (while not accounting for how many times it is present).
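Unless I'm misunderstanding the question, your Attempt 2 is behaving as expected. The query runs over Student with no join to Paper, so each student is a single row, and the filter only checks whether that student's pk appears in the Paper.objects.values "list". The count can therefore only ever be 0 or 1, so it is normal for it to act like this.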
Have you tried the simple version:
Student.objects.annotate(num_papers=Count('paper'))
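To see why this one-liner also returns students with no papers, it can help to look at the kind of SQL it stands for. The sketch below uses the standard-library sqlite3 module rather than Django itself. The table names (student, paper), the student_id column, and the sample rows are assumptions chosen to mirror the question's data, and the query only approximates what the ORM emits for Count('paper').

```python
import sqlite3

# In-memory database with hypothetical tables that mirror the two models.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE student (idstudent INTEGER PRIMARY KEY, student_name TEXT NOT NULL);
    CREATE TABLE paper (
        idpaper INTEGER PRIMARY KEY,
        student_id INTEGER NOT NULL REFERENCES student (idstudent)
    );
    """
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [
        (1, "Michael Jordan"),
        (2, "Steve White"),
        (3, "Hillary Clinton"),
        (4, "Doug Funny"),
        (5, "Skeeter Valentine"),
    ],
)
# Four papers for student 1, two for student 2, one for student 3, none for 4 and 5.
conn.executemany(
    "INSERT INTO paper (student_id) VALUES (?)",
    [(1,), (1,), (1,), (1,), (2,), (2,), (3,)],
)

# The LEFT OUTER JOIN keeps every student; COUNT(p.idpaper) skips the NULLs
# that the join produces for students without papers, so they get 0.
rows = conn.execute(
    """
    SELECT s.idstudent, s.student_name, COUNT(p.idpaper) AS num_papers
    FROM student AS s
    LEFT OUTER JOIN paper AS p ON p.student_id = s.idstudent
    GROUP BY s.idstudent, s.student_name
    ORDER BY num_papers DESC, s.idstudent
    """
).fetchall()

for row in rows:
    print(row)
```

Starting from Student rather than Paper is what keeps the zero-paper students in the result. In the ORM, adding .order_by('-num_papers') would give the same descending order.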
If you want to apply an additional filter to the count, my suggestion is to use a subquery. Here is an example:
from django.db.models import Count, OuterRef, Subquery
from django.db.models.functions import Coalesce

Student.objects.annotate(
    num_papers=Coalesce(
        Subquery(
            Paper.objects.filter(student=OuterRef('pk'))
            # The first .values() call defines our GROUP BY clause.
            # It's important to filter on every field listed here,
            # otherwise you will get more than one row per group!
            # In this example we group only by student,
            # and we have already filtered by student.
            # Any extra filtering you want should be done here too (before the grouping).
            .values('student')
            # Here we say: count how many rows we have per group.
            .annotate(cnt=Count('pk'))
            # Here we say: return only the count.
            .values('cnt')
        ),
        # The subquery yields no row (NULL) for a student with no papers;
        # Coalesce turns that NULL into 0.
        0,
    )
)
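One detail worth checking with the subquery pattern: a grouped, correlated subquery produces no row at all for a student with no papers, so its value is NULL rather than 0. The sketch below shows this in plain SQL through the standard-library sqlite3 module. The table and column names and the sample rows are assumptions for illustration, not the SQL Django actually generates. In Django terms, the usual remedy is to wrap the Subquery in Coalesce(..., 0) from django.db.models.functions.

```python
import sqlite3

# In-memory database with hypothetical tables that mirror the two models.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE student (idstudent INTEGER PRIMARY KEY, student_name TEXT NOT NULL);
    CREATE TABLE paper (
        idpaper INTEGER PRIMARY KEY,
        student_id INTEGER NOT NULL REFERENCES student (idstudent)
    );
    INSERT INTO student VALUES (1, 'Michael Jordan'), (2, 'Steve White'), (4, 'Doug Funny');
    INSERT INTO paper (student_id) VALUES (1), (1), (1), (1), (2), (2);
    """
)

# Correlated, grouped subquery: at most one count per outer student row.
per_student_count = """
    SELECT COUNT(p.idpaper)
    FROM paper AS p
    WHERE p.student_id = s.idstudent
    GROUP BY p.student_id
"""

# raw_count is the bare subquery; num_papers wraps it in COALESCE.
rows = conn.execute(
    f"""
    SELECT s.idstudent,
           ({per_student_count}) AS raw_count,
           COALESCE(({per_student_count}), 0) AS num_papers
    FROM student AS s
    ORDER BY s.idstudent
    """
).fetchall()

for row in rows:
    print(row)
```

The student with no papers comes back with None in the raw column and 0 in the coalesced one, which is the behaviour the question asks for.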
Source: https://stackoverflow.com/questions/62317457/django-aggregate-query-include-zero-count