问题
This is a continuation question from:
Django queryset get distinct column values with respect to other column
My Problem:
Using aggregate count in Django counts the wrong thing in the queryset, or as far as I can see something that is not even in my queryset.
What I did I used:
queryset.order_by('col1', 'col2').distinct('col1', 'col2').values('col2')
to get the values of col2
of a model where all the rows have a distinct pair of values in (col1, col2)
. There is an example in the link above. I printed my queryset and it looks good, I have
[{'col2': value1}, ... , {'col2': value1},{'col2': value2}, ..., {'col2': value2},...]
I now want to count how much each value appears in the queryset I got. I do this using aggregation count. I have:
a = {'count': Count(F(col2), distinct=False)}
queryset.annotate(**a)
I tried this with ``distinct=True` as well but no luck
I would expect to get [{col2:value1, count: num1}, {col2:value2, count: num2} ...]
.
Instead I get [{col2: value1, count: num11}, {col2: value1, count: num12}, ... ,{col2: value1, count: num_1n}, {col2: value2, count: num21}, ... ,{col2: value1, count: num_2n}, ...]
.
Where as far as I can tell num11, ...., num_1n are the amount of lines value1 existed in col2 with any specific value in col1, previous to me using order_by('col1', 'col2').distinct('col1', 'col2').values('col2')
when preparing the query.
I can't figure out what can cause this. I tried looking in the queryset.query
parameter but I can't understand where I am going wrong.
Any help would be greatly appreciated.
回答1:
The .order_by
should specify only 'col2'
, like:
queryset.values('col2').annotate(
count=Count('col1', distinct=True)
).order_by('col2')
This will thus yield a QuerySet
that looks like:
< QuerySet [
{'col2': 1, 'count': 4 },
{'col2': 2, 'count': 2 }
]>
So that means that there are four distinct values for col1
given col2
has value 1
, and two distinct values for col1
given col2
has value 2
.
This will construct a query like:
SELECT col2, COUNT(DISTINCT col1) AS count
FROM some_table
GROUP BY col2
ORDER BY col2
The .distinct(..)
is here not necessary since due to the GROUP BY
we will only obtain distinct col2
values, and because we COUNT(DISTINCT ..)
this thus means that each distinct value of col1
is counted once.
来源:https://stackoverflow.com/questions/52907276/django-queryset-aggregation-count-counting-wrong-thing