I have a function that looks like this:
def post_count(self):
return self.thread_set.aggregate(num_posts=Count(\'post\'))[\'num_posts\']
<
Yes. Just do it. This should work as expected:
self.thread_set.filter(active_status=1).aggregate(num_posts=Count('post'))['num_posts']
Any original query returns a QuerySet
, so any available methods that return QuerySets can be can be pretty much indefinitely chained together for complex criteria matches. Since aggregate()
does not return a QuerySet, you want to make sure that it is last in the chain.
You may want to look at writing a custom Manager object:
http://docs.djangoproject.com/en/1.1/topics/db/managers/
I haven't used aggregate()
, but that may let you write a custom manager to provide a filtered active_thread_set
and then do self.active_thread_set.aggregate(...)
. If not, it will let you do the custom SQL and add a num_posts
property onto the Thread
objects (see the PollManager.with_counts()
example.)
I have been looking into something similar and have not found a great solution. I'm using something like this:
def post_count(self):
return len(Post.objects.filter(someModel = self).filter(active_status = 1))
It's not great, but I don't think that Django allows you to filter based on the secondary model aggregations and annotations. I'll be checking to see if anyone comes up with a better solution.
OK, now that the question includes the model definitions, I submit to you that this should work, unless your version of Django doesn't support some feature I use here (in which case, please let me know!):
Post.objects.filter(thread__in=thread_set, status='active').aggregate(num_posts=Count('id'))
Django allows __in
filters to take a QuerySet to decide what the IN
clause should look like in SQL, so if you pass thread__in=thread_set
, Django will filter the posts so that only those whose thread
field points to one of the id
s of the threads in your thread_set
remain for the aggregate
call to see.
This should filter the posts with just one db query with something like WHERE thread_id IN ...
inside, rather than with one query per thread, which would indeed be horrid. If anything else happened, this would be a bug in Django...
The result should be at most two queries to establish a Category
's postcount -- one to obtain thread_set
and another one actually to count the posts. The alternative is to have a thread/post join to be filtered based on Thread
's category
field and Post
's status
field, which I wouldn't necessarily expect to be that much faster. (I say 'at most', because I guess they could be fused automatically... Though I don't think this would happen with current Django. Can't check ATM, sorry.)
EDIT: Django's QuerySet API reference says this on __in
filters:
IN
In a given list.
Example:
Entry.objects.filter(id__in=[1, 3, 4])
SQL equivalent:
SELECT ... WHERE id IN (1, 3, 4);
You can also use a queryset to dynamically evaluate the list of values instead of providing a list of literal values:
inner_qs = Blog.objects.filter(name__contains='Cheddar')
entries = Entry.objects.filter(blog__in=inner_qs)
This queryset will be evaluated as subselect statement:
SELECT ... WHERE blog.id IN (SELECT id FROM ... WHERE NAME LIKE '%Cheddar%')
The above code fragment could also be written as follows:
inner_q = Blog.objects.filter(name__contains='Cheddar').values('pk').query
entries = Entry.objects.filter(blog__in=inner_q)
Changed in Django 1.1: In Django 1.0, only the latter piece of code is valid.
This second form is a bit less readable and unnatural to write, since it accesses the internal query attribute and requires a ValuesQuerySet. If your code doesn't require compatibility with Django 1.0, use the first form, passing in a queryset directly.
So, I guess Django is capable of passing a single query to the db in the case at issue here. If the db's query analyser does a good job, the effect might be very nearly optimal. :-)
Would it be ok to change around things a bit?
As illustrated below, you could add a post_count property to the Thread class, which counts active Posts in a Thread.
This post_count could then be used to calculate active posts in a category by adding up all active posts in all thread in a category.
class Category(models.Model):
name = models.CharField(max_length=100)
slug = models.SlugField(max_length=100, blank=True, primary_key=True)
ordering = models.IntegerField(max_length=3, default=0)
@property
def thread_count(self):
return self.thread_set.all().count()
@property
def post_count(self): # <-- Changed
return reduce(lambda x,y: x + y, [x.post_count for x in self.thread_set.all()])
class Thread(models.Model):
user = models.ForeignKey(User)
category = models.ForeignKey(Category)
title = models.CharField(max_length=100)
slug = models.SlugField(max_length=100)
content = models.TextField()
created = models.DateTimeField(auto_now_add=True)
latest_activity = models.DateTimeField(auto_now_add=True)
@property
def post_count(self): # <---- Newly added
return self.post_set.filter(status = 'ACTIVE').count()
class Post(models.Model):
thread = models.ForeignKey(Thread)
parent = models.ForeignKey('Post', null=True, blank=True)
display_name = models.CharField(max_length=100)
email = models.EmailField(db_index=True)
ip_address = models.IPAddressField(null=True, blank=True)
content = models.TextField()
status = models.CharField(choices=STATUS_CHOICES, max_length=25, db_index=True, default='approved')
created = models.DateTimeField()