Question
Schema Description
A project's status can change over time. In order to track the status over time, I've created a many-to-many relationship between the Project model and the ProjectStatusType model through the ProjectStatus intermediary table.
While this allows tracking a project's status over time, it increases the complexity of the schema such that retrieving the current status of a project or retrieving all open projects is more difficult.
Use Case
I want to be able to return all projects that are in a given state, such as all open projects. For instance, when users go to http://www.example.com/projects, I'd like only the open projects to be displayed in a table by default.
Questions
- Should I denormalize the schema and add a current_status field in the Project model?
- If I shouldn't denormalize, what strategy should I use to retrieve the current status for each project? Should I create a property on the Project model that retrieves the current status?
Answer 1:
If you don't need to search on it, I would create a property on the Project model. You can use the Max function with aggregate() to get the newest status date.
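from django.db import models
from django.db.models import Max

class Project(models.Model):
    [...]

    @property
    def status_date(self):
        # Newest status_date among this project's ProjectStatus rows (None if there are none).
        return self.projectstatus_set.aggregate(newest=Max('status_date'))['newest']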
This strategy is documented here.
If you need to do lookups, then you should denormalize and add a field to Project. You can keep it current using signals. You would want to add a post_save listener for your ProjectStatus model, which would set its project's date to the status's date.
from django.db.models.signals import post_save

def update_status_date(sender, instance=None, **kwargs):
    project = instance.project
    # Keep the newest date; project.status_date is assumed to be non-null here.
    project.status_date = max(project.status_date, instance.status_date)
    project.save()

post_save.connect(update_status_date, sender=ProjectStatus)
You can read more about signals here.
======
EDIT: Since writing my original answer, the OP has clarified his question somewhat, and his clarification alters the example code for both of my strategies, although not their basic construction. I want to leave the original answer for those who may have needs more akin to the question I thought I was answering at the time.
In my first example, he doesn't really want the newest status_date itself, but rather the newest project status type. This would change the property substantially; you don't need to use a MAX() SQL construct at all; you just want the first record attached to this object when ordered by date descending:
class Project(models.Model):
    [...]

    @property
    def project_status(self):
        return self.status.order_by('-status_date')[0]
The use cases around this are still the same. If you will always get a project first and then want to know its current status, this is the right way to go about it. If you need to index projects by status, then you need to denormalize. This is still best done through signals, but instead of saving the date like I was doing in my example above, you probably want to save a description. The principle remains the same, though.
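One thing to watch when denormalizing: the listener should only overwrite the stored status when the incoming one is at least as new as what is already stored, otherwise a late-saved older status would clobber the current one. Below is a minimal, framework-free sketch of that update rule. ProjectRecord, current_status, current_status_date and apply_status are hypothetical names used only for illustration; in Django the same check would live inside the post_save handler.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class ProjectRecord:
    """Hypothetical stand-in for a Project row with denormalized status fields."""
    name: str
    current_status: Optional[str] = None
    current_status_date: Optional[date] = None


def apply_status(project: ProjectRecord, status: str, status_date: date) -> bool:
    """Update the denormalized fields if this status is the newest seen so far.

    Returns True when the project was changed (i.e. it would need saving).
    """
    stored = project.current_status_date
    if stored is not None and status_date < stored:
        # An older status was saved late (e.g. a backfill); keep the current one.
        return False
    project.current_status = status
    project.current_status_date = status_date
    return True
```

Returning a flag lets the caller skip the extra save when nothing changed.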
Answer 2:
By your description, I'm assuming you're in fact using ProjectStatus as the through model for your ManyToManyField, and that you're already storing extra data on the relationship with that model. If one of the items of extra data is not already the datetime when that particular status was set, I would add that to your model.
You can then order ProjectStatus by that datetime, descending, so the first ProjectStatus returned will always be the most recent (current).
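Ordering by date gives the current status for a single project. For the use case in the question (listing every project whose current status is open), the same idea turns into a "latest row per project" query. Here is a rough sketch of what that looks like at the database level, using Python's built-in sqlite3 rather than the Django ORM. The table and column names (project, project_status, status, status_date) and the sample rows are made up for illustration and are not the OP's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE project (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE project_status (
        project_id INTEGER REFERENCES project(id),
        status TEXT,          -- stands in for the ProjectStatusType link
        status_date TEXT      -- ISO dates sort correctly as text
    );
    INSERT INTO project VALUES (1, 'alpha'), (2, 'beta');
    INSERT INTO project_status VALUES
        (1, 'open',   '2011-10-01'),
        (1, 'closed', '2011-10-20'),
        (2, 'closed', '2011-09-15'),
        (2, 'open',   '2011-10-10');
""")


def projects_with_current_status(conn, status):
    """Return names of projects whose most recent status row equals `status`."""
    rows = conn.execute(
        """
        SELECT p.name
        FROM project AS p
        JOIN project_status AS s ON s.project_id = p.id
        WHERE s.status = ?
          AND s.status_date = (
              SELECT MAX(s2.status_date)
              FROM project_status AS s2
              WHERE s2.project_id = p.id
          )
        ORDER BY p.name
        """,
        (status,),
    ).fetchall()
    return [name for (name,) in rows]


open_projects = projects_with_current_status(conn, "open")
```

With these sample rows only beta comes back as open, since alpha's latest status is closed. The correlated subquery runs per project, which is the cost the denormalized field in Answer 1 avoids.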
Source: https://stackoverflow.com/questions/7890103/returning-the-current-project-status-i-e-most-recent-date-on-django-manytoman