问题
I'm experiencing some severe performances issues with prefetch_related
on a Model with 5 m2m fields and I'm pre-fetching also few nested m2m fields.
class TaskModelManager(models.Manager):
def get_queryset(self):
return super(TaskModelManager, self).get_queryset().exclude(internalStatus=2).prefetch_related("parent", "takes", "takes__flags", "assignedUser", "assignedUser__flags", "asset", "asset__flags", "status", "approvalWorkflow", "viewers", "requires", "linkedTasks", "activities")
class Task(models.Model):
uuid = models.UUIDField(primary_key=True, default=genOptimUUID, editable=False)
internalStatus = models.IntegerField(default=0)
parent = models.ForeignKey("self", blank=True, null=True, related_name="childs")
name = models.CharField(max_length=45)
taskType = models.ForeignKey("TaskType", null=True)
priority = models.IntegerField()
startDate = models.DateTimeField()
endDate = models.DateTimeField()
status = models.ForeignKey("ProgressionStatus")
assignedUser = models.ForeignKey("Asset", related_name="tasksAssigned")
asset = models.ForeignKey("Asset", related_name="tasksSubject")
viewers = models.ManyToManyField("Asset", blank=True, related_name="followedTasks")
step = models.ForeignKey("Step", blank=True, null=True, related_name="tasks")
approvalWorkflow = models.ForeignKey("ApprovalWorkflow")
linkedTasks = models.ManyToManyField("self", symmetrical=False, blank=True, related_name="linkedTo")
requires = models.ManyToManyField("self", symmetrical=False, blank=True, related_name="depends")
objects = TaskModelManager()
The number of query is fine and the database query time is fine too, for exemple if I query 700 objects of my model i have 35 query and the average query time is 100~200ms but the total request time is approximately 8 seconds.
silk times
I've run some profiling and it pointed out that more than 80% of the time spent was on the prefetch_related_objects
call.
profiling
I'm using Django==1.8.5
and djangorestframework==3.4.6
I'm open to any way to optimize this. Thanks in advance for your help.
Edit with select_related
:
I've tried the improvement proposed by Alasdair
class TaskModelManager(models.Manager):
def get_queryset(self):
return super(TaskModelManager, self).get_queryset().exclude(internalStatus=2).select_related("parent", "status", "approvalWorkflow", "step").prefetch_related("takes", "takes__flags", "assignedUser", "assignedUser__flags", "asset", "asset__flags", "viewers", "requires", "linkedTasks", "activities")
The new result is still 8 seconds for the request with 32 queries and 150ms of query time.
Edit :
It seems that a ticket was opened on Django issue tracker 4 years ago and is still open : https: //code.djangoproject.com/ticket/20577
回答1:
I ran into the same problem.
Following the issue you linked i found that you can improve the prefetch_related
performance using Prefetch
object and to_attr
argument.
From the commit that introduces the Prefetch
object:
When a Prefetch instance specifies a to_attr argument, the result is stored in a list rather than a QuerySet. This has the fortunate consequence of being significantly faster. The preformance improvement is due to the fact that we save the costly creation of a QuerySet instance.
So i significantly improved my code (from about 7 seconds to 0.88 seconds) simply by calling:
Foo.objects.filter(...).prefetch_related(Prefetch('bars', to_attr='bars_list'))
instead of
Foo.objects.filter(...).prefetch_related('bars')
回答2:
Try using select_related
for foreign keys like parent
and ApprovalWorkflow
instead of prefetch_related
.
When you use select_related
, Django will fetch the models using a join, unlike prefetch_related
which causes an extra query. You might find that this improves performance.
回答3:
If the DB is 150ms but your request is 8 seconds, it's not your query (in itself, at least). A few possible issues:
1) Your HTML or template is too complex, spending too much time in generating the response. Or consider template caching.
2) All those objects are complex and you load too many fields, so while the query is fast, sending it and processing all those objects in Python is slow. Explore using only(), defer() and values() or value_list() to only load what you need.
Optimization is hard and we'd need more details to give you a better idea. I'd suggest installing Django Debug Toolbar (Django app) or Opbeat (3rd party utility), they can help you detect where your time is spent and then you can optimize accordingly.
来源:https://stackoverflow.com/questions/42838010/django-prefetch-related-optimize-query-but-still-very-slow