Django prefetch_related optimize query but still very slow

大城市里の小女人 提交于 2020-06-12 07:20:36

问题


I'm experiencing some severe performances issues with prefetch_related on a Model with 5 m2m fields and I'm pre-fetching also few nested m2m fields.

class TaskModelManager(models.Manager):
    def get_queryset(self):
        return super(TaskModelManager, self).get_queryset().exclude(internalStatus=2).prefetch_related("parent", "takes", "takes__flags", "assignedUser", "assignedUser__flags", "asset", "asset__flags", "status", "approvalWorkflow", "viewers", "requires", "linkedTasks", "activities")


class Task(models.Model):
    uuid = models.UUIDField(primary_key=True, default=genOptimUUID, editable=False)
    internalStatus = models.IntegerField(default=0)
    parent = models.ForeignKey("self", blank=True, null=True, related_name="childs")
    name = models.CharField(max_length=45)
    taskType = models.ForeignKey("TaskType", null=True)
    priority = models.IntegerField()
    startDate = models.DateTimeField()
    endDate = models.DateTimeField()
    status = models.ForeignKey("ProgressionStatus")
    assignedUser = models.ForeignKey("Asset", related_name="tasksAssigned")
    asset = models.ForeignKey("Asset", related_name="tasksSubject")
    viewers = models.ManyToManyField("Asset", blank=True, related_name="followedTasks")
    step = models.ForeignKey("Step", blank=True, null=True, related_name="tasks")
    approvalWorkflow = models.ForeignKey("ApprovalWorkflow")
    linkedTasks = models.ManyToManyField("self", symmetrical=False, blank=True, related_name="linkedTo")
    requires = models.ManyToManyField("self", symmetrical=False, blank=True, related_name="depends")

    objects = TaskModelManager()

The number of query is fine and the database query time is fine too, for exemple if I query 700 objects of my model i have 35 query and the average query time is 100~200ms but the total request time is approximately 8 seconds.

silk times

I've run some profiling and it pointed out that more than 80% of the time spent was on the prefetch_related_objects call.

profiling

I'm using Django==1.8.5 and djangorestframework==3.4.6

I'm open to any way to optimize this. Thanks in advance for your help.


Edit with select_related:

I've tried the improvement proposed by Alasdair

class TaskModelManager(models.Manager):
    def get_queryset(self):
        return super(TaskModelManager, self).get_queryset().exclude(internalStatus=2).select_related("parent", "status", "approvalWorkflow", "step").prefetch_related("takes", "takes__flags", "assignedUser", "assignedUser__flags", "asset", "asset__flags", "viewers", "requires", "linkedTasks", "activities")

The new result is still 8 seconds for the request with 32 queries and 150ms of query time.


Edit :

It seems that a ticket was opened on Django issue tracker 4 years ago and is still open : https: //code.djangoproject.com/ticket/20577


回答1:


I ran into the same problem.

Following the issue you linked i found that you can improve the prefetch_related performance using Prefetch object and to_attr argument.

From the commit that introduces the Prefetch object:

When a Prefetch instance specifies a to_attr argument, the result is stored in a list rather than a QuerySet. This has the fortunate consequence of being significantly faster. The preformance improvement is due to the fact that we save the costly creation of a QuerySet instance.

So i significantly improved my code (from about 7 seconds to 0.88 seconds) simply by calling:

Foo.objects.filter(...).prefetch_related(Prefetch('bars', to_attr='bars_list'))

instead of

Foo.objects.filter(...).prefetch_related('bars')



回答2:


Try using select_related for foreign keys like parent and ApprovalWorkflow instead of prefetch_related.

When you use select_related, Django will fetch the models using a join, unlike prefetch_related which causes an extra query. You might find that this improves performance.




回答3:


If the DB is 150ms but your request is 8 seconds, it's not your query (in itself, at least). A few possible issues:

1) Your HTML or template is too complex, spending too much time in generating the response. Or consider template caching.

2) All those objects are complex and you load too many fields, so while the query is fast, sending it and processing all those objects in Python is slow. Explore using only(), defer() and values() or value_list() to only load what you need.

Optimization is hard and we'd need more details to give you a better idea. I'd suggest installing Django Debug Toolbar (Django app) or Opbeat (3rd party utility), they can help you detect where your time is spent and then you can optimize accordingly.



来源:https://stackoverflow.com/questions/42838010/django-prefetch-related-optimize-query-but-still-very-slow

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!