How to improve query performance in Django admin search on related fields (MySQL)

情深已故 2021-01-13 14:34

In Django I have this:

models.py

class Book(models.Model):
    isbn = models.CharField(max_length=16, db_index=True)
    title = mod         


        
2 Answers
  • 2021-01-13 15:04

    You can override get_changelist in your ModelAdmin subclass and optimize the query manually there. For example, the ISBN can be looked up with an exact match instead of icontains, and you can use subqueries on Book to make the search faster.

  • 2021-01-13 15:08

    After a lot of investigation I found that the problem comes from how the search query is built for the admin search field (in the ChangeList class). In a multi-term search (words separated by spaces), each term is added to the QuerySet by chaining a new filter(). When search_fields contains one or more related fields, the generated SQL query has a long chain of JOINs, with several JOINs for each related field (see my related question for some examples and more info). This chain of JOINs exists so that each term is searched only in the subset of data filtered by the preceding term AND, most importantly, so that a related field needs to contain only one of the terms (rather than ALL of them) to produce a match. See "Spanning multi-valued relationships" in the Django docs for more on this subject. I'm pretty sure this is the behavior you want most of the time for the admin search field.
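    To make the query shape concrete, here is a minimal sketch using only Python's sqlite3 module rather than Django or MySQL. The book/author schema, the sample rows, and the helper names (chained_search_sql, chained_search) are hypothetical, and the SQL is a simplified approximation of what chained filter() calls generate, not the exact output of the admin.

```python
import sqlite3


def chained_search_sql(terms):
    """Build SQL with one JOIN on author per search term (chained filter() shape)."""
    joins = [f"JOIN author a{i} ON a{i}.book_id = book.id" for i in range(len(terms))]
    # Each term is matched against its own alias of the author table.
    where = " AND ".join(f"a{i}.name LIKE ?" for i in range(len(terms))) or "1"
    return (
        "SELECT DISTINCT book.id FROM book "
        + " ".join(joins)
        + f" WHERE {where} ORDER BY book.id"
    )


def chained_search(conn, search_term):
    terms = search_term.split()
    params = [f"%{term}%" for term in terms]
    return [row[0] for row in conn.execute(chained_search_sql(terms), params)]


# Hypothetical data: book 1 has two authors, books 2 and 3 have one each.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE author (id INTEGER PRIMARY KEY, book_id INTEGER, name TEXT);
    INSERT INTO book VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO author (book_id, name) VALUES
        (1, 'John Doe'), (1, 'Jane Smith'), (2, 'John Smith'), (3, 'Alice Jones');
""")

print(chained_search_sql(["john", "smith"]))  # two JOINs, one per term
print(chained_search(conn, "john smith"))     # books 1 and 2 both match
```

    Book 1 matches even though "john" and "smith" belong to different authors, because each term gets its own JOIN. The number of JOINs grows with the number of terms, which is where the cost comes from.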

    The drawback of this query (with related fields involved) is that the variation in performance (the time needed to run the query) can be really large. It depends on many factors: the number of search terms, the terms themselves, the type of the fields searched (VARCHAR, etc.), the number of fields searched, the data in the tables, the size of the tables, etc. With the right combination it's easy to end up with a query that takes practically forever (for me, a query that takes more than 10 minutes is a query that takes forever in the context of this search field).

    The reason it can take so long is that the database needs to create a temporary table for each term and scan most of it to search for the next term. So this adds up really quickly.

    One possible change to improve performance is to AND all the terms together in a single filter(). This way there will be only one JOIN per related field (or 2 if it's a many-to-many) instead of many more. This query will be a lot faster, with very little performance variation. The drawback is that a related field must contain ALL the terms to match, so you may get fewer matches in many cases.
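    The trade-off of the single-filter() approach can be illustrated with a small sqlite3 sketch. The book/author schema, the sample rows, and the helper name single_join_search are hypothetical, and the SQL only approximates what Django would generate on MySQL.

```python
import sqlite3


def single_join_search(conn, search_term):
    """Search with ONE JOIN on author: every term must match the same author row."""
    terms = search_term.split()
    where = " AND ".join("a.name LIKE ?" for _ in terms) or "1"
    sql = (
        "SELECT DISTINCT book.id FROM book "
        "JOIN author a ON a.book_id = book.id "
        f"WHERE {where} ORDER BY book.id"
    )
    return [row[0] for row in conn.execute(sql, [f"%{term}%" for term in terms])]


# Hypothetical data: book 1 has two authors, books 2 and 3 have one each.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE author (id INTEGER PRIMARY KEY, book_id INTEGER, name TEXT);
    INSERT INTO book VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO author (book_id, name) VALUES
        (1, 'John Doe'), (1, 'Jane Smith'), (2, 'John Smith'), (3, 'Alice Jones');
""")

print(single_join_search(conn, "john smith"))  # only book 2
```

    Book 1 (authors "John Doe" and "Jane Smith") is no longer returned, because no single author row contains both terms. That is the "fewer matches" cost paid for keeping the number of JOINs constant regardless of how many terms are typed.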

    UPDATE

    As asked by trinchet, here's what is needed to change the search behavior (for Django 1.7). You need to override get_search_results() in the admin classes where you want this kind of search. Copy all of the method's code from the base class (ModelAdmin) into your own class. Then change these lines:

    for bit in search_term.split():
        or_queries = [models.Q(**{orm_lookup: bit})
                      for orm_lookup in orm_lookups]
        queryset = queryset.filter(reduce(operator.or_, or_queries))
    

    To this:

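    # Your module needs: import operator; from functools import reduce;
    # from django.db import models
    and_queries = []
    for bit in search_term.split():
        or_queries = [models.Q(**{orm_lookup: bit})
                      for orm_lookup in orm_lookups]
        and_queries.append(reduce(operator.or_, or_queries))
    if and_queries:  # a whitespace-only search_term yields no terms
        # A single filter() call, so each related table is joined only once.
        queryset = queryset.filter(reduce(operator.and_, and_queries))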
    

    This code is not tested. My original code was for Django 1.4, and I just adapted it for 1.7 here.
