问题
I'm creating a web application to search people with their properties such as education, experience, etc. I can't use full-text-search for all the fields, because, some has to be a fuzzy match. (Eg: if we search for biotech, it should pick bio tech, biotech and also bio-tech). My database has about 200 entries in the profile model, which is to appear in the search results.
Other models like education and experience are connected to profile through foreign key
Therefore, I decided to be selective on what method to use on what field. For shorter fields like degree name (In the Education model) I want to use trigram similarity. For fields like education description, I use Full-text search.
However, Since I have to do this in multiple fields, I used simple lookups instead of using search vectors.
Profile.objects.filter(
Q(first_name__trigram_similar=search_term) |
Q(last_name__trigram_similar=search_term) |
Q(vision_expertise__search=search_term) |
Q(educations__degree__trigram_similar=search_term) |
Q(educations__field_of_study__trigram_similar=search_term) |
Q(educations__school__trigram_similar=search_term) |
Q(educations__description__search=search_term) |
Q(experiences__title__trigram_similar=search_term) |
Q(experiences__company__trigram_similar=search_term) |
Q(experiences__description__search=search_term) |
Q(publications__title__trigram_similar=search_term) |
Q(publications__description__search=search_term) |
Q(certification__certification_name__trigram_similar=search_term) |
Q(certification__certification_authority__trigram_similar=search_term) |
Q(bio_description__search=search_term) |
)
I get the expected results on every search. However, the time it takes to get it is ridiculously slow. I can't figure it out how to make this faster.
回答1:
Without the class code it's difficult to find the better way to optimize your query.
You can add a Gin or Gist index to speed up the trigram similarity.
You can build an annotation with the SearchVector as below:
from django.contrib.postgres.aggregates import StringAgg
from django.contrib.postgres.search import SearchQuery, SearchVector
search_vectors = (
SearchVector('vision_expertise') +
SearchVector('bio_description') +
SearchVector(StringAgg('experiences__description', delimiter=' ')) +
SearchVector(StringAgg('educations__description', delimiter=' ')) +
SearchVector(StringAgg('publications__description', delimiter=' '))
)
Profile.objects.annotate(
search=search_vectors
).filter(
Q(search=SearchQuery(search_term)) |
Q(first_name__trigram_similar=search_term) |
Q(last_name__trigram_similar=search_term) |
Q(educations__degree__trigram_similar=search_term) |
Q(educations__field_of_study__trigram_similar=search_term) |
Q(educations__school__trigram_similar=search_term) |
Q(experiences__title__trigram_similar=search_term) |
Q(experiences__company__trigram_similar=search_term) |
Q(publications__title__trigram_similar=search_term) |
Q(certification__certification_name__trigram_similar=search_term) |
Q(certification__certification_authority__trigram_similar=search_term)
)
You can speed-up the full-text search using a SearchVectorField
To find out about full-text search and trigram you can read the article I wrote on the subject:
"Full-Text Search in Django with PostgreSQL"
来源:https://stackoverflow.com/questions/56538419/poor-performance-when-trigram-similarity-and-full-text-search-were-combined-with