Optimizing Solr for Sorting

一生所求 2021-02-04 18:36

I'm using Solr for a realtime search index. My dataset is about 60M large documents. Instead of sorting by relevance, I need to sort by time. Currently I'm using the sort flag

3 Answers
  •  灰色年华
    2021-02-04 19:38

    Warning: Wild suggestion, not based on prior experience or known facts. :)

    1. Perform a query without sorting and with rows=0 to get the number of matches. Disable faceting and similar features to improve performance; only the total number of matches is needed.
    2. Based on the number of matches from Step #1, the distribution of your data, and the count/offset of the results that you need, fire another query which sorts by date and also adds a filter on the date, like fq=date:[NOW-xDAYS TO *], where x is the estimated time period in days during which we will find the required number of matching documents.
    3. If the number of results from Step #2 is less than what you need, then relax the filter a bit and fire another query.
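
The three steps above could be sketched roughly as follows. This is only an illustration, not a tested Solr client: `fetch` is a hypothetical callable that sends the given parameters to Solr and returns `(numFound, docs)`, the field name `date` is taken from the example filter, and doubling the window on each retry is an assumed way of "relaxing the filter a bit".

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical callable: takes Solr query parameters, returns (numFound, docs).
Fetch = Callable[[Dict[str, str]], Tuple[int, List[dict]]]


def count_params(q: str) -> Dict[str, str]:
    """Step 1: count matches only (no rows, no sort, no faceting)."""
    return {"q": q, "rows": "0", "facet": "false"}


def window_params(q: str, rows: int, days: int, field: str = "date") -> Dict[str, str]:
    """Step 2: sort by date, restricted to the last `days` days."""
    return {
        "q": q,
        "rows": str(rows),
        "sort": f"{field} desc",
        "fq": f"{field}:[NOW-{days}DAYS TO *]",
    }


def newest_matches(fetch: Fetch, q: str, rows: int, days: int,
                   growth: int = 2, max_tries: int = 5) -> List[dict]:
    """Step 3: widen the date window until enough documents come back."""
    docs: List[dict] = []
    for _ in range(max_tries):
        _, docs = fetch(window_params(q, rows, days))
        if len(docs) >= rows:
            break
        days *= growth  # assumed relaxation policy: double the window
    return docs
```

If the window is still too small after `max_tries` attempts, this sketch simply returns what the last query found; falling back to an unfiltered sorted query at that point would be another option.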

    For starters, you can use the following to estimate x:

    If you are adding n documents a day at a uniform rate to an index of N documents, and a specific query matched d documents in Step #1, then to get the top r results you can use x = (N*r*1.2)/(d*n). Here d/N is the fraction of documents that match, so about n*d/N new matches arrive per day, and 1.2 is a safety margin. If you have to relax your filter too often in Step #3, then slowly increase the value 1.2 in the formula as required.
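
As a small worked sketch of that estimate (the function name and the error handling are assumptions added for illustration; the formula itself is the one given above):

```python
def estimate_window_days(N: int, n: float, d: int, r: int, safety: float = 1.2) -> float:
    """Estimate x, the number of days expected to contain the top r matches.

    N: total documents in the index
    n: documents added per day (assumed uniform)
    d: documents matched by the query in Step #1
    r: number of results wanted
    safety: margin factor, 1.2 in the formula above
    """
    if d <= 0 or n <= 0:
        # No matches or no ingest rate: the estimate is undefined.
        raise ValueError("d and n must be positive")
    return (N * r * safety) / (d * n)
```

For example, with N = 60,000,000 documents (the size mentioned in the question) and made-up values n = 100,000 documents per day, d = 600,000 matches, and r = 100, the estimate comes out to about 0.12 days, so in practice x would be rounded up to 1 day.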
