Question
I have a Solr index with about 2.5M items in it and I am trying to use an ExternalFileField to boost relevancy. Unfortunately, it's VERY slow when I try to do this, despite it being a beefy machine and Solr having lots of memory available.
In the external file I have contents like:
747501=3.8294805903e-07
747500=3.8294805903e-07
1718770=4.03292174724e-07
1534562=3.8294805903e-07
1956010=3.8294805903e-07
747509=3.8294805903e-07
747508=3.8294805903e-07
1718772=3.8294805903e-07
1391385=3.8294805903e-07
2089652=3.8294805903e-07
1948271=3.8294805903e-07
108368=3.84404072186e-06
Each line is a document ID and its corresponding boosting factor.
In my query I'm using edismax, and I am using the boost parameter, setting it to pagerank. The entire query is here.
In my schema I have:
<!-- External File Field Type -->
<fieldType name="pagerank"
           keyField="id"
           stored="false"
           indexed="true"
           omitNorms="false"
           class="solr.ExternalFileField"
           valType="float"/>
and
<field name="pagerank"
       type="pagerank"
       indexed="true"
       stored="true"
       omitNorms="false"/>
But the performance is just plain bad. Am I missing a setting or something?
Answer 1:
According to the Javadoc:
The external file may be sorted or unsorted by the key field, but it will be substantially slower (untested) if it isn't sorted.
As far as I can see, the IDs in your file are unsorted. Can you sort the file and test whether that helps?
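If it helps, here is a minimal sketch of sorting the file by key before Solr loads it. It assumes every line has the form key=value, as in the question. The function names and the numeric flag are my own, not part of Solr. I am also assuming that the id key field is a string type, in which case the index order of the keys is lexicographic, so that is the default here; pass numeric=True if numeric order turns out to be what your setup needs.

```python
def sort_external_file_lines(lines, numeric=False):
    """Return 'key=value' lines sorted by key.

    numeric=False sorts keys as strings (lexicographic order),
    numeric=True sorts keys as integers.
    """
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"malformed line: {line!r}")
        entries.append((key, value))

    if numeric:
        entries.sort(key=lambda kv: int(kv[0]))
    else:
        entries.sort(key=lambda kv: kv[0])
    return [f"{key}={value}" for key, value in entries]


def sort_external_file(src_path, dst_path, numeric=False):
    """Read src_path, write a key-sorted copy to dst_path (hypothetical paths)."""
    with open(src_path, encoding="utf-8") as src:
        sorted_lines = sort_external_file_lines(src, numeric=numeric)
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write("\n".join(sorted_lines) + "\n")
```

Write the sorted copy to a new file, swap it in for the original, and then rerun the same query to see whether the timing changes.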
Source: https://stackoverflow.com/questions/19801294/relevancy-boosting-very-slow-in-solr