Search queries in Neo4j: how to sort the results of a START query using the internal TF-IDF, Levenshtein, or other algorithms?

Submitted by 北城余情 on 2019-12-30 05:32:05

Question


I am working on a model that uses Wikipedia topic names for my experiments with full-text indexing.

I set up a (legacy) index named 'topic' and ran a full-text search for 'united states':

start n=node:topic('name:(united states)') return n

The first results are not relevant at all:

'List of United States National Historic Landmarks in United States commonwealths and territories, associated states, and foreign states'

[...]

and the actual 'united states' is buried deep down the list.

This raises a problem: to find the best match among the results with a similarity measure (e.g. Levenshtein distance, bigrams, and so on), you must first fetch all the items matching the pattern.

That would be a serious constraint, because in this case alone I have 21K rows, which takes ~4 seconds.
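
The fetch-then-rank approach described above can be sketched roughly as follows. This is only an illustration in Python, not anything Neo4j does internally: the `candidates` list is a hypothetical stand-in for the rows returned by the index lookup, and `levenshtein` and `rerank` are helper names made up for this example.

```python
from typing import List


def levenshtein(a: str, b: str) -> int:
    """Edit distance via dynamic programming (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def rerank(query: str, titles: List[str]) -> List[str]:
    """Sort candidate titles by edit distance to the query, closest first."""
    q = query.lower()
    return sorted(titles, key=lambda t: levenshtein(q, t.lower()))


# Hypothetical candidates standing in for the rows fetched from the index.
candidates = [
    "List of United States National Historic Landmarks in United States "
    "commonwealths and territories, associated states, and foreign states",
    "United States Navy",
    "United States",
]
ranked = rerank("united states", candidates)
```

Every candidate has to be loaded and compared before the first result can be returned, which is exactly why the cost grows with the number of matching rows.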

Which algorithm does Neo4j use to order the results of a full-text search (START)? What rationale does it use to sort the results, and how can I change it using Cypher? The documentation says to use the Java API to apply sort(). A tutorial pointing to which files to modify would be very useful, as would knowing which ranking rationale is used before making any tweaks.

EDITED based on the comments below: pagination of results is possible with: start n=node:topic('name:(united states)') return n skip 10 limit 50;

(skip before limit), but I need to ensure that the first results are meaningful before paginating.


Answer 1:


I don't know which algorithm Lucene uses to order the results. As for pagination, it should work if you swap the order of limit and skip as follows: start n=node:topic('name:(united states)') return n skip 10 limit 50;

I would also add that if you are performing full-text search, a solution like Solr may be more appropriate.




Answer 2:


For just a Lucene index lookup with scoring, you might be better off with this:

http://neo4j.com/docs/stable/rest-api-indexes.html#rest-api-find-node-by-query
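
As a rough sketch of what such a request could look like, the snippet below only builds the URL for the legacy index query endpoint. The host and port (`http://localhost:7474`) are assumptions, `build_index_query_url` is a helper name made up for this example, and the `order=score` parameter is how I recall the legacy REST documentation describing score ordering, so please check it against the page linked above for your Neo4j version.

```python
from urllib.parse import quote, urlencode


def build_index_query_url(base_url, index_name, lucene_query, order="score"):
    """Build a legacy index query URL (assumed layout: /db/data/index/node/<index>)."""
    # Percent-encode the Lucene query so that ':', '(' , ')' and spaces survive.
    params = urlencode({"query": lucene_query, "order": order}, quote_via=quote)
    return f"{base_url}/db/data/index/node/{quote(index_name, safe='')}?{params}"


# Assumed local server; 'topic' is the index name from the question.
url = build_index_query_url(
    "http://localhost:7474", "topic", "name:(united states)"
)
```

The resulting URL can then be requested with any HTTP client (for example `urllib.request.urlopen(url)` with an `Accept: application/json` header), and the results paginated on the client side once they arrive in score order.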



Source: https://stackoverflow.com/questions/31862761/search-queries-in-neo4j-how-to-sort-results-in-neo4j-in-start-query-with-intern
