Getting the Doc ID in Lucene

Posted by 房东的猫 on 2019-12-31 01:56:06

Question


In Lucene, I can do the following:

doc.GetField("mycustomfield").StringValue();

This retrieves the value of a field in one of the index's documents.

My question: for the same 'doc', is there a way to get the document ID? Luke displays it, so there must be a way to retrieve it. I need it to delete documents on updates.

I scoured the docs but could not find a field name to pass to GetField, or another method that returns the ID.


Answer 1:


Turns out you have to do this:

var hits = searcher.Search(query);
var result = hits.Id(0);

As opposed to:

var results = hits.Doc(i);
var docid = results.<...> //there's nothing I could find there to do this



Answer 2:


I suspect you're having trouble finding documentation on determining the ID of a particular Lucene Document because these are not truly "IDs". In other words, they are not meant to be looked up and stored for later use. If you do store them, you will not get the results you were hoping for, because the IDs change when the index is optimized.

Instead, think of the IDs as the current "offset" of a particular document from the start of the index, which will change when deleted documents are physically removed from the index files.
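To make the "offset" idea concrete, here is a toy sketch in C# that does not use Lucene at all. ToyIndex and its methods are hypothetical names invented for this illustration, and the compaction step only loosely mimics what happens when deleted documents are physically removed. It assumes each document carries a key of your own that identifies it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy stand-in for an index (hypothetical, NOT a Lucene class).
// A document's "ID" is simply its current position in the list.
public class ToyIndex
{
    private readonly List<string> docs = new List<string>();
    private readonly HashSet<int> deleted = new HashSet<int>();

    // Appends a document and returns its current ID (its offset).
    public int Add(string key)
    {
        docs.Add(key);
        return docs.Count - 1;
    }

    // Marks a document as deleted; nothing moves yet.
    public void Delete(int id)
    {
        deleted.Add(id);
    }

    // Physically drops deleted documents, so later offsets shift down.
    public void Compact()
    {
        var live = docs.Where((d, i) => !deleted.Contains(i)).ToList();
        docs.Clear();
        docs.AddRange(live);
        deleted.Clear();
    }

    // Returns the current ID of the live document with this key, or -1.
    public int IdOf(string key)
    {
        for (int i = 0; i < docs.Count; i++)
        {
            if (!deleted.Contains(i) && docs[i] == key) return i;
        }
        return -1;
    }
}

public static class Demo
{
    public static void Main()
    {
        var index = new ToyIndex();
        index.Add("a");
        index.Add("b");
        index.Add("c");

        Console.WriteLine(index.IdOf("c")); // 2
        index.Delete(0);                    // "a" is only marked as deleted
        Console.WriteLine(index.IdOf("c")); // still 2
        index.Compact();                    // "a" is physically removed
        Console.WriteLine(index.IdOf("c")); // now 1
    }
}
```

An ID stored before the compaction would now point at a different document, which is why a key you control is the safer thing to persist.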

Now with that said, the proper way to look up the "id" of a document is:


QueryParser parser = new QueryParser(...);
IndexSearcher searcher = new IndexSearcher(...);
Hits hits = searcher.Search(parser.Parse(...));

for (int i = 0; i < hits.Length(); i++)
{
   // ID of the i-th hit; it may change once the index is optimized
   int id = hits.Id(i);

   // do stuff
}


Source: https://stackoverflow.com/questions/1296709/getting-the-doc-id-in-lucene
