What's the point of Lucene NumericUtils.IntToPrefixCoded

Submitted by 帅比萌擦擦* on 2019-12-24 07:59:25

Question


I've been looking at Subtext's Lucene.Net implementation as a guide for doing something similar with our websites. When Subtext indexes or searches for a given post, it runs the ID through NumericUtils.IntToPrefixCoded. According to the Lucene docs, it does some bit shifting but doesn't lose precision. So what's the point? What does it do, and why?


Answer 1:


You need to look at the class documentation, which explains it in more detail:

To quickly execute range queries in Apache Lucene, a range is divided recursively into multiple intervals for searching: The center of the range is searched only with the lowest possible precision in the trie, while the boundaries are matched more exactly. This reduces the number of terms dramatically.

This class generates terms to achieve this: First the numerical integer values need to be converted to strings. For that integer values (32 bit or 64 bit) are made unsigned and the bits are converted to ASCII chars with each 7 bit. The resulting string is sortable like the original integer value. Each value is also prefixed (in the first char) by the shift value (number of bits removed) used during encoding.

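As I understand it, the intToPrefixCoded method does exactly that: it takes an int value, shifts it by the given number of bits, and returns a sortable String as explained above. With a shift of 0 no bits are removed, so the encoding loses no precision; the point is that the resulting terms sort in the same order as the original integers, which is what range queries rely on.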
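To make the quoted description concrete, here is a minimal standalone sketch of that encoding scheme in plain Java. It is not Lucene's source and does not call Lucene: the class name `PrefixCodedSketch`, the method name `encode`, and the base character used for the shift prefix are assumptions made for illustration. It flips the sign bit so the value compares as unsigned, drops the lowest `shift` bits, and packs the rest into 7-bit chars behind a prefix char that records the shift.

```java
public class PrefixCodedSketch {
    // Assumption: base character for the shift prefix, chosen for illustration.
    static final char SHIFT_START = (char) 0x60;

    // Encodes a 32-bit int as a sortable string, dropping the lowest `shift` bits.
    static String encode(int value, int shift) {
        if (shift < 0 || shift > 31) {
            throw new IllegalArgumentException("shift must be in 0..31");
        }
        int nChars = (31 - shift) / 7 + 1;      // 7-bit chars needed for the remaining bits
        char[] buf = new char[nChars + 1];
        buf[0] = (char) (SHIFT_START + shift);  // first char records the shift
        // Flipping the sign bit makes unsigned order match signed int order.
        int bits = (value ^ 0x80000000) >>> shift;
        for (int i = nChars; i >= 1; i--) {     // fill from the least significant 7 bits
            buf[i] = (char) (bits & 0x7f);
            bits >>>= 7;
        }
        return new String(buf);
    }

    public static void main(String[] args) {
        int[] ids = {-5, 0, 7, 42, 1000};
        // With shift 0, string order follows integer order.
        for (int i = 1; i < ids.length; i++) {
            String a = encode(ids[i - 1], 0);
            String b = encode(ids[i], 0);
            System.out.println(ids[i - 1] + " < " + ids[i] + " : " + (a.compareTo(b) < 0));
        }
        // With shift 8, every value in 768..1023 maps to the same coarser term.
        System.out.println(encode(1000, 8).equals(encode(1023, 8)));
    }
}
```

For a post ID that is only ever looked up exactly, the shift-0 term alone is enough. The coarser terms produced by larger shifts are what let a range query match the middle of a range with a handful of terms instead of one term per value.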



Source: https://stackoverflow.com/questions/13414149/whats-the-point-of-lucene-numericutils-inttoprefixcoded
