Solr powered Tag Cloud

Submitted by 痞子三分冷 on 2019-12-01 23:20:12

Question


I'm stuck on the logic behind a tag cloud powered by Solr faceting. I'm using OpenNLP to parse my docs and extract the relevant words from them, so every document gets split into n words. Here is roughly what a document in my Solr response looks like:

<docID>
<title>My Doc Title</title>
<content>My Doc Title</content>
<date_published>My Doc Title</date_published>
</docID>

I believe there must be a way to integrate the words in here. I first thought of something like this:

<docID>
<title>My Doc Title</title>
<content>My Doc Title</content>
<date_published>My Doc Title</date_published>
<words>word</words>
<words1>word1</words1>
<words2>word2</words2>
<words3>word3</words3>
<wordsN>wordN</wordsN>
</docID>

But faceting wouldn't be possible this way: I have no idea how many words fields I would get per docID, so the faceting would have to be done across fields (and I'm not even sure that is possible). I've been looking into possible answers but I seem to be stuck. In the end, I need to facet on the n words across every doc I have in my index. Thoughts would be highly appreciated.


Answer 1:


I would suggest using a single words field that is multivalued and stores the list of words per document.
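As a rough illustration of that layout, here is a minimal Python sketch that builds one document with all extracted words in a single list-valued field, ready to be serialized as JSON for indexing. The field names `id` and `words`, the sample values, and the lowercasing step are assumptions made for the example; the schema would also need `words` declared as a multivalued field.

```python
import json


def build_solr_doc(doc_id, title, content, date_published, words):
    """Return one document dict with all extracted words in a single list field."""
    # Lowercase and deduplicate while preserving order (an assumption of this
    # sketch); facet counts are per document, so repeats within one document
    # would not change them anyway.
    unique_words = list(dict.fromkeys(w.lower() for w in words))
    return {
        "id": doc_id,                      # hypothetical unique key field
        "title": title,
        "content": content,
        "date_published": date_published,
        "words": unique_words,             # one multivalued field, not words1..wordsN
    }


doc = build_solr_doc(
    "doc-1",
    "My Doc Title",
    "Solr makes faceting easy. Faceting is fast.",
    "2011-04-20T00:00:00Z",
    ["Solr", "faceting", "Faceting"],
)

# JSON array of documents, the shape Solr's JSON update handler accepts.
payload = json.dumps([doc])
```

Every document then carries the same field name regardless of how many words it has, which is what makes a single facet request possible.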

Having an unbounded number of word\d+ fields will complicate things.

If you use a single multivalued words field, you can facet on it to get all the words along with their frequencies, which should be enough for creating the tag cloud.
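A minimal sketch of that approach in Python, assuming the field is named `words` and the response uses Solr's default JSON facet layout (a flat list alternating term and count). The helper names, the limit of 50 terms, and the pixel range are hypothetical choices for illustration. Note that each facet count is the number of matching documents containing the term.

```python
from urllib.parse import urlencode


def facet_query_params(field="words", limit=50):
    """Standard Solr facet parameters; rows=0 because only counts are needed."""
    return {
        "q": "*:*",
        "rows": 0,
        "wt": "json",
        "facet": "true",
        "facet.field": field,
        "facet.limit": limit,
        "facet.mincount": 1,
    }


def parse_facet_counts(response, field="words"):
    """Turn the flat [term, count, term, count, ...] list into (term, count) pairs."""
    flat = response["facet_counts"]["facet_fields"][field]
    return list(zip(flat[0::2], flat[1::2]))


def tag_sizes(counts, min_px=12, max_px=36):
    """Linearly scale counts to font sizes for the tag cloud."""
    if not counts:
        return {}
    lo = min(c for _, c in counts)
    hi = max(c for _, c in counts)
    span = (hi - lo) or 1  # avoid division by zero when all counts are equal
    return {t: round(min_px + (c - lo) * (max_px - min_px) / span) for t, c in counts}


# Query string to append to the core's /select handler URL.
query_string = urlencode(facet_query_params())

# Hand-written sample in the shape of a Solr JSON response, not real data.
sample_response = {
    "facet_counts": {"facet_fields": {"words": ["solr", 10, "facet", 6, "cloud", 2]}}
}
counts = parse_facet_counts(sample_response)
sizes = tag_sizes(counts)
```

With the sample above, the most frequent word gets the largest size and the least frequent the smallest; a logarithmic scale may look better when counts are heavily skewed.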



Source: https://stackoverflow.com/questions/5737286/solr-powered-tag-cloud
