Adding URL parameter to Nutch/Solr index and search results

伪装坚强ぢ 2021-02-06 17:06

I can't find any hint on how to set up Nutch so that it does NOT filter/remove my URL parameters. I want to crawl and index some pages where lots of content is hidden behind the same base UR

1 answer
  •  囚心锁ツ
    2021-02-06 17:53

    You could create a custom field in a Nutch indexing filter to save the entire URL, query parameters included. As long as you define the same field in the Solr schema with stored="true", it will show up in your search results. See WritingPluginExample-1.2 on the Nutch wiki.
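
As a rough illustration of the idea, here is a minimal sketch with no Nutch dependency. A plain `Map` stands in for the document that Nutch hands to an indexing filter, and the class name `FullUrlFieldSketch`, the method `addFullUrl`, and the field name `fullurl` are all hypothetical. In a real plugin the same one-line copy would presumably sit inside the filter method of your indexing filter, and the field name would have to match the one declared in the Solr schema.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in sketch: a plain Map plays the role of the index document.
// Nothing here is Nutch API; all names are illustrative assumptions.
final class FullUrlFieldSketch {

    // Hypothetical field name; it must match the field declared in the Solr schema.
    static final String FIELD = "fullurl";

    // Copies the complete URL, query string included, into the custom field.
    static Map<String, Object> addFullUrl(Map<String, Object> doc, String url) {
        doc.put(FIELD, url);
        return doc;
    }

    public static void main(String[] args) {
        Map<String, Object> doc =
                addFullUrl(new HashMap<>(), "http://example.com/page?id=42&lang=en");
        // The stored value keeps the parameters after the '?'.
        System.out.println(doc.get(FIELD));
    }
}
```

The point of the sketch is only that the URL is stored verbatim rather than normalized or trimmed, so the parameters survive into the search results.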

    Let me know if you'd like some help.
