Where can I find performance benchmarks for Apache Lucene/Solr

余生颓废 提交于 2020-01-12 06:51:05

问题


Are there any links/resources towards performance benchmarks for Lucene/Solr on large datasets. Data sets above the range of 500GB ~ 5TB

Thanks


回答1:


Lucene committer Mike McCandless runs benchmarks on a regular basis to track down performances improvements and regressions. They are made with Wikipedia exports, which might be a little bit smaller than what you are looking for.

But the performance doesn't depend so much on the input size, but rather on the number of documents and unique terms. If you already have some data similar to what you will need to index, I would recommend you check out Mike's test tool, adapt it to your needs, and run it with your own dataset and hardware to try to find out what kind of performance numbers you can expect.



来源:https://stackoverflow.com/questions/9391305/where-can-i-find-performance-benchmarks-for-apache-lucene-solr

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!