Where can I find performance benchmarks for Apache Lucene/Solr?

-上瘾入骨i 2021-02-13 19:12

Are there any links or resources with performance benchmarks for Lucene/Solr on large datasets, in the range of 500 GB to 5 TB?

Thanks

1 answer
  • 2021-02-13 19:31

    Lucene committer Mike McCandless runs benchmarks regularly to track performance improvements and regressions. They use Wikipedia exports, which may be somewhat smaller than the datasets you have in mind.

    However, performance depends less on the raw input size than on the number of documents and unique terms. If you already have data similar to what you will need to index, I recommend checking out Mike's test tool, adapting it to your needs, and running it with your own dataset and hardware to find out what performance numbers you can expect.
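    Before running a full benchmark, it can help to measure the two quantities mentioned above on a sample of your own data. The following is only a rough sketch, not part of Mike's tool: the function name `corpus_stats` is hypothetical, and the regex tokenizer is a stand-in assumption, so the unique-term count will differ from what a real Lucene analyzer produces.

```python
import re
from typing import Dict, Iterable

# Assumption: a crude word tokenizer; a real Lucene analyzer will
# split, normalize, and filter text differently.
TOKEN = re.compile(r"\w+")


def corpus_stats(docs: Iterable[str]) -> Dict[str, int]:
    """Count documents, total tokens, and unique terms in a sample."""
    doc_count = 0
    token_count = 0
    terms = set()
    for text in docs:
        doc_count += 1
        tokens = TOKEN.findall(text.lower())
        token_count += len(tokens)
        terms.update(tokens)
    return {
        "docs": doc_count,
        "tokens": token_count,
        "unique_terms": len(terms),
    }


if __name__ == "__main__":
    sample = ["Hello world", "hello Lucene"]
    print(corpus_stats(sample))
```

    Comparing these counts between your sample and the Wikipedia export gives a first hint of how far the published numbers are likely to carry over to your data.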
