Best approach for doing full-text search with list-of-integers documents

悲&欢浪女 2021-01-16 00:27

I'm working on a C++/Qt image retrieval system based on similarity that works as follows (I'll try to avoid irrelevant or off-topic details):

I take a collection o

2 answers

北恋 (original poster)

2021-01-16 00:48
It sounds to me like you have a vector space model, so Lucene or a similar product may work well for you. In general, an inverted-index model is a good fit if:
1. You don't know the number of classes in advance
2. There are a lot of classes relative to the number of images
If your problem doesn't fit these criteria, a normal relational DB might work better, as Thomas suggested. If it meets #1 but not #2, you could investigate one of the "column-oriented" non-relational databases. I'm not familiar enough with these to tell you how well they would work, but my intuition is that you would need to reimplement a lot of the functionality of an IR toolkit yourself.
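
To make the inverted-index idea concrete, here is a minimal C++ sketch, assuming each image is described by a list of integer class IDs. The names `InvertedIndex`, `add` and `query` are hypothetical and not taken from Lucene or Lemur, and the score is a plain count of shared classes rather than the TF-IDF style weighting a real IR toolkit would apply.

```cpp
#include <algorithm>
#include <unordered_map>
#include <utility>
#include <vector>

// Illustrative only: maps each integer class ID to the images containing it.
class InvertedIndex {
public:
    // Index one image. Assumes each imageId is added only once.
    void add(int imageId, std::vector<int> classes) {
        dedupe(classes);  // each image appears at most once per posting list
        for (int c : classes) postings_[c].push_back(imageId);
    }

    // Return (imageId, number of shared classes) pairs, best match first.
    // Ties are broken by ascending imageId so the order is deterministic.
    std::vector<std::pair<int, int>> query(std::vector<int> classes) const {
        dedupe(classes);
        std::unordered_map<int, int> overlap;
        for (int c : classes) {
            auto it = postings_.find(c);
            if (it == postings_.end()) continue;  // class never seen: no cost
            for (int id : it->second) ++overlap[id];
        }
        std::vector<std::pair<int, int>> ranked(overlap.begin(), overlap.end());
        std::sort(ranked.begin(), ranked.end(),
                  [](const std::pair<int, int>& a, const std::pair<int, int>& b) {
                      return a.second != b.second ? a.second > b.second
                                                  : a.first < b.first;
                  });
        return ranked;
    }

private:
    // Sort and remove repeated class IDs in place.
    static void dedupe(std::vector<int>& v) {
        std::sort(v.begin(), v.end());
        v.erase(std::unique(v.begin(), v.end()), v.end());
    }

    std::unordered_map<int, std::vector<int>> postings_;
};
```

Only the posting lists for the classes in the query are touched, which is why this layout suits the case where there are many classes and the class set is not known in advance: new class IDs simply become new keys.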

Lucene is written in Java and I don't know of any C++ ports. Solr exposes Lucene as a web service, so it's easy enough to access it that way from whatever language you choose.

I don't know much about Lemur, but it looks like it has a similar vector space model, and it's written in C++, so that might be easier for you to use.