Question
I am using Lucene 3.5.0 to add basic search to my website. I want to store the index in a JDBC Directory in my MySQL database. I was going to use the Compass Project for this, but after more research and actually trying the code, I found that Compass is a dead project and is no longer compatible with the current version of Lucene.
Is there another option for storing my index in a JDBC Directory? Is there a reason Lucene does not offer this natively? Is storing the index on the HDD a better option for some reason?
Answer 1:
From the FAQ:
Lucene does not support that functionality out of the box, but several people have implemented JdbcDirectory's. The reports we have seen so far indicate that performance with such implementations is not great, but it is doable.
Another approach would be to store the index in the database as a BLOB. This could be useful if you have multiple nodes running your application. If you store a timestamp alongside the BLOB, each node can check whether the index has been updated and, if so, recreate its local index files from the DB.
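A minimal sketch of that idea, using only the Java standard library: the index directory is packed into a single ZIP byte array (the BLOB), unpacked again on another node, and a timestamp comparison decides whether a refresh is needed. The class name `IndexBlobSync`, its method names, and the ZIP packaging are assumptions made for illustration, not part of Lucene or the FAQ. The database side is left out; presumably the bytes and the timestamp would be written and read with ordinary JDBC calls such as `PreparedStatement.setBytes` and `ResultSet.getBytes` against a table of your own design.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: all names here are illustrative, not a Lucene API.
final class IndexBlobSync {

    /** Packs every regular file directly inside indexDir into one ZIP byte array. */
    public static byte[] pack(Path indexDir) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer);
             Stream<Path> listing = Files.list(indexDir)) {
            List<Path> files = listing.filter(Files::isRegularFile)
                                      .sorted()
                                      .collect(Collectors.toList());
            for (Path file : files) {
                zip.putNextEntry(new ZipEntry(file.getFileName().toString()));
                zip.write(Files.readAllBytes(file));
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The ZIP stream is closed at this point, so the archive is complete.
        return buffer.toByteArray();
    }

    /** Recreates the index files in targetDir from a ZIP byte array. */
    public static void unpack(byte[] blob, Path targetDir) {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(blob))) {
            Files.createDirectories(targetDir);
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                // Keep only the file name so an entry cannot escape targetDir.
                Path name = Paths.get(entry.getName()).getFileName();
                Files.copy(zip, targetDir.resolve(name), StandardCopyOption.REPLACE_EXISTING);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** A node refreshes only when the timestamp stored with the BLOB is newer than its own. */
    public static boolean needsRefresh(long localTimestamp, long storedTimestamp) {
        return storedTimestamp > localTimestamp;
    }
}
```

Each node would first read only the stored timestamp, call `needsRefresh`, and fetch and unpack the BLOB only when it returns true. Note that this copies the whole index on every update, so it suits small, infrequently updated indexes better than large or busy ones.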
Answer 2:
There are three related questions here. Although it breaks the forum's question/answer format, I will try to answer all of them:
Q: "Is there another option to store my index in a JDBC Directory [that is compatible with Lucene 4.x]?"
A: "Try a Google search, but no, nothing in widespread use. Most devs have moved to Elasticsearch, which bundles many aspects together into a much larger package. Sadly, JDBC is a dying thing as many K/V databases are taking hold. Interestingly enough, none of these solutions, from file systems to K/V databases without transaction support, is actually viable for a distributed, updatable index, due to unreliable operating-system file locking or a lack of ACID guarantees."
Q: "Is there a reason Lucene does not offer this native?"
A: "Ask the Lucene contributors, but from reading their documentation, again, they have moved on to more comprehensive and commercial solutions built on top of Elasticsearch, REST APIs, and generally language-neutral implementations (versus Lucene, the underlying Java-native implementation going back decades)."
Q: "Is storing on the HDD a better option for some reason?"
A: "Generally not recommended, due to unreliable OS file locking for updatable indexes and the lack of distribution support (envision multiple processes and nodes trying to update the same index files concurrently). Even AWS S3 has been shown not to be viable for this purpose, because it lacks locking and every update effectively requires deleting and re-creating S3 objects."
Source: https://stackoverflow.com/questions/8967664/lucene-jdbc-directory