Question
I'm playing around with a Solr-powered search for my webapp, and I figured it'd be best to use the DataImportHandler to handle syncing with the app via the database. I like the elegance of just checking the last_updated_date field. Good stuff. However, I don't know how to handle deleting documents with this approach. The way I see it, I've got two choices. I could either send an explicit delete message to Solr from the client when a document is deleted, or I could add a "deleted" flag and leave the object in the database, so that Solr will notice that the document has changed and is now "deleted." I could add a query filter that would disregard results with the deleted flag, but it seems inefficient to keep all the deleted documents in the Lucene index. What do other folks do?
Answer 1:
These are your options:
- Use DIH special commands $deleteDocById or $deleteDocByQuery (requires Solr 1.4+)
- Use the clean parameter of DIH to delete the whole index before importing.
- Use preImportDeleteQuery to define what's going to be cleaned up before importing. (requires Solr 1.4+)
- Use database triggers instead of DIH to manage updating the index.
- If you're using some sort of ORM, use its interception capabilities instead of DIH. For example, you can use Hibernate events to update the index on update, insert or delete.
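
For the last two options (and for the "explicit message" idea in the question), the application itself tells Solr to drop a document. Here is a minimal sketch of that, assuming Solr's XML update handler is reachable at a URL like the one below. The URL, the core name `mycore`, and the function names are all made up for illustration; only the `<delete><id>...</id></delete>` message format comes from Solr.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Hypothetical endpoint; adjust host, port and core name to your setup.
SOLR_UPDATE_URL = "http://localhost:8983/solr/mycore/update?commit=true"


def build_delete_message(doc_ids):
    """Build a Solr XML delete message such as <delete><id>42</id></delete>."""
    root = ET.Element("delete")
    for doc_id in doc_ids:
        # ElementTree escapes special characters (&, <, >) in the id text.
        ET.SubElement(root, "id").text = str(doc_id)
    return ET.tostring(root, encoding="unicode")


def send_delete(doc_ids, url=SOLR_UPDATE_URL):
    """POST the delete message to Solr and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=build_delete_message(doc_ids).encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

You would call something like `send_delete([42])` from the ORM's delete hook or from whatever process reacts to the database trigger. Committing on every delete (as `commit=true` does here) is simple but can be slow under load, so batching is probably worth considering.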
Answer 2:
I like to have a "deleted" flag so I don't actually delete my data! Depends on how paranoid you are. I like Mauricio's suggestions...
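
A rough sketch of how the flag approach can fit with the last_updated_date check, so that flagged rows get removed from the index instead of lingering there. The table name `item`, its columns, and the function names are assumptions for the demo, and SQLite stands in for the real database:

```python
import sqlite3


def split_changes(conn, last_index_time):
    """Split rows changed since last_index_time into docs to index and ids to delete."""
    rows = conn.execute(
        "SELECT id, title, deleted FROM item "
        "WHERE last_updated_date > ? ORDER BY id",
        (last_index_time,),
    ).fetchall()
    # Live rows are (re)indexed; soft-deleted rows are removed from the index.
    to_index = [{"id": str(r[0]), "title": r[1]} for r in rows if not r[2]]
    to_delete = [str(r[0]) for r in rows if r[2]]
    return to_index, to_delete


def demo():
    """Run split_changes against a tiny in-memory table (made-up data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE item ("
        "id INTEGER PRIMARY KEY, title TEXT, "
        "deleted INTEGER NOT NULL DEFAULT 0, "
        "last_updated_date TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO item VALUES (?, ?, ?, ?)",
        [
            (1, "old and unchanged", 0, "2009-10-01 00:00:00"),
            (2, "recently edited", 0, "2009-10-12 09:30:00"),
            (3, "recently deleted", 1, "2009-10-12 10:00:00"),
        ],
    )
    return split_changes(conn, "2009-10-10 00:00:00")
```

The point is that setting the flag also has to bump last_updated_date, otherwise the sync never sees the row. The ids in the second list are the ones you'd hand to Solr for deletion, which is roughly what DIH's $deleteDocById does for you.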
Source: https://stackoverflow.com/questions/1555610/solr-dih-how-to-handle-deleted-documents