How to repair a corrupted Lucene index?

广开言路 2021-02-07 19:02

My server lost power and the Lucene index was corrupted. I ran Lucene's index checker (CheckIndex), but it failed:

java -cp /home/dthoai/programs/paesia/checker/lucene-core-3.5.0.jar -         


        
2 answers
  • 2021-02-07 19:29

    It looks like the main directory file, segments_N, is corrupted. This probably means that the power loss happened while a commit was in progress.

    If so, there is a chance that an older segments_N file is still present in your directory, and that the segments it references are still present and valid. If you find such a file, try removing the corrupted segments_ls0l file and check:

    • whether Lucene manages to open the index,
    • what data you are missing.

    Otherwise, there are some threads on the Lucene java-user mailing list about regenerating the segments_N file:

    • http://www.gossamer-threads.com/lists/lucene/java-user/102493
    • http://www.gossamer-threads.com/lists/lucene/java-user/39744

    Make sure to back up your index directory before making any modification.
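To see whether an older commit file is available, it can help to list the segments_N files ordered by generation. The following is a minimal sketch, not part of the original answer: the class name `SegmentsFiles` and its methods are hypothetical, and it relies only on the fact that the suffix of a segments_N file name is the commit generation written in base 36. It uses plain `java.nio` and does not open the index.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper: lists commit files (segments_N) in an index directory, newest first.
class SegmentsFiles {

    /** Returns the generation encoded in a name such as "segments_ls0l", or -1 if it is not a commit file. */
    static long generationOf(String fileName) {
        String prefix = "segments_";
        if (!fileName.startsWith(prefix) || fileName.length() == prefix.length()) {
            return -1; // e.g. segments.gen or an unrelated file
        }
        try {
            // The suffix is the generation in base 36 (Character.MAX_RADIX).
            return Long.parseLong(fileName.substring(prefix.length()), Character.MAX_RADIX);
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    /** Lists all segments_N files in indexDir, sorted from the highest generation to the lowest. */
    static List<Path> listCommitFiles(Path indexDir) throws IOException {
        List<Path> commits = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(indexDir, "segments_*")) {
            for (Path file : stream) {
                if (generationOf(file.getFileName().toString()) >= 0) {
                    commits.add(file);
                }
            }
        }
        commits.sort(Comparator
                .comparingLong((Path p) -> generationOf(p.getFileName().toString()))
                .reversed());
        return commits;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to a COPY of the index directory (work on a backup, not the live index).
        for (Path commit : listCommitFiles(Paths.get(args[0]))) {
            String name = commit.getFileName().toString();
            System.out.println(name + "  generation=" + generationOf(name)
                    + "  size=" + Files.size(commit) + " bytes");
        }
    }
}
```

If more than one file is listed, the first entry is the newest commit (the one suspected to be corrupted here) and the following entries are the older candidates mentioned above.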

  • 2021-02-07 19:35

    I fixed the corrupted Lucene index by following jpountz's answer.

    This is the error from our log:

    > 2020-11-11 12:52:06,119 (BasicLuceneIndexer.java:87) INFO  com.softslate.commerce.businessobjects.product.BasicLuceneIndexer - Reindexing products.
    > 2020-11-11 12:52:06,119 (BasicLuceneIndexer.java:59) INFO  com.softslate.commerce.businessobjects.product.BasicLuceneIndexer - Writing new index to: /app/etalaze_staging/apache-tomcat-8.0.17/webapps/jatis.etalaze.community/WEB-INF/lucene/new
    > 2020-11-11 12:52:06,171 (BaseRequestProcessor.java:605) WARN  com.softslate.commerce.customer.core.BaseRequestProcessor - Exception follows: 
    > org.apache.lucene.index.CorruptIndexException: checksum mismatch in segments file
    >   at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:248)
    >   at org.apache.lucene.index.IndexFileDeleter.<init>(IndexFileDeleter.java:175)
    >   at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1109)
    >   at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:626)
    >   at com.softslate.commerce.businessobjects.product.BasicLuceneIndexer.getIndexWriter(BasicLuceneIndexer.java:62)
    >   at com.softslate.commerce.businessobjects.product.BasicLuceneIndexer.reindex(BasicLuceneIndexer.java:88)
    >   at com.softslate.commerce.administrator.product.LuceneAddAllAction.execute(LuceneAddAllAction.java:44)
    >   at org.apache.struts.action.RequestProcessor.processActionPerform(RequestProcessor.java:425)
    >   at org.apache.struts.action.RequestProcessor.process(RequestProcessor.java:228)
    >   at org.apache.struts.action.ActionServlet.process(ActionServlet.java:1913)
    >   at org.apache.struts.action.ActionServlet.doPost(ActionServlet.java:462)
    >   at javax.servlet.http.HttpServlet.service(HttpServlet.java:644)
    >   at javax.servlet.http.HttpServlet.service(HttpServlet.java:725)
    >   at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:291)
    >   at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    >   at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
    >   at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:239)
    >   at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    >   at com.softslate.commerce.administrator.core.AdministratorFilter.doFilter(AdministratorFilter.java:44)
    >   at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:239)
    >   at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    >   at com.softslate.commerce.customer.core.SEOFilter.doFilter(SEOFilter.java:92)
    >   at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:239)
    >   at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    >   at com.softslate.commerce.customer.core.HibernateFilter.doFilter(HibernateFilter.java:75)
    >   at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:239)
    >   at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    >   at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:219)
    >   at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:106)
    >   at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:501)
    >   at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:142)
    >   at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:79)
    >   at org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:610)
    >   at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:88)
    >   at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:516)
    >   at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1086)
    >   at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:659)
    >   at org.apache.coyote.http11.Http11NioProtocol$Http11ConnectionHandler.process(Http11NioProtocol.java:223)
    >   at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1558)
    >   at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1515)
    >   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    >   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    >   at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
    >   at java.lang.Thread.run(Thread.java:745)
    

    Here is the story behind it, for context. On 27 October 2020, our office lost power at 11:18 am. I think that is what corrupted the Lucene index, probably because a commit was interrupted.

    Every time we ran a reindex, it produced the error above and created a new segments file. This repeated over and over until 11 November 2020. By then, the lucene/new directory contained 44 segments files (e.g. segments_1, segments_2, segments_3, ... segments_N).

    Solution: I backed up the lucene folder, then deleted all the segments_N files except the newest one, so that only the newest segments_N file and segments.gen were kept.

    After that, the error did not appear again, and everything works as before.
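The manual steps described in this answer (back up the folder, then delete every segments_N file except the newest) could be scripted roughly as follows. This is only a sketch under stated assumptions: the class name `KeepNewestCommit` and its methods are hypothetical, the index directory is assumed to be flat, and "newest" is taken to mean the highest base-36 generation in the file name. Note that this matches the situation in this answer, not the first answer, where the newest segments_N file was the corrupted one.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: backs up an index directory, then keeps only the newest segments_N file.
class KeepNewestCommit {

    /** Generation encoded in "segments_N" (base 36), or -1 for any other file name. */
    static long generationOf(String fileName) {
        String prefix = "segments_";
        if (!fileName.startsWith(prefix) || fileName.length() == prefix.length()) {
            return -1; // segments.gen and data files are never touched
        }
        try {
            return Long.parseLong(fileName.substring(prefix.length()), Character.MAX_RADIX);
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    /** Copies every regular file of the (flat) index directory into backupDir. */
    static void backup(Path indexDir, Path backupDir) throws IOException {
        Files.createDirectories(backupDir);
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(indexDir)) {
            for (Path file : stream) {
                if (Files.isRegularFile(file)) {
                    Files.copy(file, backupDir.resolve(file.getFileName()),
                            StandardCopyOption.REPLACE_EXISTING);
                }
            }
        }
    }

    /** Deletes all segments_N files except the one with the highest generation; returns the deleted names. */
    static List<String> deleteOlderCommits(Path indexDir) throws IOException {
        List<Path> commits = new ArrayList<>();
        Path newest = null;
        long newestGen = -1;
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(indexDir, "segments_*")) {
            for (Path file : stream) {
                long gen = generationOf(file.getFileName().toString());
                if (gen < 0) {
                    continue;
                }
                commits.add(file);
                if (gen > newestGen) {
                    newestGen = gen;
                    newest = file;
                }
            }
        }
        List<String> deleted = new ArrayList<>();
        for (Path file : commits) {
            if (!file.equals(newest)) {
                Files.delete(file);
                deleted.add(file.getFileName().toString());
            }
        }
        return deleted;
    }

    public static void main(String[] args) throws IOException {
        Path indexDir = Paths.get(args[0]);  // e.g. the lucene/new folder
        Path backupDir = Paths.get(args[1]); // where the backup copy goes
        backup(indexDir, backupDir);         // always back up first
        System.out.println("Deleted: " + deleteOlderCommits(indexDir));
    }
}
```

The application should be stopped (no open IndexWriter) while this runs, and the backup lets you restore the folder if the index still fails to open afterwards.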
