Apache Jackrabbit JCA 2.7.5 .docx and .xlsx indexing

一个人的身影 2020-12-17 09:15

I'm using Apache Jackrabbit JCA 2.7.5. The problem is that .docx and .xlsx files are not indexed.

My steps:

  • Deploy the Jackrabbit JCA as
2 answers
  • 2020-12-17 09:54

    The solution lies in the JARs bundled inside jackrabbit-jca-2.7.5.rar.

    There were dependency errors, so I made these changes:

    • add apache-mime4j-0.6.jar
    • add apache-mime4j-core-0.7.jar
    • add commons-compress-1.5.jar

    Add these JARs to jackrabbit-jca-2.7.5.rar before deploying it.
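A minimal sketch of that repackaging step, assuming the .rar here is a Java EE resource adapter archive (ZIP format, not WinRAR) and that the dependency JARs belong at the archive root next to the other bundled JARs. The helper name `add_jars_to_rar` and the paths in the usage note are hypothetical; work on a backup copy of the archive.

```python
import os
import zipfile


def add_jars_to_rar(rar_path, jar_paths):
    """Append each JAR to the archive root unless an entry with that name exists.

    Returns the list of entry names that were actually added.
    """
    added = []
    # Mode "a" appends to an existing ZIP-format archive without rewriting it.
    with zipfile.ZipFile(rar_path, "a") as rar:
        existing = set(rar.namelist())
        for jar_path in jar_paths:
            name = os.path.basename(jar_path)
            if name in existing:
                # Skip duplicates so the archive never holds two entries with one name.
                continue
            rar.write(jar_path, arcname=name)
            existing.add(name)
            added.append(name)
    return added
```

It could be called, for example, as `add_jars_to_rar("jackrabbit-jca-2.7.5.rar", ["libs/apache-mime4j-0.6.jar", "libs/apache-mime4j-core-0.7.jar", "libs/commons-compress-1.5.jar"])`, where the `libs/` directory is only a placeholder for wherever the JARs were downloaded.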

    After that, indexing of .docx, .xlsx, ... works successfully!

    Thanks to @Ashok Felix.

  • 2020-12-17 09:59

    Ref: http://jackrabbit.510166.n4.nabble.com/Office-2007-documents-not-being-indexed-in-Jackrabbit-2-4-3-td4657380.html

    Along the same lines, I have observed that commons-compress-1.5.jar is required by the Tika parser for OOXML document types (i.e. Office 2007 documents).

    Now I am able to index and search most document types (Office 2007: docx, pptx, xlsx; Office 2003: doc, ppt, xls; PDF) using the two steps below:

    (1) Updated repository.xml & added Further details can be found at https://issues.apache.org/jira/browse/JCR-3287

    (2) Added commons-compress-1.5.jar to the classpath when running jackrabbit-standalone-2.6.2.jar
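One detail worth noting for step (2): the `java` launcher ignores `-cp` when `-jar` is used, so an extra JAR normally has to be put on the classpath together with the standalone JAR, with the main class named explicitly. The sketch below is one way to do that and is not taken from the answer above. It reads the `Main-Class` entry from the JAR manifest instead of assuming it; the helper names `read_main_class` and `build_command` are hypothetical.

```python
import os
import zipfile


def read_main_class(jar_path):
    """Return the Main-Class declared in the JAR's META-INF/MANIFEST.MF."""
    with zipfile.ZipFile(jar_path) as jar:
        manifest = jar.read("META-INF/MANIFEST.MF").decode("utf-8")
    # Long manifest values are wrapped onto continuation lines that start
    # with a single space; unfold them before searching.
    unfolded = manifest.replace("\r\n", "\n").replace("\n ", "")
    for line in unfolded.split("\n"):
        if line.startswith("Main-Class:"):
            return line.split(":", 1)[1].strip()
    raise ValueError("no Main-Class entry in " + jar_path)


def build_command(standalone_jar, extra_jars):
    """Build a java command line that runs the JAR's main class via -cp."""
    # os.pathsep is ':' on Unix-like systems and ';' on Windows.
    classpath = os.pathsep.join([standalone_jar, *extra_jars])
    return ["java", "-cp", classpath, read_main_class(standalone_jar)]
```

The resulting list can be passed to `subprocess.run`, for example `subprocess.run(build_command("jackrabbit-standalone-2.6.2.jar", ["commons-compress-1.5.jar"]))`, assuming both JARs sit in the current directory.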
