Java library for reading Word documents

暗喜 2021-01-15 07:48

Is there an open-source Java library for reading Word documents (both .docx and the older .doc format)?

Read-only access is sufficient; I do not need to modify the Word documents.

3 answers
  • 2021-01-15 08:36
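    import java.io.File;
    import java.io.FileInputStream;
    import java.io.IOException;
    import org.apache.poi.openxml4j.exceptions.OpenXML4JException;
    import org.apache.poi.openxml4j.opc.OPCPackage;
    import org.apache.poi.xwpf.extractor.XWPFWordExtractor;
    import org.apache.xmlbeans.XmlException;

    public class XParseTest
    {
        public static void main(String[] args) throws XmlException, OpenXML4JException, IOException
        {
            File file = new File("e:\\testing\\new.docx");
            // try-with-resources closes the stream and the extractor when done
            try (FileInputStream fs = new FileInputStream(file);
                 XWPFWordExtractor xw = new XWPFWordExtractor(OPCPackage.open(fs)))
            {
                // Print the plain text of the document
                System.out.println(xw.getText());
            }
        }
    }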
    

    This parses a .docx file (using Apache POI) and prints its text.

  • 2021-01-15 08:39

    Use Apache POI: HWPF for .doc files and XWPF for .docx files.
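
    Because the two formats need different POI components, code that accepts both has to tell them apart before it picks one. The sketch below is one possible way to do that with the standard library only: it looks at the first bytes of the file, since legacy .doc files are OLE2 compound files and .docx files are ZIP archives. The names `WordFormatSniffer`, `WordFormat`, and `detect` are made up for illustration and are not part of POI. A match only says which container the file uses, not that it really is a Word document.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper, not a POI class: guesses which POI component fits a file.
final class WordFormatSniffer {

    // Illustrative labels: HWPF handles .doc, XWPF handles .docx.
    enum WordFormat { DOC_HWPF, DOCX_XWPF, UNKNOWN }

    // Signature of an OLE2 compound file (container of legacy .doc files).
    private static final byte[] OLE2 = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Signature of a ZIP local file header (container of .docx files).
    private static final byte[] ZIP = { 'P', 'K', 3, 4 };

    // Classifies a file by the bytes at its start.
    static WordFormat detect(byte[] head) {
        if (startsWith(head, OLE2)) {
            return WordFormat.DOC_HWPF;
        }
        if (startsWith(head, ZIP)) {
            return WordFormat.DOCX_XWPF;
        }
        return WordFormat.UNKNOWN;
    }

    // Reads up to 8 bytes from the file and classifies them.
    static WordFormat detect(Path file) throws IOException {
        byte[] head = new byte[OLE2.length];
        int n = 0;
        try (InputStream in = Files.newInputStream(file)) {
            int r;
            while (n < head.length && (r = in.read(head, n, head.length - n)) != -1) {
                n += r;
            }
        }
        return detect(Arrays.copyOf(head, n));
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java WordFormatSniffer <file>");
            return;
        }
        System.out.println(detect(Paths.get(args[0])));
    }
}
```

    With the result in hand, a .docx file can go to `XWPFWordExtractor` as in the answer above, and a .doc file to the HWPF text extractor (`org.apache.poi.hwpf.extractor.WordExtractor`).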

  • 2021-01-15 08:39

    There is an Apache project that does this, Apache POI: http://poi.apache.org//
