How to extract text from an existing docx file using python-docx

不思量自难忘° 2020-11-27 15:59

I'm trying to use the python-docx module (pip install python-docx), but it seems very confusing, as in the GitHub repo test sample they are using

7 answers
  • 2020-11-27 16:45

    There are two "generations" of python-docx. The initial generation ended with the 0.2.x versions and the "new" generation started at v0.3.0. The new generation is a ground-up, object-oriented rewrite of the legacy version. It has a distinct repository located here.

    The opendocx() function is part of the legacy API. The documentation is for the new version. The legacy version has no documentation to speak of.
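
For the new-generation API, a minimal sketch of text extraction might look like the following. It assumes python-docx v0.3.0 or later is installed; the helper names join_paragraphs and extract_text are hypothetical and chosen only for illustration. Document() and its paragraphs attribute come from the library itself.

```python
from typing import Iterable


def join_paragraphs(paragraphs: Iterable) -> str:
    """Join the .text of each paragraph-like object with newlines."""
    return "\n".join(p.text for p in paragraphs)


def extract_text(path_or_stream) -> str:
    """Return the body text of a .docx file, one line per paragraph.

    Hypothetical helper: accepts a file path or a file-like object.
    """
    # Imported here so join_paragraphs stays usable without python-docx.
    from docx import Document  # provided by `pip install python-docx`

    document = Document(path_or_stream)
    # document.paragraphs covers top-level body paragraphs only; text in
    # tables, headers, and footers is not included in this sketch.
    return join_paragraphs(document.paragraphs)
```

Calling extract_text("example.docx") (a placeholder file name) would return the paragraph text as a single string. Text inside tables would need a separate pass over document.tables.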

    Neither reading nor writing hyperlinks is supported in the current version. That capability is on the roadmap, and the project is under active development. The API turns out to be quite broad because Word has so much functionality. So we'll get to it, but probably not in the next month unless someone decides to focus on that aspect and contribute it. UPDATE: Hyperlink support was added subsequent to this answer.
