Python: How to modify metadata of Microsoft Office files?

Submitted by 北城以北 on 2020-03-01 07:56:20

Question


How can I modify a Microsoft Office document's metadata? I found a number of results for JPG, PNG and PDF files. Can anyone suggest libraries for Office file metadata?


Answer 1:


The newer formats (.docx, .xlsx, .pptx) are essentially zipped XML, so you can use standard libraries to unzip the file and parse the XML. Some code to grab the document creator was previously posted as an answer on Stack Overflow:

import zipfile
import lxml.etree

# open the document, which is an ordinary zip archive
with zipfile.ZipFile('my_doc.docx') as zf:
    # use lxml to parse the XML file we are interested in
    doc = lxml.etree.fromstring(zf.read('docProps/core.xml'))
# retrieve the creator (stored in the Dublin Core namespace)
ns = {'dc': 'http://purl.org/dc/elements/1.1/'}
creator = doc.xpath('//dc:creator', namespaces=ns)[0].text
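
The snippet above only reads the creator, whereas the question asks about modifying it. A zip archive cannot be edited in place with the standard library, so one possible approach is to copy every entry into a new archive and swap in an edited docProps/core.xml along the way. The following is a minimal sketch under that assumption, not a definitive implementation: set_creator, its parameters, and the output file name are hypothetical names chosen for illustration, and it uses the standard-library xml.etree.ElementTree instead of lxml so that it has no third-party dependencies (it needs Python 3.8+ for the xml_declaration argument).

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces commonly declared in docProps/core.xml.
NAMESPACES = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
    "dcmitype": "http://purl.org/dc/dcmitype/",
    "xsi": "http://www.w3.org/2001/XMLSchema-instance",
}
CORE_PROPS = "docProps/core.xml"


def set_creator(src_path, dst_path, new_creator):
    """Copy src_path to dst_path with dc:creator replaced (hypothetical helper)."""
    # Register the prefixes so the rewritten XML keeps names like dc:creator
    # instead of auto-generated ns0:, ns1: prefixes.
    for prefix, uri in NAMESPACES.items():
        ET.register_namespace(prefix, uri)

    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == CORE_PROPS:
                root = ET.fromstring(data)
                creator = root.find("dc:creator", NAMESPACES)
                if creator is None:
                    # The document has no creator yet, so add the element.
                    creator = ET.SubElement(
                        root, "{%s}creator" % NAMESPACES["dc"])
                creator.text = new_creator
                data = ET.tostring(root, encoding="UTF-8",
                                   xml_declaration=True)
            # Every other entry is copied through unchanged.
            dst.writestr(item, data)
```

For example, set_creator('my_doc.docx', 'my_doc_edited.docx', 'Jane Doe') would write a copy of the document with the new creator. Writing to a separate file keeps the original intact in case the result does not open correctly in Office.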

For older formats (such as .doc and .xls) you might want to look at the hachoir-metadata library.



Source: https://stackoverflow.com/questions/37559769/python-how-to-modify-metadata-of-microsoft-office-files
