Question
How can I modify a Microsoft Office document's metadata? I found a number of results for JPG, PNG, and PDF files. Can anyone suggest libraries for Office file metadata?
Answer 1:
The newer Office formats (.docx, .xlsx, .pptx) are just zipped XML, so you can use standard libraries to unzip the file and parse the XML. The following code, previously posted as an answer on Stack Overflow, grabs the document creator:
import zipfile
import lxml.etree

# open the .docx as a zip archive and read the core-properties part
with zipfile.ZipFile('my_doc.docx') as zf:
    core_xml = zf.read('docProps/core.xml')
# use lxml to parse the xml file we are interested in
doc = lxml.etree.fromstring(core_xml)
# retrieve creator (dc is the Dublin Core namespace)
ns = {'dc': 'http://purl.org/dc/elements/1.1/'}
creator = doc.xpath('//dc:creator', namespaces=ns)[0].text
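The snippet above only reads the creator. Since the question asks about modifying metadata, here is a minimal sketch of one way to write a changed value back. It makes several assumptions that are not in the original answer: the helper name set_creator is hypothetical, it uses the standard library's xml.etree.ElementTree instead of lxml so it runs without extra packages (Python 3.8+), and it writes a new file rather than editing in place, because zipfile cannot replace a member of an existing archive.

```python
import zipfile
import xml.etree.ElementTree as ET

CORE_PROPS = 'docProps/core.xml'
# Namespaces commonly declared in docProps/core.xml.
NAMESPACES = {
    'cp': 'http://schemas.openxmlformats.org/package/2006/metadata/core-properties',
    'dc': 'http://purl.org/dc/elements/1.1/',
    'dcterms': 'http://purl.org/dc/terms/',
    'dcmitype': 'http://purl.org/dc/dcmitype/',
    'xsi': 'http://www.w3.org/2001/XMLSchema-instance',
}


def set_creator(src_path, dst_path, new_creator):
    """Copy src_path to dst_path, replacing dc:creator in the core properties.

    Hypothetical helper for illustration; dst_path must differ from src_path.
    """
    # Keep the usual prefixes when serializing instead of ns0, ns1, ...
    for prefix, uri in NAMESPACES.items():
        ET.register_namespace(prefix, uri)

    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, 'w', zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == CORE_PROPS:
                root = ET.fromstring(data)
                creator = root.find('dc:creator', NAMESPACES)
                if creator is None:
                    # No creator element yet, so add one.
                    creator = ET.SubElement(
                        root, '{%s}creator' % NAMESPACES['dc'])
                creator.text = new_creator
                data = ET.tostring(root, encoding='UTF-8',
                                   xml_declaration=True)
            # Every other part of the package is copied unchanged.
            dst.writestr(item, data)
```

A call such as set_creator('my_doc.docx', 'my_doc_edited.docx', 'Jane Doe') would leave the original untouched and produce an edited copy. Other core properties (for example dc:title) could presumably be changed the same way, but check the result in Office before relying on it.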
For the older binary formats (.doc, .xls, .ppt), you might want to look at the hachoir-metadata library.
Source: https://stackoverflow.com/questions/37559769/python-how-to-modify-metadata-of-microsoft-office-files