Reading the PDF properties/metadata in Python
问题 How can I read the properties/metadata like Title, Author, Subject and Keywords stored on a PDF file using Python? 回答1: Try pdfminer: from pdfminer.pdfparser import PDFParser from pdfminer.pdfdocument import PDFDocument fp = open('diveintopython.pdf', 'rb') parser = PDFParser(fp) doc = PDFDocument(parser) print(doc.info) # The "Info" metadata Here's the output: >>> [{'CreationDate': 'D:20040520151901-0500', 'Creator': 'DocBook XSL Stylesheets V1.52.2', 'Keywords': 'Python, Dive Into Python,