How to add page numbers to a PDF file?

Submitted by 余生长醉 on 2019-12-11 00:14:08

Question


I've been trying all morning to add page numbers to a PDF document, but I can't figure it out. I'd like to use Python, with pyPdf or reportlab.

Does anyone have any ideas?


Answer 1:


Here is my Python code to add page numbers to a PDF file. It uses both PyPDF2 and reportlab.

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

helpDoc = '''
Add page numbers to a PDF file with Python
usage:
    python addPageNumberToPDF.py [PDF path]
requirements:
    pip install reportlab "pypdf2<3"
    Works on Python 2 and 3, but Python 3 is recommended
    (PdfFileReader/PdfFileWriter were removed in PyPDF2 3.0)

tips:
    * output is saved under pdfWithNumbers/ as [PDF name]_page_N.pdf
    * only A4-size PDFs are supported
    * tested on Python 2 and Python 3 on Ubuntu
    * larger PDFs require more RAM
    * if you get a segmentation fault, please try Python 3
    * if the generated PDF is damaged, please try Python 3

Author:
    Lei Yang (ylxx@live.com)

GitHub:
    https://gist.github.com/DIYer22/b9ede6b5b96109788a47973649645c1f
'''
print(helpDoc)

import reportlab
from reportlab.lib.units import mm
from reportlab.pdfgen import canvas

from PyPDF2 import PdfFileWriter, PdfFileReader

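def createPagePdf(num, tmp):
    """Write a num-page PDF to tmp that contains only a page number on each page."""
    # Canvas uses reportlab's default page size, which is A4 (210 mm x 297 mm).
    c = canvas.Canvas(tmp)
    for i in range(1, num + 1):
        # Centre the number horizontally, 4 mm above the bottom edge.
        c.drawCentredString((210 / 2.0) * mm, 4 * mm, str(i))
        c.showPage()
    c.save()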


if __name__ == "__main__":
    import os
    import sys

    # Default input file, used only when no path is given on the command line.
    path = 'MLDS17f.pdf'
    if len(sys.argv) == 1:
        if not os.path.isfile(path):
            sys.exit(1)
    else:
        path = sys.argv[1]
    name, ext = os.path.splitext(os.path.basename(path))

    # Write the result next to the input file, in a pdfWithNumbers/ subdirectory.
    outDir = os.path.join(os.path.dirname(path), 'pdfWithNumbers')
    if not os.path.isdir(outDir):
        os.makedirs(outDir)

    tmp = "__tmp.pdf"  # temporary PDF holding one page number per page

    # Pages per output file; 0 means write everything to a single file.
    batch = 0
    output = PdfFileWriter()
    with open(path, 'rb') as fin:
        pdf = PdfFileReader(fin, strict=False)
        n = pdf.getNumPages()
        if batch == 0:
            batch = n
        createPagePdf(n, tmp)
        with open(tmp, 'rb') as ftmp:
            numberPdf = PdfFileReader(ftmp)
            for p in range(n):
                # The current batch is full: write it out and start a new writer.
                if p and p % batch == 0:
                    newpath = os.path.join(outDir, '%s_page_%d%s' % (name, p // batch, ext))
                    with open(newpath, 'wb') as fout:
                        output.write(fout)
                    output = PdfFileWriter()
                print('page: %d of %d' % (p + 1, n))
                page = pdf.getPage(p)
                # Overlay the matching page-number page on the original page.
                page.mergePage(numberPdf.getPage(p))
                output.addPage(page)
            # Write the remaining pages (all of them when not batching).
            if output.getNumPages():
                newpath = os.path.join(outDir, '%s_page_%d%s' % (name, (n - 1) // batch + 1, ext))
                with open(newpath, 'wb') as fout:
                    output.write(fout)

    os.remove(tmp)

Check the latest version at my GitHub gist: https://gist.github.com/DIYer22/b9ede6b5b96109788a47973649645c1f
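
The script above hard-codes the A4 width (210 mm), which is why it only supports A4 pages. One way to lift that limit might be to derive the position from each page's own MediaBox. The sketch below shows only the arithmetic and uses no PDF library; `bottom_centre` and `margin_mm` are hypothetical names, and the box is assumed to be the usual four numbers `[x0, y0, x1, y1]` in points.

```python
MM = 72.0 / 25.4  # points per millimetre (1 pt = 1/72 inch, 1 inch = 25.4 mm)


def bottom_centre(media_box, margin_mm=4.0):
    """Return (x, y) in points for a label centred at the bottom of a page.

    media_box is [x0, y0, x1, y1] in points (two opposite corners).
    """
    x0, y0, x1, y1 = (float(v) for v in media_box)
    # Normalise in case the corners are given in the opposite order.
    left, right = min(x0, x1), max(x0, x1)
    bottom = min(y0, y1)
    return (left + right) / 2.0, bottom + margin_mm * MM


if __name__ == "__main__":
    # US Letter is 612 x 792 points; the label lands at the horizontal centre.
    print(bottom_centre([0, 0, 612, 792]))
```

The returned pair could then be passed to reportlab's `drawCentredString(x, y, text)`, with the canvas page size set per page through `setPageSize((width, height))` before drawing. How the MediaBox is read depends on the PDF library and version in use.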




Answer 2:


What you want to do is very similar to watermarking a PDF, except that you are placing a different watermark on every page.

pdfrw (disclaimer: I am the author) supports watermarking and includes a watermark example. You could use reportlab to programmatically create a PDF that contains only the page numbers you want, one page for each page in the destination document, and then use pdfrw to overlay each page of that document on top of your original document. When you use pdfrw, you may want to reuse the original PDF trailer in order to keep bookmarks and other document-level data. The pdfrw watermark example shows how to do this.

Since both libraries are Python, you could use them from the same program and, for example, use pdfrw to find out how many pages reportlab needs to generate.
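
A minimal sketch of the overlay approach described in this answer, assuming pdfrw and reportlab are installed and every page is A4. The function name `add_page_numbers`, the scratch file name, and the 12 pt bottom offset are made up for illustration; the pdfrw calls follow the pattern of its watermark example rather than reproducing it.

```python
import os


def add_page_numbers(src_path, dst_path, numbers_path="__numbers_tmp.pdf"):
    """Overlay 1-based page numbers on every page of src_path.

    Assumes all pages are A4; numbers_path is a scratch file that is
    deleted afterwards.
    """
    # Imported here so the module still loads where the optional PDF
    # libraries are not installed.
    from pdfrw import PageMerge, PdfReader, PdfWriter
    from reportlab.lib.pagesizes import A4
    from reportlab.pdfgen import canvas

    # Keep the original document object so its trailer (bookmarks etc.)
    # is reused when writing the result.
    trailer = PdfReader(src_path)
    count = len(trailer.pages)

    # Step 1: use reportlab to build a PDF that contains only the numbers.
    width, _height = A4
    c = canvas.Canvas(numbers_path, pagesize=A4)
    for number in range(1, count + 1):
        c.drawCentredString(width / 2.0, 12, str(number))  # 12 pt from bottom
        c.showPage()
    c.save()

    # Step 2: use pdfrw to stamp each number page onto the matching page.
    try:
        numbers = PdfReader(numbers_path)
        for page, number_page in zip(trailer.pages, numbers.pages):
            PageMerge(page).add(number_page).render()
        PdfWriter(dst_path, trailer=trailer).write()
    finally:
        os.remove(numbers_path)
```

It could be called as `add_page_numbers("input.pdf", "numbered.pdf")`, where both file names are placeholders.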



Source: https://stackoverflow.com/questions/31291282/how-to-add-page-number-to-a-pdf-file
