Merge PDFs with pdftk with Bookmarks?

遥遥无期 2020-12-04 08:51

Using pdftk to merge multiple PDFs is working well. However, is there any easy way to make a bookmark for each PDF merged?

I don't see anything in the pdftk docs regardin

11 answers
  • 2020-12-04 09:35

    Sejda PDF (which was suggested in one of the answers) is also available as an online service: https://www.sejda.com/merge-pdf.

    This might come in handy if you don't want to install any additional software and prefer working online from a browser.

    Steps to merge:

    1. Drag and drop all PDF files onto the web page.
    2. By default, all existing bookmarks are preserved and will work in the merged document as well.
    3. Optionally, the merge tool can build a table of contents based on the PDF documents being combined.

    The online service to merge PDF files is free to use for up to 30 files per hour, with files of up to 50 MB or 200 pages.

    Disclaimer: I'm an open source dev working on Sejda.

  • 2020-12-04 09:38

    You can also merge multiple PDFs with Ghostscript. The big advantage of this route is that a solution is easily scriptable, and it does not require a real programming effort:

    gswin32c.exe ^
              -dBATCH -dNOPAUSE ^
              -sDEVICE=pdfwrite ^
              -sOutputFile=merged.pdf ^
              [...more Ghostscript options as needed...] ^
              input1.pdf input2.pdf input3.pdf [....]
    

    With Ghostscript you can also pass pdfmark statements, which can add a table of contents as well as a bookmark for each source file going into the resulting PDF. For example:

    gswin32c.exe ^
              -dBATCH -dNOPAUSE ^
              -sDEVICE=pdfwrite ^
              -sOutputFile=merged.pdf ^
              [...more Ghostscript options as needed...] ^
              file-with-pdfmarks-to-generate-a-ToC.ps ^
              -f input1.pdf input2.pdf input3.pdf [....]
    

    or

    gswin32c.exe ^
              -dBATCH -dNOPAUSE ^
              -sDEVICE=pdfwrite ^
              -sOutputFile=merged.pdf ^
              [...more Ghostscript options as needed...] ^
              file-with-pdfmarks-to-generate-a-ToC.ps ^
              -f input1.pdf ^
                 input2.pdf ^
                 input3.pdf [....]
    

    For some introduction to the pdfmark topic, see also Thomas Merz's PDFmark Primer.


    Edit:
    I had wanted to give you an example for file-with-pdfmarks-to-generate-a-ToC.ps, but somehow forgot it. Here it is:

    [/Page 1 /View [/XYZ null null null] /Title (File 1) /OUT pdfmark
    [/Page 2 /View [/XYZ null null null] /Title (File 2) /OUT pdfmark
    [/Page 3 /View [/XYZ null null null] /Title (File 3) /OUT pdfmark
    [/Page 4 /View [/XYZ null null null] /Title (File 4) /OUT pdfmark 
    

    This creates a ToC for the first 4 files, which here means the first 4 pages (since you guarantee that each of your source files is exactly 1 page long in the merged output PDF).

    1. The [/XYZ null null null] part makes sure the page viewport and zoom level do not change from the current ones when you follow the link. (To jump to a specific position and zoom level instead, you could say, as an arbitrary example, [/XYZ 222 111 2]: left offset 222, top offset 111, zoom factor 2.)
    2. The /Title (some string you want) part determines what text appears in the ToC.
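
The pdfmark file described above can also be generated instead of typed by hand. The following is only a minimal Python sketch, not part of the original answer: the function names `ps_escape` and `build_pdfmarks` and the `(title, page_count)` input format are assumptions made for illustration. Unlike the 1-page-per-file case, it derives each bookmark's target from the page counts of the preceding files, and it assumes plain ASCII titles.

```python
def ps_escape(text):
    """Escape backslashes and parentheses for a PostScript string literal."""
    return (text.replace("\\", "\\\\")
                .replace("(", "\\(")
                .replace(")", "\\)"))


def build_pdfmarks(files):
    """Return pdfmark lines for a list of (title, page_count) pairs.

    Each bookmark targets the first page of its file in the merged output,
    i.e. 1 plus the page counts of all files that come before it.
    """
    lines = []
    start_page = 1
    for title, page_count in files:
        lines.append(
            "[/Page %d /View [/XYZ null null null] /Title (%s) /OUT pdfmark"
            % (start_page, ps_escape(title))
        )
        start_page += page_count
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical input: three files with 1, 3 and 2 pages.
    print(build_pdfmarks([("File 1", 1), ("File 2", 3), ("File 3", 2)]), end="")
```

Writing the returned text to a .ps file gives something you could pass to Ghostscript in place of file-with-pdfmarks-to-generate-a-ToC.ps.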

    You could even pass these pdfmark statements directly on the Ghostscript command line:

    gswin32c.exe ^
           -o merged.pdf ^
           [...more Ghostscript options as needed...] ^
           -c "[/Page 1 /View [/XYZ null null null] /Title (File 1) /OUT pdfmark" ^
           -c "[/Page 2 /View [/XYZ null null null] /Title (File 2) /OUT pdfmark" ^
           -c "[/Page 3 /View [/XYZ null null null] /Title (File 3) /OUT pdfmark" ^
           -c "[/Page 4 /View [/XYZ null null null] /Title (File 4) /OUT pdfmark" ^
           -f input1.pdf ^
              input2.pdf ^
              input3.pdf ^
              input4.pdf [....]
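
For more than a handful of files, a command line like the one above could be assembled by a script. This is a hedged Python sketch: `gs_merge_args` is a hypothetical helper, it assumes every input is exactly one page long (as in the example), that titles contain no parentheses or backslashes, and that the executable is called `gs` (use `gswin32c` on Windows). It only builds the argument list and does not run Ghostscript.

```python
def gs_merge_args(inputs, titles, output="merged.pdf", gs="gs"):
    """Build a Ghostscript argument list with one bookmark per input file.

    Assumes each input is exactly one page long, so file N starts on page N.
    """
    if len(inputs) != len(titles):
        raise ValueError("need exactly one title per input file")
    args = [gs, "-sDEVICE=pdfwrite", "-o", output]
    for page, title in enumerate(titles, start=1):
        mark = ("[/Page %d /View [/XYZ null null null] /Title (%s) /OUT pdfmark"
                % (page, title))
        args += ["-c", mark]
    # -f ends the PostScript passed via -c; the input files follow it.
    return args + ["-f"] + list(inputs)


if __name__ == "__main__":
    for arg in gs_merge_args(["input1.pdf", "input2.pdf"], ["File 1", "File 2"]):
        print(arg)
```

The resulting list can be handed to `subprocess.run(args, check=True)`; passing a list rather than a shell string avoids quoting problems with the brackets and parentheses.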
    



    'nother Edit:

    Oh, and by the way: Ghostscript does preserve the bookmarks when you use it to merge two PDF files into one -- pdftk.exe does not. Let's use the one generated by the command of my first edit (effectively concatenating 2 copies of the same file):

     gswin32c ^
        -sDEVICE=pdfwrite ^
        -o doublemerged.pdf ^
         merged.pdf ^
         merged.pdf
    

    The file doublemerged.pdf will now have 2*4 = 8 bookmarks.

    • As expected, bookmarks 1, 2, 3 and 4 link to pages 1, 2, 3 and 4.
    • The problem is that bookmarks 5, 6, 7 and 8 also link to pages 1, 2, 3 and 4.

    The reason is that the pre-existing bookmarks address their link targets by absolute page numbers. To work around that (and make bookmarks work in merged files), one would have to generate bookmarks that point to their link targets by named destinations (and make sure these names are unique across the documents being merged).
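
To make the page-number problem concrete, here is a small Python model of the arithmetic. It is only an illustration, not something Ghostscript does: the function `shift_bookmarks` and its data layout are made up for this sketch. It shows the alternative to named destinations, namely shifting each absolute page target by the number of pages that precede its document in the merged file.

```python
def shift_bookmarks(documents):
    """Merge bookmark lists, shifting page targets by preceding page counts.

    `documents` is a list of (page_count, bookmarks) pairs in merge order;
    `bookmarks` is a list of (title, page) pairs with 1-based page numbers.
    """
    merged = []
    offset = 0
    for page_count, bookmarks in documents:
        for title, page in bookmarks:
            merged.append((title, page + offset))
        offset += page_count
    return merged


if __name__ == "__main__":
    # Model of merged.pdf: 4 pages, one bookmark per page.
    doc = (4, [("File %d" % n, n) for n in range(1, 5)])
    for title, page in shift_bookmarks([doc, doc]):
        print(title, "->", page)
```

In this model the second copy's bookmarks land on pages 5 to 8, which is where they would need to point in doublemerged.pdf.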

    (The Ghostscript commands above also work on Linux; just use gs instead of gswin32c.)


    Appendix

    The command lines above use [...more Ghostscript options as needed...] as a placeholder for more options.

    If you do not use other options, Ghostscript applies its built-in defaults for various parameters. However, the results may not be to your liking. Since Ghostscript generates a completely new PDF based on the input, some of the original objects may be changed. This is true for color spaces and for image compression levels.

    How to apply parameters which leave the originally embedded images unchanged can be seen over at SuperUser: "Use Ghostscript, but tell it to not reprocess images".

  • 2020-12-04 09:41

    See this answer at https://stackoverflow.com/a/17781138/547578. I used something called Sejda. It works. It combines the bookmarks perfectly. Thanks @blablatros.

  • 2020-12-04 09:44

    There is PdfMod. It has a graphical interface and lets you add bookmarks manually. Also, if you edit a PDF that already comes with bookmarks, it will update them automatically to point to the correct pages.

  • 2020-12-04 09:49

    The following is intended to be a comment to the answer by pdfmerger (https://stackoverflow.com/a/30524828/3915004).

    Thanks for your script pdfmerger! I know the question is marked linux, but to generalize your script for Mac OS X, 2 things are needed:

    • ghostscript gs and
    • the command pdfinfo (which is included e.g. in poppler)

    Install them by first getting Homebrew (brew; google it, it is installed via some curl/ruby-magic command ^^ ) and then simply running:

    brew install ghostscript
    brew install poppler
    

    ADD-ON: READ TEXT-FILE WITH CHAPTER TITLES:

    To expand on your script: I use this workflow mainly for books that are available as per-chapter downloads from the publisher's website. A text file containing the chapter names can easily be generated. The following add-on to your code additionally reads a text file 'chapters.txt' containing one line per PDF to merge. (Note that I didn't implement any check that the number of lines matches the number of PDFs.)

    Simply expand your script by replacing the following lines:

    # List the PDFs to merge and read the chapter titles (one line per PDF).
    p = subprocess.Popen('ls *pdf', shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    c = subprocess.Popen('cat chapters.txt', shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    
    pdfdateien = []
    kombinationen = []
    chapternames = []
    
    # c contains all chapter titles
    for line in c.stdout.readlines():
      chapternames.append(line)
    
    for line in p.stdout.readlines():
    

    and

    for index, kombination in enumerate(kombinationen):
    #  dateiname = kombination[0][0:len(kombination[0])-5]
    #
    # Optionally massage dateiname (the file name) here, e.g.:
    #  lesezeichen = dateiname[0:1]+" "+dateiname[6:8]+"/"+dateiname[1:5]
    #  lesezeichen = dateiname
      # Use the matching chapter title, minus its trailing newline, as the bookmark text
      lesezeichen=chapternames[index][:-1]
    
      # Page count (presumably sliced out of the pdfinfo "Pages:" line)
      anz_seiten = kombination[1][16:len(kombination[1])-1]
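
The missing check mentioned above (number of lines versus number of PDFs) could look roughly like this. It is a standalone Python 3 sketch rather than a drop-in patch for pdfmerger's script: the helper names `read_chapter_titles` and `pair_titles_with_pdfs` are hypothetical, and it assumes the PDFs should be taken in alphabetical order, as `ls *pdf` would list them.

```python
def read_chapter_titles(path="chapters.txt"):
    """Return the non-empty lines of the chapter file, without newlines."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def pair_titles_with_pdfs(pdf_names, titles):
    """Pair each PDF (sorted by name) with its chapter title.

    Raises ValueError if the number of titles and PDFs differ.
    """
    if len(pdf_names) != len(titles):
        raise ValueError(
            "%d PDFs but %d chapter titles" % (len(pdf_names), len(titles))
        )
    return list(zip(sorted(pdf_names), titles))


if __name__ == "__main__":
    # Hypothetical file names and titles, for illustration only.
    pairs = pair_titles_with_pdfs(["ch2.pdf", "ch1.pdf"], ["Intro", "Methods"])
    for pdf_name, title in pairs:
        print(pdf_name, "->", title)
```

Failing early with a clear message is preferable to silently producing bookmarks with the wrong titles.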
    