Merge PDFs with PDFTK with Bookmarks?

遥遥无期 2020-12-04 08:51

Using pdftk to merge multiple PDFs is working well. However, is there an easy way to create a bookmark for each merged PDF?

I don't see anything in the pdftk docs regardin

11 answers
  • 2020-12-04 09:24

    Unfortunately, there is no easy way to do that. You could use the library that pdftk is built upon directly and write either a Java or a .NET program that uses iText or iTextSharp to merge your one-pagers and create the bookmarks. If you want to go the iText route, there are a lot of examples available online or in the iText book (written by the iText author).

    ... or, let me know what's not working and I can help.

  • 2020-12-04 09:25

    I know other ways to do this have already been mentioned, but with pdftk you can take the merged PDF and add bookmarks to it. Use the pdftk operation dump_data to create a .info file containing the PDF's existing info, then add bookmark info to the .info file by adding the following four lines for each bookmark:

    BookmarkBegin
    BookmarkTitle: name
    BookmarkLevel: level
    BookmarkPageNumber: page number
    

    Then use the update_info call to update the merged pdf bookmarks with the ones you wrote to the .info file. I have written some simple functions that do this for me in autohotkey if anyone is interested. See http://www.autohotkey.com/board/topic/98985-scripts-to-merge-pdfs-and-add-bookmarks-with-pdftk/
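
A minimal Python sketch of the dump_data / update_info workflow described above. The function names (`bookmark_block`, `add_bookmarks`) and the default `merged.info` file name are made up for illustration, and it assumes `pdftk` is on the PATH and that bookmark titles are plain ASCII:

```python
import subprocess


def bookmark_block(title, page, level=1):
    """Return one bookmark record in the four-line format shown above."""
    return (
        "BookmarkBegin\n"
        f"BookmarkTitle: {title}\n"
        f"BookmarkLevel: {level}\n"
        f"BookmarkPageNumber: {page}\n"
    )


def add_bookmarks(merged_pdf, output_pdf, bookmarks, info_file="merged.info"):
    """Dump the existing info, append bookmark records, write an updated PDF.

    bookmarks is a list of (title, page) or (title, page, level) tuples.
    info_file is a hypothetical scratch file name.
    """
    # Step 1: dump the existing info of the merged PDF to a text file
    subprocess.run(
        ["pdftk", merged_pdf, "dump_data", "output", info_file], check=True
    )

    # Step 2: append one four-line record per bookmark
    with open(info_file) as f:
        info = f.read()
    if info and not info.endswith("\n"):
        info += "\n"
    info += "".join(bookmark_block(*entry) for entry in bookmarks)
    with open(info_file, "w") as f:
        f.write(info)

    # Step 3: write a new PDF that carries the updated info
    subprocess.run(
        ["pdftk", merged_pdf, "update_info", info_file, "output", output_pdf],
        check=True,
    )
```

For example, `add_bookmarks("merged.pdf", "merged_bookmarked.pdf", [("First file", 1), ("Second file", 5)])` would add two top-level bookmarks, provided the page numbers match where each input starts in the merged file.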

  • 2020-12-04 09:26

    Maybe the following is helpful. I wanted to merge all PDFs (in_nn.pdf) located in one directory into a single out.pdf that has the names of the input PDFs (in_nn) as its ToC. I wrote a Python script which reads the names, extracts the page counts and generates a file named pdfmarks. Merging the files is then easily done with gs. The exact command is printed by the script and must be executed separately (maybe with some modifications due to page size adaptations or due to the operating system).

    Here it is. Perhaps some modifications are necessary for Windows? Just execute the Python script in the directory where the PDFs to be merged are located.

    #!/usr/bin/env python3
    
    import glob
    import subprocess
    
    # This script merges a set of PDFs into a single PDF and creates bookmarks for that file.
    # It generates the pdfmarks file that is needed for this.
    # Simply run this script in the directory that contains exactly the PDFs to be merged (*pdf, see below).
    # The merged PDF is then generated with this command (in bash):
    # gs -dBATCH -dNOPAUSE -sPAPERSIZE=A4 -sDEVICE=pdfwrite -sOutputFile="all.pdf" $(ls *pdf ) pdfmarks
    # Existing tables of contents are preserved; the new entries are appended to the end of the table of contents.
    #
    # pdfmarks basically looks like this:
    #
    # [/Title (Nr. 1) /Page 1 /OUT pdfmark
    # [/Title (Nr. 2) /Page 5 /OUT pdfmark
    # [/Title (Nr. 3) /Page 9 /OUT pdfmark
    # etc.
    
    # pdfdateien holds all PDF file names, sorted by name
    pdfdateien = sorted(glob.glob('*pdf'))
    kombinationen = []
    
    for datei in pdfdateien:
      # Pass the file name as a list element so that spaces in names are safe
      q = subprocess.run(['pdfinfo', datei], stdout=subprocess.PIPE, stderr=subprocess.STDOUT, universal_newlines=True)
      kombination = [datei]
    
      for subline in q.stdout.splitlines():
        # q.stdout holds the lines of the pdfinfo output
        if subline.startswith("Pages:"):
          kombination.append(subline)
    
      kombinationen.append(kombination)
    
    
    # Now bring kombinationen into the required format:
    
    kombinationen_bereinigt = []
    out_string1 = "[/Title ("
    out_string2 = ") /Page "
    out_string3 = " /OUT pdfmark\n"
    seitenzahl = 1
    
    for kombination in kombinationen:
      # Strip the ".pdf" extension
      dateiname = kombination[0][:-4]
    
      # Adjust dateiname here if needed, e.g.
      #  lesezeichen = dateiname[0:1]+" "+dateiname[6:8]+"/"+dateiname[1:5]
      lesezeichen = dateiname
    
      # The pdfinfo line looks like "Pages:          12"
      anz_seiten = kombination[1].split(":", 1)[1].strip()
      seitenzahl_str = str(seitenzahl)
    
      kombination_bereinigt = out_string1+lesezeichen+out_string2+seitenzahl_str+out_string3
      kombinationen_bereinigt.append(kombination_bereinigt)
    
      seitenzahl += int(anz_seiten)
    
    
    # Write the output file
    with open("pdfmarks", "w") as outfile:
      for i in kombinationen_bereinigt:
        outfile.write(i)
    
    # Print the merge command
    
    print("\nFor merging all pdfs execute this (or similar) command (in bash shell):")
    print("gs -dBATCH -dNOPAUSE -sPAPERSIZE=A4 -sDEVICE=pdfwrite -sOutputFile=\"all.pdf\" $(ls *pdf ) pdfmarks\n")
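
The core of the script is the running page counter: each bookmark points at the page where its file starts in the merged output. Here is a small sketch of just that step as a testable function. The names `escape_title` and `pdfmark_lines` are hypothetical, and the escaping of parentheses and backslashes is an addition that the script above does not do (they are special characters inside PostScript strings, so file names containing them could otherwise break the pdfmarks file):

```python
def escape_title(title):
    """Escape backslashes and parentheses for use inside a PostScript string."""
    return title.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def pdfmark_lines(documents):
    """Build one pdfmark bookmark line per document.

    documents is a list of (title, page_count) tuples in merge order.
    Each bookmark points at the first page of its document in the merged PDF.
    """
    lines = []
    first_page = 1
    for title, page_count in documents:
        lines.append(
            f"[/Title ({escape_title(title)}) /Page {first_page} /OUT pdfmark"
        )
        # The next document starts right after this one ends
        first_page += page_count
    return lines
```

With inputs of 4, 4 and 2 pages, the bookmarks land on pages 1, 5 and 9, matching the sample pdfmarks layout in the script's header comment.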
    
  • 2020-12-04 09:29

    To add or edit PDF bookmarks you could use JPdfBookmarks. It is an excellent multi-OS Free Software tool that I have been using for a while now with excellent results. It deals with bookmarks only, though, so you would need another tool to merge or reorder pages. In addition to pdftk, I suggest trying PDF Split and Merge (good app, but weird UI, and in my experience it messes up bookmarks), PDF-Shuffler (seems to work fine, but sometimes freezes on some files), or PdfMod (potentially the best, as it handles rearranging, merging and bookmarks, although I have not been able to figure out how to insert PDFs at a specific page).

    Sorry for not providing some links, as a newbie the system only allows me to add 2 hyperlinks.

  • 2020-12-04 09:31

    The recent version of pdftk (at least v2.02) handles bookmarks and links correctly:

    pdftk file1.pdf file2.pdf cat output merged.pdf
    
  • 2020-12-04 09:32

    @pipitas's good answer doesn't solve the bookmark issue perfectly, and there is a related question on the Unix Stack Exchange site, https://unix.stackexchange.com/questions/17065/add-and-edit-bookmarks-to-pdf/31070 , where I suggest:

    If you still want to stick with those Unix scripts, then:

    1. Extract the bookmark data dumped by pdftk.
    2. Write one extra script to convert the dumped bookmark data to the pdfmarks format, which the Ghostscript command gs accepts.
    3. Use gs to merge the files together with the pdfmarks.

    The script exists already, see pdf-merge.py from Merge PDFs with PDFTK with Bookmarks?
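
For step 2, a rough Python sketch of such a converter (this is not the pdf-merge.py script mentioned above; the names `parse_bookmarks`, `to_pdfmarks` and `page_offset` are made up for illustration). It assumes the dump uses the BookmarkBegin / BookmarkTitle / BookmarkLevel / BookmarkPageNumber records and that titles are plain ASCII. Nested bookmarks get a `/Count` entry with the number of direct children, which is how pdfmark expresses the hierarchy:

```python
FIELDS = {
    "BookmarkTitle": ("title", str),
    "BookmarkLevel": ("level", int),
    "BookmarkPageNumber": ("page", int),
}


def escape_title(title):
    """Escape backslashes and parentheses for use inside a PostScript string."""
    return title.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def parse_bookmarks(dump_text):
    """Collect the bookmark records from pdftk dump_data output."""
    bookmarks = []
    for line in dump_text.splitlines():
        key, _, value = line.partition(":")
        key = key.strip()
        if key == "BookmarkBegin":
            bookmarks.append({})
        elif key in FIELDS and bookmarks:
            name, convert = FIELDS[key]
            bookmarks[-1][name] = convert(value.strip())
    return bookmarks


def to_pdfmarks(bookmarks, page_offset=0):
    """Convert parsed bookmarks to pdfmark lines.

    page_offset is the number of pages that precede this file in the merge.
    """
    lines = []
    for i, bm in enumerate(bookmarks):
        # Count direct children: following entries exactly one level deeper,
        # up to the next entry at the same or a higher level
        children = 0
        for later in bookmarks[i + 1:]:
            if later["level"] <= bm["level"]:
                break
            if later["level"] == bm["level"] + 1:
                children += 1
        count = f"/Count {children} " if children else ""
        page = bm["page"] + page_offset
        lines.append(
            f"[{count}/Title ({escape_title(bm['title'])}) /Page {page} /OUT pdfmark"
        )
    return lines
```

When several files are merged, each file's dump would be converted with `page_offset` set to the total page count of the files before it, and the resulting lines concatenated into one pdfmarks file for gs.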
