Extract a region of a PDF page by coordinates

说谎 2021-02-06 12:44

I am looking for a tool to extract a given rectangular region (by coordinates) of a 1-page PDF file and produce a 1-page PDF file with the specified region:

# in         


        
2 answers
  • 2021-02-06 12:50

    Using pyPdf, you could do something like this:

    import sys
    import pyPdf
    
    def extract(in_file, coords, out_file):
        # coords is [x1, y1, x2, y2]: lower-left corner, then upper-right corner
        with open(in_file, 'rb') as infp:
            reader = pyPdf.PdfFileReader(infp)
            page = reader.getPage(0)  # the input is a 1-page PDF
            writer = pyPdf.PdfFileWriter()
            # shrink the media box to the requested region
            page.mediaBox.lowerLeft = coords[:2]
            page.mediaBox.upperRight = coords[2:]
            # you could do the same for page.trimBox and page.cropBox
            writer.addPage(page)
            with open(out_file, 'wb') as outfp:
                writer.write(outfp)
    
    if __name__ == '__main__':
        in_file = sys.argv[1]
        coords = [int(i) for i in sys.argv[2:6]]
        out_file = sys.argv[6]
    
        extract(in_file, coords, out_file)
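
The command-line parsing above converts each coordinate with int(), so fractional values are rejected, and nothing checks that the corners are in order or that the region lies on the page. Below is a minimal sketch of a stricter parser in plain Python, with no PDF library involved. `normalize_box` and its `page_box` argument are hypothetical names, not part of pyPdf, and the sketch assumes coordinates are given in PDF points with the origin at the lower-left corner of the page.

```python
from typing import Sequence, Tuple

# (llx, lly, urx, ury): lower-left and upper-right corners, in PDF points
Box = Tuple[float, float, float, float]


def normalize_box(coords: Sequence[str], page_box: Box) -> Box:
    """Turn four coordinate strings into a box clamped to page_box.

    Hypothetical helper for illustration; not a pyPdf function.
    """
    if len(coords) != 4:
        raise ValueError("expected four numbers: x1 y1 x2 y2")
    x1, y1, x2, y2 = (float(c) for c in coords)

    # Accept the two corners in either order.
    llx, urx = sorted((x1, x2))
    lly, ury = sorted((y1, y2))

    # Clamp the region to the page so the result never extends past it.
    p_llx, p_lly, p_urx, p_ury = page_box
    llx, urx = max(llx, p_llx), min(urx, p_urx)
    lly, ury = max(lly, p_lly), min(ury, p_ury)

    if llx >= urx or lly >= ury:
        raise ValueError("region does not overlap the page")
    return (llx, lly, urx, ury)
```

The result is meant to take the place of `coords` in `extract`; for a US Letter page, `page_box` would be `(0, 0, 612, 792)`.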
    
  • 2021-02-06 13:00

    The following script, found at http://snipplr.com/view.php?codeview&id=18924, splits each page of a PDF into two halves (left and right).

    #!/usr/bin/env perl
    use strict; use warnings;
    use PDF::API2;
    
    my $filename = shift;
    my $oldpdf = PDF::API2->open($filename);
    my $newpdf = PDF::API2->new;
    
    for my $page_nb (1..$oldpdf->pages) {
      my ($page, @cropdata);
    
      $page = $newpdf->importpage($oldpdf, $page_nb);
      @cropdata = $page->get_mediabox;
      $cropdata[2] /= 2;
      $page->cropbox(@cropdata);
      $page->trimbox(@cropdata);
      $page->mediabox(@cropdata);
    
      $page = $newpdf->importpage($oldpdf, $page_nb);
      @cropdata = $page->get_mediabox;
      $cropdata[0] = $cropdata[2] / 2;
      $page->cropbox(@cropdata);
      $page->trimbox(@cropdata);
      $page->mediabox(@cropdata);
    }
    
    # Save under a derived name, e.g. file.pdf -> file.clean.pdf
    (my $newfilename = $filename) =~ s/(.*)\.(\w+)$/$1.clean.$2/;
    $newpdf->saveas($newfilename);
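
The Perl script above halves the upper-right x value, which gives the page midpoint only when the media box starts at x = 0. Below is a minimal sketch of the same arithmetic for a box with an arbitrary origin, written in Python to match the first answer. `split_halves` is a hypothetical helper, not a PDF::API2 or pyPdf function.

```python
from typing import Tuple

# (llx, lly, urx, ury): lower-left and upper-right corners, in PDF points
Box = Tuple[float, float, float, float]


def split_halves(media_box: Box) -> Tuple[Box, Box]:
    """Split a media box into left and right halves of equal width.

    Hypothetical helper for illustration only.
    """
    llx, lly, urx, ury = media_box
    # Midpoint between the left and right edges; valid even when llx != 0.
    mid_x = (llx + urx) / 2
    left = (llx, lly, mid_x, ury)
    right = (mid_x, lly, urx, ury)
    return left, right
```

For a box of `(100, 0, 300, 50)` this puts the split at x = 200, whereas halving the right edge alone would put it at x = 150.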
    