escaping characters for substitution into a PDF

我寻月下人不归 2021-02-11 04:57

Can anyone tell me the set of control characters for a PDF file, and how to escape them? I have a (non-deflated (inflated?)) PDF document that I would like to edit the text in,

2 Answers
  •  爱一瞬间的悲伤
    2021-02-11 05:37

    You likely already know this, but PDF files have an index at the end (the cross-reference table) that contains byte offsets to the objects in the document. If you edit the doc by hand, you must ensure that the new text you write takes up exactly the same number of bytes as the original, otherwise those offsets no longer point to the right places.
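
As a rough illustration of the same-length constraint, here is a minimal Perl sketch. The helper name `replace_same_length` is hypothetical (it is not part of CAM::PDF), and it assumes the text sits in an uncompressed stream as a plain literal string and that the replacement is no longer than the original. The shortfall is padded with spaces so that no byte offset moves.

```perl
use strict;
use warnings;

# Hypothetical helper: swap $old for $new inside raw PDF bytes without
# changing the total length, so existing byte offsets stay valid.
sub replace_same_length {
    my ($bytes, $old, $new) = @_;
    die "replacement is longer than the original\n"
        if length($new) > length($old);

    # Pad the shorter replacement with spaces up to the original length.
    my $padded = $new . (' ' x (length($old) - length($new)));

    my $pos = index($bytes, $old);
    return $bytes if $pos < 0;    # nothing to replace

    # 4-argument substr replaces the span in place.
    substr($bytes, $pos, length($old), $padded);
    return $bytes;
}

# In-memory sample standing in for a fragment of an uncompressed PDF.
my $before = "BT (Hello) Tj ET";
my $after  = replace_same_length($before, "Hello", "Hi");
print "$after\n";    # prints: BT (Hi   ) Tj ET
```

When working on a real file, read and write it in binary mode (for example with `binmode`) so that `length` counts bytes rather than decoded characters.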

    If you want to extract PDF page content and edit that, it's pretty straightforward. My CAM::PDF library lets you do it programmatically or via the command line:

     use CAM::PDF;
     my $pdf = CAM::PDF->new($filename);
     my $page_content = $pdf->getPageContent($pagenum);
     # ... edit $page_content here ...
     $pdf->setPageContent($pagenum, $page_content);
     $pdf->cleanoutput($out_filename);
    

    or

     getpdfpage.pl in.pdf 1 > page1.txt
     setpdfpage.pl in.pdf page1.txt 1 out.pdf
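
On the escaping part of the question: inside a PDF literal string, which is delimited by parentheses, the backslash is the escape character, and a backslash or an unbalanced parenthesis has to be preceded by a backslash. Below is a minimal sketch of that rule applied to page content; the helper name `escape_pdf_literal` is an assumption for illustration, not a CAM::PDF function, and the sample content stream is made up.

```perl
use strict;
use warnings;

# Hypothetical helper: turn plain text into a PDF literal string "(...)".
# Escaping every parenthesis is always safe, even though balanced
# pairs could legally be left alone.
sub escape_pdf_literal {
    my ($text) = @_;
    $text =~ s/([\\()])/\\$1/g;    # backslash and parentheses
    $text =~ s/\r/\\r/g;           # carriage return becomes \r
    $text =~ s/\n/\\n/g;           # line feed becomes \n
    return "($text)";
}

# Made-up page content, as it might come back from getPageContent.
my $content = "BT /F1 12 Tf 72 720 Td (Hello) Tj ET";

# Substitute the escaped replacement for the original string operand.
my $new = escape_pdf_literal('Bye (for now)');
$content =~ s/\(Hello\)/$new/;

print "$content\n";    # prints: BT /F1 12 Tf 72 720 Td (Bye \(for now\)) Tj ET
```

This only covers literal strings. Hex strings written as `<...>`, and fonts with custom encodings where the bytes in the stream do not correspond to readable characters, would need different handling.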
    
