pdf-manipulation

What steps to copy text from one PDF to another PDF?

给你一囗甜甜゛ submitted on 2020-04-30 07:04:18
Question: In "How to transfer OCR text from one PDF to another PDF", one answerer says that copying text from one PDF to another is not a trivial undertaking, though it is possible. If so, what steps are needed to do it? For the sake of the guide, let's assume that the source file and the target file have identical metadata. Source: https://stackoverflow.com/questions/61066590/what-steps-to-copy-text-from-one-pdf-to-another-pdf

Documentation for using JavaScript code inside a PDF file [closed]

若如初见. submitted on 2020-04-24 04:21:47
Question: Where can I find documentation on running JavaScript code inside a PDF? I've never added a JavaScript action to a PDF document. However, I've done quite a bit of web development using JavaScript. I have a few questions for anyone who is familiar with JavaScript inside a PDF

Calculating the exact positions of (Td, TD, Tm, cm, T*) content stream in PDF?

那年仲夏 submitted on 2020-02-24 05:00:08
Question: How can I get or calculate the exact positions produced by Td, TD, Tm, cm, and T* in a PDF content stream? As a human, I can work out the positions of these operators (whether an operand replaces the last Td, adds to the last Td, or is multiplied by the font size) by comparing where the glyphs are located in the PDF with the position values in the content stream. But I am unable to calculate the exact glyph positions programmatically. Please see the screenshot. In the image above, the left-side box is the PDF UI glyphs and
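For reference, the bookkeeping these operators imply can be sketched in a few lines of Python. This is a minimal sketch, not a full content stream interpreter: the class and method names are hypothetical, operands are assumed to be already parsed into numbers, `q`/`Q` state saving is ignored, and the horizontal advance caused by showing text (`Tj`, `TJ`) is not tracked because it needs font widths. Matrices are six-number tuples `(a, b, c, d, e, f)` in the row-vector convention of the PDF specification.

```python
IDENTITY = (1.0, 0.0, 0.0, 1.0, 0.0, 0.0)


def multiply(m, n):
    """Return m x n for PDF matrices (row vectors, so m is applied first)."""
    a1, b1, c1, d1, e1, f1 = m
    a2, b2, c2, d2, e2, f2 = n
    return (
        a1 * a2 + b1 * c2,
        a1 * b2 + b1 * d2,
        c1 * a2 + d1 * c2,
        c1 * b2 + d1 * d2,
        e1 * a2 + f1 * c2 + e2,
        e1 * b2 + f1 * d2 + f2,
    )


class TextPositionTracker:
    """Hypothetical helper that follows cm, BT, Tm, TL, Td, TD and T*."""

    def __init__(self):
        self.ctm = IDENTITY   # current transformation matrix (cm)
        self.tm = IDENTITY    # text matrix
        self.tlm = IDENTITY   # text line matrix (start of the current line)
        self.leading = 0.0    # text leading, set by TL or TD

    def apply(self, operator, operands=()):
        if operator == "cm":
            # cm concatenates: the new matrix is applied before the old CTM.
            self.ctm = multiply(tuple(operands), self.ctm)
        elif operator == "BT":
            self.tm = self.tlm = IDENTITY
        elif operator == "Tm":
            # Tm replaces both matrices; it does not accumulate.
            self.tm = self.tlm = tuple(float(v) for v in operands)
        elif operator == "TL":
            self.leading = float(operands[0])
        elif operator == "Td":
            # Td offsets from the start of the current line, in text space.
            tx, ty = operands
            self.tlm = multiply((1.0, 0.0, 0.0, 1.0, tx, ty), self.tlm)
            self.tm = self.tlm
        elif operator == "TD":
            # TD is Td plus setting the leading to -ty.
            tx, ty = operands
            self.leading = -ty
            self.apply("Td", (tx, ty))
        elif operator == "T*":
            self.apply("Td", (0.0, -self.leading))

    def origin(self):
        """Origin of the next glyph in default user space, ignoring text advance."""
        m = multiply(self.tm, self.ctm)
        return (m[4], m[5])
```

Because `Td` offsets are multiplied into the text line matrix, a `Tm` that carries a scale factor also scales every later `Td` offset, which may be the "multiplication" effect described above when the font size is encoded in `Tm`.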

Manipulating PDF file

半城伤御伤魂 submitted on 2020-01-06 06:26:48
Question: I would like to read a PDF file as text (PostScript), add new objects to the file structure, and save the final output as a new PDF. But if I just copy the PDF's PostScript content and paste it into a newly created PDF file (where encoding='ansi'), the file doesn't work. I suspect this is an encoding issue, but I am not sure what I should do to get a valid PDF file after manipulating the original PostScript content. Here is the piece of code that didn't work for me: pdf_file =
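One plausible cause, stated here as an assumption because the failing code is cut off above, is the text-mode round trip itself. A PDF is a binary format: its streams can contain arbitrary byte values, and a text codec may reject or alter some of them. The minimal standard-library sketch below copies a file byte for byte and checks whether a given byte string would survive decoding and re-encoding. The function names are hypothetical, and `cp1252` stands in for the Windows "ANSI" code page.

```python
from pathlib import Path


def copy_pdf_bytes(source, target):
    """Copy a file byte for byte; nothing is decoded, so stream data is untouched."""
    data = Path(source).read_bytes()
    Path(target).write_bytes(data)
    return data


def survives_text_round_trip(data, encoding="cp1252"):
    """Report whether decoding and re-encoding with `encoding` reproduces `data`."""
    try:
        return data.decode(encoding).encode(encoding) == data
    except UnicodeError:
        # Some byte values have no mapping in the codec at all.
        return False
```

Even with binary I/O, inserting objects shifts the byte offsets recorded in the cross-reference table, so the table has to be rebuilt (or a PDF library used) for the result to remain valid.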

Change metadata of pdf file with pypdf2

谁都会走 submitted on 2019-12-21 20:58:55
Question: I want to add a metadata key-value pair to the metadata of a PDF file. I found an answer that is several years old, but I think it is way too complicated. I guess there is an easier way today: https://stackoverflow.com/a/3257340/633961 I am not married to pypdf2; if there is an easier way, I will go that way.
Answer 1: You can do that using pdfrw:
    pip install pdfrw
Then run:
    from pdfrw import PdfReader, PdfWriter
    trailer = PdfReader("myfile.pdf")
    trailer.Info.WhoAmI = "Tarun Lalwani"
    PdfWriter("edited.pdf

Merge Multiple PDFs into one PDF

回眸只為那壹抹淺笑 submitted on 2019-12-18 18:32:03
Question: I am having some issues with my code. I am trying to loop through a Drive folder that contains many PDFs and then merge them into one file. When I run my code, it just creates a PDF from the last PDF in the Drive folder instead of merging them all together as expected.
    function MergeFiles(){
      var folder = DocsList.getFolderById('myFolderID');
      var files = folder.getFiles();
      var blobs = [];
      for( var i in files )
        blobs.push(files[i].getBlob().getBytes());
      Logger.log(blobs.push(files[i].getBlob()

Splitting single page into two pages with ghostscript

拈花ヽ惹草 submitted on 2019-12-12 07:22:21
Question: I have a PDF that contains something like presentation slides, with multiple slides per page. How can I use Ghostscript to split the file so that there is one slide per page?
Answer 1: A long time ago I wrote some code for someone on comp.lang.postscript to do this; again, it was for PowerPoint slides. This PostScript code assumes that all the 'subpages' (i.e. slides) are the same size and in the same location on each PDF page, and that all the PDF pages are the same size. Save the following as a file called pdf_slice.ps
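The geometry behind that assumption can be sketched independently of Ghostscript. The Python below is not the answer's pdf_slice.ps (which is cut off above). It is a minimal illustration that additionally assumes the slides tile the page in a regular grid with no margins, which is stronger than what the answer states. The function name and parameters are hypothetical.

```python
def slide_boxes(page_width, page_height, columns, rows):
    """Return (x, y, width, height) boxes in PDF points, origin at bottom-left.

    Boxes are listed in reading order: left to right, top row first.
    Assumes a regular grid that fills the page with no margins.
    """
    width = page_width / columns
    height = page_height / rows
    boxes = []
    for row in range(rows):
        # PDF y grows upward, so the top row has the largest y.
        y = page_height - (row + 1) * height
        for col in range(columns):
            boxes.append((col * width, y, width, height))
    return boxes
```

Each box gives the size of one output page (`width`, `height`) and the offset (`-x`, `-y`) by which the original page content would have to be shifted so that the slide's lower-left corner lands on the origin.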

Python PyPDF2 join pages

若如初见. submitted on 2019-12-11 16:38:22
Question: I have a PDF with a big table split across pages, so I need to join the per-page tables into one big table on a single large page. Is this possible with PyPDF2 or another library? Cheers
Answer 1: I am just working on something similar: it takes an input PDF, and via a config file you can set the final pattern of single pages. It is implemented with PyPDF2, but it still has issues with some PDF files (I have to dig deeper). https://github.com/Lageos/pdf-stitcher In principle, adding a page to the right of another one works