Behavior of PdfPage.flush()

Posted by 只愿长相守 on 2021-01-29 04:27:53

Question


What exactly does PdfPage.flush(true) do? Does SmartMode (or any other setting) affect its behavior? In many cases I want to keep a page editable for as long as possible, so I never worried that the PDF document was assembled in memory until document.close(). But when generating very large files (tens of thousands of pages), memory becomes constrained. I was naively hoping that PdfPage.flush(true) would write the content stream to disk and free up memory, but calling flush(true) only seems to write a couple of bytes to disk.

I guess the more general version of my question is "How do we efficiently merge lots of documents into a single, very large document with iText 7?", but not being highly proficient with the PDF spec itself, I'd also like to better understand what's actually going on.


Answer 1:


flush(), when called on layout objects, forces those objects and their children to draw (i.e. write) their contents to the writer's output stream. The reason you only see a couple of bytes being written when calling flush() manually is that the default Document constructors already tell iText to flush aggressively: the shorter constructor delegates to the longer one with immediateFlush set to true:

/**
 * Creates a document from a {@link PdfDocument} with a manually set {@link
 * PageSize}.
 *
 * @param pdfDoc   the in-memory representation of the PDF document
 * @param pageSize the page size
 */
public Document(PdfDocument pdfDoc, PageSize pageSize) {
    this(pdfDoc, pageSize, true);
}

/**
 * Creates a document from a {@link PdfDocument} with a manually set {@link
 * PageSize}.
 *
 * @param pdfDoc         the in-memory representation of the PDF document
 * @param pageSize       the page size
 * @param immediateFlush if true, write pages and page-related instructions
 *                       to the {@link PdfDocument} as soon as possible.
 */
public Document(PdfDocument pdfDoc, PageSize pageSize, boolean immediateFlush)

As for advice on the general question: there isn't an iText function or configuration that magically makes the entire process faster and more efficient, but there are some tricks you can apply outside of iText:

1) Allocate more resources. This is obvious, but often not feasible.

2) Do multi-stage batch processing: merge 10 files into 1 in step X, then continue by merging those merged files in step X+1. In general, the one big file will be smaller than the 10 files separately, because resources such as fonts and images can be reused.

3) Run the merging process at times when the resources it takes up aren't needed anywhere else, e.g. at night or over lunch.
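
The multi-stage idea in point 2 can be sketched in plain Java, independent of iText. This is only a rough illustration: the class name StagedMerge, the method mergeInStages, and the mergeBatch callback are hypothetical names, and the callback stands in for whatever routine actually merges one batch of PDFs into a single file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class StagedMerge {

    /**
     * Merges inputs in stages: each stage combines groups of at most
     * batchSize items, and the results feed the next stage, until a
     * single item remains. mergeBatch is a placeholder (assumption) for
     * the real merge step, e.g. a routine that writes one merged PDF.
     */
    public static <T> T mergeInStages(List<T> inputs, int batchSize,
                                      Function<List<T>, T> mergeBatch) {
        if (inputs.isEmpty()) {
            throw new IllegalArgumentException("nothing to merge");
        }
        if (batchSize < 2) {
            throw new IllegalArgumentException("batchSize must be at least 2");
        }
        List<T> current = new ArrayList<>(inputs);
        while (current.size() > 1) {
            List<T> next = new ArrayList<>();
            for (int start = 0; start < current.size(); start += batchSize) {
                int end = Math.min(start + batchSize, current.size());
                List<T> batch = current.subList(start, end);
                // A leftover batch of one needs no merge; carry it forward.
                next.add(batch.size() == 1 ? batch.get(0) : mergeBatch.apply(batch));
            }
            current = next;
        }
        return current.get(0);
    }

    public static void main(String[] args) {
        List<String> files = new ArrayList<>();
        for (int i = 1; i <= 25; i++) {
            files.add("part" + i + ".pdf");
        }
        // Stand-in merge step: only records which inputs were combined.
        String result = mergeInStages(files, 10,
                batch -> "(" + String.join("+", batch) + ")");
        System.out.println(result);
    }
}
```

With 25 inputs and a batch size of 10, the first stage performs three merges (10, 10 and 5 files) and the second stage performs one, so no single merge step ever has more than 10 source documents open at once.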

Edit: As for why PdfPage#flush() only writes a couple of bytes to the output stream: that depends on the input document, but it most likely means that the page being flushed either has mostly text content or relies heavily on shared resources. SmartMode should limit the amount a page flush writes to the output stream, as long as the page contains resources that have already been copied before.
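
To make the shared-resources point concrete, here is a toy model in plain Java. It is not iText's implementation: DedupWriterSketch, writeResource and flushPage are hypothetical names, and keying resources by their full content is a simplification chosen for the sketch. It only shows why flushing a page can add very few bytes once its fonts or images have already been written.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DedupWriterSketch {
    private final ByteArrayOutputStream out = new ByteArrayOutputStream();
    // Maps resource content to the id it was written under (toy model).
    private final Map<String, Integer> written = new HashMap<>();
    private int nextId = 1;

    /** Writes a resource only if identical content was not written before. */
    public int writeResource(String content) {
        Integer existing = written.get(content);
        if (existing != null) {
            return existing; // reuse: nothing is written again
        }
        int id = nextId++;
        byte[] bytes = content.getBytes(StandardCharsets.UTF_8);
        out.write(bytes, 0, bytes.length);
        written.put(content, id);
        return id;
    }

    /** Flushes a page and returns how many bytes this flush added. */
    public int flushPage(String pageContent, List<String> resources) {
        int before = out.size();
        for (String resource : resources) {
            writeResource(resource);
        }
        byte[] bytes = pageContent.getBytes(StandardCharsets.UTF_8);
        out.write(bytes, 0, bytes.length);
        return out.size() - before;
    }

    public static void main(String[] args) {
        DedupWriterSketch writer = new DedupWriterSketch();
        String font = "F".repeat(1000); // stand-in for a large embedded font
        System.out.println(writer.flushPage("BT Hello ET", List.of(font))); // 1011
        System.out.println(writer.flushPage("BT World ET", List.of(font))); // 11
    }
}
```

The first page pays for the shared font, while the second page only adds its own short text content, which matches the "couple of bytes" observation in the question.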



Source: https://stackoverflow.com/questions/41616002/behavior-of-pdfpage-flush
