I've scoured the Web looking for examples on how to do this. I've found a few that seem to be a little more involved than they need to be. So my question is, using iTextShar
I really may be missing something, but I did something much simpler. I concede this solution probably won't update bookmarks (as in the best answer here so far), but it works flawlessly for me. Since I was merging documents with fillable forms, I used PdfCopyFields instead of PdfCopy.
Here is the code. I've stripped all error handling to make the actual code more visible; add a try..finally to close opened resources if you plan on using it:
    void MergePdfStreams(List<Stream> Source, Stream Dest)
    {
        PdfCopyFields copy = new PdfCopyFields(Dest);
        foreach (Stream source in Source)
        {
            PdfReader reader = new PdfReader(source);
            copy.AddDocument(reader);
        }
        copy.Close();
    }
You can pass any stream, be it a FileStream or a MemoryStream (useful when reading the PDF from a database, since no temporary files are needed).
Sample usage:
    void TestMergePdfStreams()
    {
        List<Stream> sources = new List<Stream>()
        {
            new FileStream("template1.pdf", FileMode.Open),
            new FileStream("template2.pdf", FileMode.Open),
            new MemoryStream((byte[])someDataRow["PDF_COLUMN_NAME"])
        };
        MergePdfStreams(sources, new FileStream("MergedOutput.pdf", FileMode.Create));
    }
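The try..finally mentioned above could look something like the following. This is only a sketch, assuming an iTextSharp version that still ships PdfCopyFields; the names PdfMergeSketch and MergePdfStreamsSafely are made up for illustration. It closes the readers it creates and leaves the streams to the caller.

```csharp
using System.Collections.Generic;
using System.IO;
using iTextSharp.text.pdf;

static class PdfMergeSketch
{
    // Hypothetical variant of MergePdfStreams that closes every PdfReader it opens.
    // The caller keeps ownership of the source and destination streams.
    public static void MergePdfStreamsSafely(IList<Stream> sources, Stream dest)
    {
        PdfCopyFields copy = new PdfCopyFields(dest);
        List<PdfReader> readers = new List<PdfReader>();
        try
        {
            foreach (Stream source in sources)
            {
                PdfReader reader = new PdfReader(source);
                readers.Add(reader);
                copy.AddDocument(reader);
            }
            // Close() is what actually writes the merged document to dest
            copy.Close();
        }
        finally
        {
            // Runs whether or not the merge succeeded
            foreach (PdfReader reader in readers)
            {
                reader.Close();
            }
        }
    }
}
```

On the calling side, wrapping each stream in a using block (for example File.OpenRead for the inputs and File.Create for the output) should make sure the files are released even if the merge throws; disposing a stream that has already been closed is harmless.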
Yes. I've seen a class called PdfManipulation posted in an iText forum. Using that class would involve a third file though.
The class is originally in VB.Net. I downloaded it from a post on vbforums.com. Apparently though, it doesn't have the merge files function, so I wrote one based on the code in that class.
This was written on a machine without iTextSharp. This might have bugs. I'm not even sure if page numbers are 0-based or 1-based. But give it a shot.
    public static void MergePdfFiles(IEnumerable<string> files, string output) {
        iTextSharp.text.Document doc;
        iTextSharp.text.pdf.PdfCopy pdfCpy;
        doc = new iTextSharp.text.Document();
        pdfCpy = new iTextSharp.text.pdf.PdfCopy(doc, new System.IO.FileStream(output, System.IO.FileMode.Create));
        doc.Open();
        foreach (string file in files) {
            // initialize a reader
            iTextSharp.text.pdf.PdfReader reader = new iTextSharp.text.pdf.PdfReader(file);
            int pageCount = reader.NumberOfPages;
            // set page size for the documents
            doc.SetPageSize(reader.GetPageSizeWithRotation(1));
            for (int pageNum = 1; pageNum <= pageCount; pageNum++) {
                iTextSharp.text.pdf.PdfImportedPage page = pdfCpy.GetImportedPage(reader, pageNum);
                pdfCpy.AddPage(page);
            }
            reader.Close();
        }
        doc.Close();
    }
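On the 0-based versus 1-based doubt: iTextSharp numbers pages starting at 1, so a loop from 1 through NumberOfPages is the right shape. A small sketch for checking this against your own files follows; PageNumberSketch and DescribePages are made-up names, not part of the library.

```csharp
using System;
using iTextSharp.text;
using iTextSharp.text.pdf;

static class PageNumberSketch
{
    // Hypothetical helper: returns one description line per page,
    // walking the pages with 1-based page numbers.
    public static string[] DescribePages(PdfReader reader)
    {
        int pageCount = reader.NumberOfPages;
        string[] lines = new string[pageCount];
        for (int pageNum = 1; pageNum <= pageCount; pageNum++)
        {
            // Page size in points, with the page rotation already applied
            Rectangle size = reader.GetPageSizeWithRotation(pageNum);
            lines[pageNum - 1] = "Page " + pageNum + ": " + size.Width + " x " + size.Height;
        }
        return lines;
    }
}
```

Running it over each input before merging is also a quick way to spot files whose page sizes differ.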
OK, it's not straightforward, but it works and is surprisingly fast. (And it uses a third file; there's no such thing as open and append.) I 'discovered' this in the docs/examples. Here's the code:
    private void CombineMultiplePDFs( string[] fileNames, string outFile ) {
        int pageOffset = 0;
        ArrayList master = new ArrayList();
        int f = 0;
        Document document = null;
        PdfCopy writer = null;
        while ( f < fileNames.Length ) {
            // we create a reader for a certain document
            PdfReader reader = new PdfReader( fileNames[ f ] );
            reader.ConsolidateNamedDestinations();
            // we retrieve the total number of pages
            int n = reader.NumberOfPages;
            ArrayList bookmarks = SimpleBookmark.GetBookmark( reader );
            if ( bookmarks != null ) {
                if ( pageOffset != 0 ) {
                    SimpleBookmark.ShiftPageNumbers( bookmarks, pageOffset, null );
                }
                master.AddRange( bookmarks );
            }
            pageOffset += n;
            if ( f == 0 ) {
                // step 1: creation of a document-object
                document = new Document( reader.GetPageSizeWithRotation( 1 ) );
                // step 2: we create a writer that listens to the document
                writer = new PdfCopy( document, new FileStream( outFile, FileMode.Create ) );
                // step 3: we open the document
                document.Open();
            }
            // step 4: we add content
            for ( int i = 0; i < n; ) {
                ++i;
                if ( writer != null ) {
                    PdfImportedPage page = writer.GetImportedPage( reader, i );
                    writer.AddPage( page );
                }
            }
            PRAcroForm form = reader.AcroForm;
            if ( form != null && writer != null ) {
                writer.CopyAcroForm( reader );
            }
            f++;
        }
        if ( master.Count > 0 && writer != null ) {
            writer.Outlines = master;
        }
        // step 5: we close the document
        if ( document != null ) {
            document.Close();
        }
    }
I don't know how to do it for PDF files, but for PostScript, you just concatenate the files. If you have pdf2ps and ps2pdf installed, the following will do the job:
    pdf2ps file1.pdf file1.ps
    pdf2ps file2.pdf file2.ps
    cat file1.ps file2.ps > combined.ps
    ps2pdf combined.ps combined.pdf
I'm not an expert on pdf2ps or ps2pdf. I've only ever used ps2pdf, and when I do, it leaves text as text (I can still select and copy text from the resulting PDF). When I follow the steps above (PDF to PS, combine, PS to PDF), I end up with a PDF that is like an image. No idea why.