Convert PDF file pages to images with iTextSharp

南方客 2020-11-30 09:48

I want to convert PDF pages to images using the iTextSharp library.

Does anyone have an idea how to convert each page to an image file?

4 answers
  • 2020-11-30 10:14

    You can use ImageMagick to convert a PDF to images:

    convert -density 300 "d:\1.pdf" -scale @1500000 "d:\a.jpg"

    To split the PDF into single-page files, you can use iTextSharp.

    Here is some code taken from someone else:

    // Splits a PDF into one single-page PDF per page, written next to the
    // source file as <name>_<page>.pdf. Requires iTextSharp 5.x.
    void SplitPDF(string filepath)
    {
        string dir = System.IO.Path.GetDirectoryName(filepath);
        string name = System.IO.Path.GetFileNameWithoutExtension(filepath);
        string ext = System.IO.Path.GetExtension(filepath);

        iTextSharp.text.pdf.PdfReader reader = new iTextSharp.text.pdf.PdfReader(filepath);
        try
        {
            reader.RemoveUnusedObjects();
            for (int page = 1; page <= reader.NumberOfPages; page++)
            {
                string outfile = System.IO.Path.Combine(dir, name + "_" + page + ext);
                iTextSharp.text.Document doc = new iTextSharp.text.Document(reader.GetPageSizeWithRotation(page));
                using (System.IO.FileStream fs = new System.IO.FileStream(outfile, System.IO.FileMode.Create))
                {
                    iTextSharp.text.pdf.PdfCopy pdfCpy = new iTextSharp.text.pdf.PdfCopy(doc, fs);
                    pdfCpy.SetFullCompression(); // set before the document is opened
                    doc.Open();
                    pdfCpy.AddPage(pdfCpy.GetImportedPage(reader, page));
                    doc.Close(); // also closes the PdfCopy writer
                }
            }
        }
        finally
        {
            // Close the reader only after all pages have been copied.
            reader.Close();
        }
    }
    
  • 2020-11-30 10:23

    iText/iTextSharp can generate new PDFs and modify existing ones, but they do not perform any rendering, which is what you are looking for. I would recommend checking out Ghostscript or some other library that knows how to actually render a PDF.
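
    As a rough illustration of the Ghostscript route, the sketch below starts the Ghostscript command-line tool from C# and writes one PNG per page. It is a minimal sketch under assumptions, not a definitive implementation: the class name `PdfRasterizer`, the `png16m` output device, the resolution, and all paths are choices made for this example, and the executable name depends on your Ghostscript installation.

    ```csharp
    using System;
    using System.Diagnostics;
    using System.Globalization;

    // Hypothetical helper for this example; not part of iTextSharp or Ghostscript.
    static class PdfRasterizer
    {
        // Builds the Ghostscript argument string. A pattern such as page_%03d.png
        // is expanded by Ghostscript itself into one file name per page.
        public static string BuildArgs(string pdfPath, string outputPattern, int dpi)
        {
            return string.Format(
                CultureInfo.InvariantCulture,
                "-q -dBATCH -dNOPAUSE -sDEVICE=png16m -r{0} -sOutputFile=\"{1}\" \"{2}\"",
                dpi, outputPattern, pdfPath);
        }

        // Runs Ghostscript and waits for it to finish.
        // gsExe is the path to the Ghostscript console executable (assumption:
        // it is installed and reachable from this process).
        public static void RenderPages(string gsExe, string pdfPath, string outputPattern, int dpi)
        {
            var psi = new ProcessStartInfo(gsExe, BuildArgs(pdfPath, outputPattern, dpi))
            {
                UseShellExecute = false,
                CreateNoWindow = true
            };
            using (Process gs = Process.Start(psi))
            {
                gs.WaitForExit();
                if (gs.ExitCode != 0)
                    throw new InvalidOperationException("Ghostscript exited with code " + gs.ExitCode);
            }
        }
    }
    ```

    A call might look like `PdfRasterizer.RenderPages("gswin32c.exe", @"d:\1.pdf", @"d:\page_%03d.png", 200);`, which should leave `page_001.png`, `page_002.png`, and so on next to the source file.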

  • 2020-11-30 10:37

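    You can extract the images embedded in a PDF and save them as JPG. Here is some sample code; it requires iTextSharp. Note that this extracts embedded images and does not render the pages themselves.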

    // Requires: using System.Collections.Generic; using iTextSharp.text.pdf;
    //           using iTextSharp.text.pdf.parser;
    public IEnumerable<System.Drawing.Image> ExtractImagesFromPDF(string sourcePdf)
    {
        // NOTE: This will only get the first image it finds per page.
        var pdf = new PdfReader(sourcePdf);

        try
        {
            for (int pageNum = 1; pageNum <= pdf.NumberOfPages; pageNum++)
            {
                PdfDictionary pg = pdf.GetPageN(pageNum);

                // Recursively search pages, forms and groups for images.
                // (This helper is not shown in the answer.)
                PdfObject obj = ExtractImagesFromPDF_FindImageInPDFDictionary(pg);
                if (obj != null)
                {
                    // Resolve the indirect reference to the image stream.
                    int xrefIndex = ((PRIndirectReference)obj).Number;
                    PdfStream pdfStream = (PdfStream)pdf.GetPdfObject(xrefIndex);
                    PdfImageObject pdfImage = new PdfImageObject((PRStream)pdfStream);
                    yield return pdfImage.GetDrawingImage();
                }
            }
        }
        finally
        {
            pdf.Close();
        }
    }
    
  • 2020-11-30 10:38

    You can use Ghostscript to convert PDF files to images. I used the following parameters to convert a PDF to a TIFF image with multiple frames:

    gswin32c.exe   -sDEVICE=tiff12nc -dBATCH -r200 -dNOPAUSE  -sOutputFile=[Output].tiff [PDF FileName]
    

    You can also use the -q parameter for silent mode. More information about the available output devices can be found in the Ghostscript documentation.

    After that, I can easily load the TIFF frames like this:

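    // Requires: using System.IO; using System.Windows.Media.Imaging; (WPF)
    using (FileStream stream = new FileStream(@"C:\tEMP\image_$i.tiff", FileMode.Open, FileAccess.Read, FileShare.Read))
    {
        BitmapDecoder dec = BitmapDecoder.Create(stream, BitmapCreateOptions.IgnoreImageCache, BitmapCacheOption.None);
        // dec.Frames holds one frame per PDF page; frameIndex is the zero-based page to pick.
        BitmapEncoder enc = BitmapEncoder.Create(dec.CodecInfo.ContainerFormat);
        enc.Frames.Add(dec.Frames[frameIndex]);
        // Call enc.Save(outputStream) here to write the selected frame out.
    }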
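
    To go from the multi-frame TIFF to one image file per page, the frames can be looped over and each one written with its own encoder. The following is only a sketch under assumptions: the class name `TiffSplitter`, the `page_001.png` naming scheme, and the choice of PNG as the output format are made up for this example, and it needs the WPF imaging assemblies (PresentationCore, WindowsBase).

    ```csharp
    using System.Globalization;
    using System.IO;
    using System.Windows.Media.Imaging;

    // Hypothetical helper for this example.
    static class TiffSplitter
    {
        // Writes every frame of a multi-frame TIFF as page_001.png, page_002.png, ...
        // into outputDir (which must already exist) and returns the frame count.
        public static int SaveFramesAsPng(string tiffPath, string outputDir)
        {
            using (FileStream input = new FileStream(tiffPath, FileMode.Open, FileAccess.Read, FileShare.Read))
            {
                // OnLoad decodes the image data while the input stream is still open.
                BitmapDecoder dec = BitmapDecoder.Create(
                    input, BitmapCreateOptions.PreservePixelFormat, BitmapCacheOption.OnLoad);

                for (int i = 0; i < dec.Frames.Count; i++)
                {
                    // One encoder per output file, holding a single frame.
                    var enc = new PngBitmapEncoder();
                    enc.Frames.Add(dec.Frames[i]);

                    string fileName = string.Format(CultureInfo.InvariantCulture, "page_{0:D3}.png", i + 1);
                    using (FileStream output = new FileStream(Path.Combine(outputDir, fileName), FileMode.Create))
                    {
                        enc.Save(output);
                    }
                }
                return dec.Frames.Count;
            }
        }
    }
    ```

    Swapping `PngBitmapEncoder` for `JpegBitmapEncoder` would give JPG files instead, at the cost of lossy compression.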
    