Extract paths and shapes with iTextSharp

我寻月下人不归 2021-01-15 16:13

iTextSharp supports creating shapes and paths with the PdfContentByte class, where you can set colors and paint curves and basic elements ... is there a mechanis

2 Answers
  • 2021-01-15 16:52

    Here is a starting point for extracting the individual commands (operators) from a page's content stream:

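        // Requires: using System; using System.Collections.Generic; using iTextSharp.text.pdf;
        var file = "test.pdf";
        var reader = new PdfReader(file);

        // Content stream of page 2 (page numbers are 1-based)
        var streamBytes = reader.GetPageContent(2);
        var tokenizer = new PRTokeniser(new RandomAccessFileOrArray(streamBytes));
        var ps = new PdfContentParser(tokenizer);

        // Parse() clears the list, then fills it with the operands of the
        // next instruction followed by the operator itself
        List<PdfObject> operands = new List<PdfObject>();
        while (ps.Parse(operands).Count > 0)
        {
            // The operator is always the last element
            PdfLiteral oper = (PdfLiteral)operands[operands.Count - 1];
            var cmd = oper.ToString();

            switch (cmd)
            {
                case "q":
                    Console.WriteLine("SaveGraphicsState(); //q");
                    break;

                case "Q":
                    Console.WriteLine("RestoreGraphicsState(); //Q");
                    break;

                // good luck with the rest!
            }
        }

        reader.Close();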
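
    As a rough sketch of what "the rest" could look like for the most common path operators, the helper below maps a PDF operator and its numeric operands to the matching PdfContentByte call, printed in the same style as the q/Q cases. The class `PathOperators` and its `Describe` method are hypothetical names invented for this example, not part of iTextSharp, and only a subset of operators is covered (m, l, c, re, h, S, f, f*, B, n).

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Globalization;

    // Hypothetical helper (not an iTextSharp class): turns one parsed
    // instruction into the text of the equivalent PdfContentByte call.
    public static class PathOperators
    {
        // Returns null for operators this sketch does not handle.
        public static string Describe(string op, IList<float> args)
        {
            switch (op)
            {
                case "m":  return Call("MoveTo", op, args, 2);     // begin subpath
                case "l":  return Call("LineTo", op, args, 2);     // straight segment
                case "c":  return Call("CurveTo", op, args, 6);    // cubic Bezier
                case "re": return Call("Rectangle", op, args, 4);  // x, y, width, height
                case "h":  return Call("ClosePath", op, args, 0);
                case "S":  return Call("Stroke", op, args, 0);
                case "f":  return Call("Fill", op, args, 0);       // nonzero winding rule
                case "f*": return Call("EoFill", op, args, 0);     // even-odd rule
                case "B":  return Call("FillStroke", op, args, 0);
                case "n":  return Call("NewPath", op, args, 0);    // end path, no painting
                default:   return null;
            }
        }

        private static string Call(string method, string op, IList<float> args, int arity)
        {
            if (args.Count != arity)
                throw new ArgumentException(
                    "Operator " + op + " expects " + arity + " operands, got " + args.Count);

            var parts = new string[arity];
            for (int i = 0; i < arity; i++)
                parts[i] = args[i].ToString(CultureInfo.InvariantCulture);

            return method + "(" + string.Join(", ", parts) + "); //" + op;
        }
    }
    ```

    Inside the parsing loop, the numeric operands are every element of `operands` except the last one; assuming they are all numbers, each can be read with `((PdfNumber)operands[i]).FloatValue` and collected into a list before calling `PathOperators.Describe(cmd, args)`. A `null` result means the operator (text, color, image, and so on) still needs its own case.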
    
  • 2021-01-15 17:01

    That's not supported in iTextSharp. The reason: parsing for text returns TextRenderInfo objects, parsing for images returns ImageRenderInfo objects, but in which form should we return GraphicsRenderInfo? It's hard to find something generic, and painting to a graphics context is too specific.

    The idea is that you write your own parser, as I did for instance for removing OCG layers: OCGParser. This part of iText hasn't been ported to iTextSharp yet, but maybe you can use it for inspiration.
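
    To make the "write your own parser" idea a little more concrete, here is one possible shape for its output, purely as a sketch: path construction operators are buffered, and a painting operator closes the buffer into one result object. The types `PathSegment`, `PaintedPath` and `PathCollector` are hypothetical names made up for this example; nothing like them ships with iTextSharp, and the sketch ignores the graphics state (colors, line width, transformation matrix) and clipping.

    ```csharp
    using System;
    using System.Collections.Generic;

    // Hypothetical result types, not part of iTextSharp.
    public sealed class PathSegment
    {
        public readonly string Operator;   // "m", "l", "c", "v", "y", "re" or "h"
        public readonly float[] Operands;

        public PathSegment(string op, float[] operands)
        {
            Operator = op;
            Operands = operands;
        }
    }

    public sealed class PaintedPath
    {
        public readonly List<PathSegment> Segments;
        public readonly string PaintOperator;   // "S", "f", "B", ...

        public PaintedPath(List<PathSegment> segments, string paintOperator)
        {
            Segments = segments;
            PaintOperator = paintOperator;
        }
    }

    public sealed class PathCollector
    {
        private static readonly HashSet<string> Construction =
            new HashSet<string> { "m", "l", "c", "v", "y", "re", "h" };

        private static readonly HashSet<string> Painting =
            new HashSet<string> { "S", "s", "f", "F", "f*", "B", "B*", "b", "b*", "n" };

        private List<PathSegment> current = new List<PathSegment>();

        // Completed paths, in the order their painting operators appeared.
        public readonly List<PaintedPath> Paths = new List<PaintedPath>();

        // Feed one instruction at a time; unrelated operators are ignored.
        public void Handle(string op, float[] operands)
        {
            if (Construction.Contains(op))
            {
                current.Add(new PathSegment(op, operands));
            }
            else if (Painting.Contains(op))
            {
                // "n" ends the path without painting it, so nothing is recorded.
                if (op != "n" && current.Count > 0)
                    Paths.Add(new PaintedPath(current, op));
                current = new List<PathSegment>();
            }
        }
    }
    ```

    A caller would invoke `Handle` once per parsed instruction and read `Paths` at the end. Whether such a list of segments is a useful generic form is exactly the open design question raised above; a real implementation would also need to track the state saved and restored by q/Q.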

    Note that you're actually building PDF-to-image functionality. Aren't there other products that already support this out of the box?
