Traverse whole PDF and change blue color to black (change color of underlines as well) + iText

既然无缘 2021-01-16 12:11

I am using the code below to remove blue colors from PDF text. It changes the text color correctly, but it does not change the color of the underlines.

original fi

1 answer
  • 2021-01-16 12:36

    (The example code here uses iText 7 for Java. You mentioned neither the iText version nor your programming environment in tags or question text but your example code appears to indicate that this is your combination of choice.)

    Replacing blue fill colors

    The test you based your original code on attempts explicitly only to change text color. The "underline" in your document, though, is (as far as PDF drawing is concerned) not part of the text but instead drawn as a simple path. Thus, the underline explicitly is not touched by the original code and it has to be adapted for your task.

    But actually your task, changing everything blue to black, is easier to implement than only changing the blue text, e.g.

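    import java.util.Arrays;
    import java.util.List;
    import com.itextpdf.kernel.pdf.PdfDocument;
    import com.itextpdf.kernel.pdf.PdfLiteral;
    import com.itextpdf.kernel.pdf.PdfNumber;
    import com.itextpdf.kernel.pdf.PdfObject;
    import com.itextpdf.kernel.pdf.PdfReader;
    import com.itextpdf.kernel.pdf.PdfWriter;
    import com.itextpdf.kernel.pdf.canvas.parser.PdfCanvasProcessor;
    // PdfCanvasEditor is the helper class this answer builds on; it is not part of the iText distribution.

    try (   PdfReader pdfReader = new PdfReader(SOURCE_PDF);
            PdfWriter pdfWriter = new PdfWriter(RESULT_PDF);
            PdfDocument pdfDocument = new PdfDocument(pdfReader, pdfWriter) )
    {
        PdfCanvasEditor editor = new PdfCanvasEditor()
        {
            // Called for every instruction of the content stream; whatever is written here ends up in the result.
            @Override
            protected void write(PdfCanvasProcessor processor, PdfLiteral operator, List<PdfObject> operands)
            {
                String operatorString = operator.toString();
    
                // operands ends with the operator itself, so "r g b rg" has size 4
                if (SET_FILL_RGB.equals(operatorString) && operands.size() == 4) {
                    if (isApproximatelyEqual(operands.get(0), 0) &&
                            isApproximatelyEqual(operands.get(1), 0) &&
                            isApproximatelyEqual(operands.get(2), 1)) {
                        // Replace "0 0 1 rg" with "0 g", i.e. black in DeviceGray
                        super.write(processor, new PdfLiteral("g"), Arrays.asList(new PdfNumber(0), new PdfLiteral("g")));
                        return;
                    }
                }
                
                // Everything else is written back unchanged
                super.write(processor, operator, operands);
            }
    
            boolean isApproximatelyEqual(PdfObject number, float reference) {
                return number instanceof PdfNumber && Math.abs(reference - ((PdfNumber)number).floatValue()) < 0.01f;
            }
    
            final String SET_FILL_RGB = "rg";
        };
        // iText page numbers start at 1
        for (int i = 1; i <= pdfDocument.getNumberOfPages(); i++)
        {
            editor.editPage(pdfDocument, i);
        }
    }
    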
    

    (ChangeColor test testChangeFillRgbBlueToBlack)
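
    To make the effect of the override concrete, here is a minimal, iText-free sketch of the same rewrite applied to a content stream given as plain text. The class name FillBlueToBlackDemo, the method rewrite and the sample stream are made up for illustration. Splitting on whitespace is a deliberate simplification and not how iText parses content streams.

```java
import java.util.ArrayList;
import java.util.List;

class FillBlueToBlackDemo {
    // Same tolerance as isApproximatelyEqual in the answer's code.
    static boolean approx(String token, float reference) {
        try {
            return Math.abs(reference - Float.parseFloat(token)) < 0.01f;
        } catch (NumberFormatException e) {
            return false; // not a number, e.g. an operator or a name
        }
    }

    // Toy rewrite: replaces every "0 0 1 rg" by "0 g" and keeps all other tokens.
    static String rewrite(String contentStream) {
        List<String> out = new ArrayList<>();
        for (String token : contentStream.trim().split("\\s+")) {
            int n = out.size();
            if (token.equals("rg") && n >= 3
                    && approx(out.get(n - 3), 0)
                    && approx(out.get(n - 2), 0)
                    && approx(out.get(n - 1), 1)) {
                // Drop the three RGB operands and emit black in DeviceGray instead.
                out.subList(n - 3, n).clear();
                out.add("0");
                out.add("g");
            } else {
                out.add(token);
            }
        }
        return String.join(" ", out);
    }

    public static void main(String[] args) {
        // Hypothetical underline: blue fill color, then a slim filled rectangle.
        String underline = "0 0 1 rg 72 700 120 0.5 re f";
        System.out.println(rewrite(underline));
    }
}
```

    The rectangle instructions (re, f) pass through untouched; only the color that is in effect when they are executed changes.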

    Beware, this is merely a proof-of-concept, not a final and complete solution. In particular:

    • It merely looks at the fill (non-stroking) colors. In your case that suffices as both your text (as usual) and your underline use fill colors only - the underline actually is not drawn as a stroked line but instead as a slim, filled rectangle.
    • Only RGB blue (and only such blue set using the rg instruction, not set using sc or scn, let alone blues combined out of other colors using funky blend modes) is considered. This might be an issue particularly in case of documents explicitly designed for printing (likely using CMYK colors).
    • PdfCanvasEditor only inspects and edits the content stream of the page itself, not the content streams of displayed form XObjects or patterns; thus, some content may not be found. It can be generalized fairly easily.
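
    The CMYK limitation from the second item could be approached along the following lines. This is only a sketch under stated assumptions: it uses the naive, profile-free CMYK-to-RGB conversion (print workflows with ICC profiles will yield different values), and the class and method names are invented for illustration. In a content stream such colors are set with the k (fill) and K (stroke) operators, which take four numeric operands.

```java
class CmykBlueCheck {
    // Naive CMYK-to-RGB conversion; ignores ICC profiles (an assumption of this sketch).
    static float[] naiveCmykToRgb(float c, float m, float y, float k) {
        return new float[] { (1 - c) * (1 - k), (1 - m) * (1 - k), (1 - y) * (1 - k) };
    }

    // True if the converted color is within 0.01 of RGB 0 0 1 in every component.
    static boolean isApproximatelyPureBlue(float c, float m, float y, float k) {
        float[] rgb = naiveCmykToRgb(c, m, y, k);
        return Math.abs(rgb[0]) < 0.01f
                && Math.abs(rgb[1]) < 0.01f
                && Math.abs(1 - rgb[2]) < 0.01f;
    }

    public static void main(String[] args) {
        System.out.println(isApproximatelyPureBlue(1, 1, 0, 0)); // full cyan plus full magenta
        System.out.println(isApproximatelyPureBlue(0, 0, 0, 1)); // pure black
    }
}
```

    In a write override such a check would presumably be applied when the operator is k or K and the operand list has five entries (four numbers plus the operator).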

    The result: [screenshot]

    Replacing blue fill and stroke colors

    Testing the code above, you soon found documents in which the underlines were not changed. As it turned out, these underlines are actually drawn as stroked lines, not as filled rectangles as above.

    To also properly edit such documents, therefore, you must not only edit the fill colors but also the stroke colors, e.g. like this:

    try (   PdfReader pdfReader = new PdfReader(SOURCE_PDF);
            PdfWriter pdfWriter = new PdfWriter(RESULT_PDF);
            PdfDocument pdfDocument = new PdfDocument(pdfReader, pdfWriter) )
    {
        PdfCanvasEditor editor = new PdfCanvasEditor()
        {
            @Override
            protected void write(PdfCanvasProcessor processor, PdfLiteral operator, List<PdfObject> operands)
            {
                String operatorString = operator.toString();
    
                // Fill color: replace "0 0 1 rg" with "0 g" (black in DeviceGray)
                if (SET_FILL_RGB.equals(operatorString) && isPureBlue(operands)) {
                    super.write(processor, new PdfLiteral("g"), Arrays.asList(new PdfNumber(0), new PdfLiteral("g")));
                    return;
                }
    
                // Stroke color: replace "0 0 1 RG" with "0 G"
                if (SET_STROKE_RGB.equals(operatorString) && isPureBlue(operands)) {
                    super.write(processor, new PdfLiteral("G"), Arrays.asList(new PdfNumber(0), new PdfLiteral("G")));
                    return;
                }
    
                // Everything else is written back unchanged
                super.write(processor, operator, operands);
            }
    
            // operands holds the three color components followed by the operator itself
            boolean isPureBlue(List<PdfObject> operands) {
                return operands.size() == 4 &&
                        isApproximatelyEqual(operands.get(0), 0) &&
                        isApproximatelyEqual(operands.get(1), 0) &&
                        isApproximatelyEqual(operands.get(2), 1);
            }
    
            boolean isApproximatelyEqual(PdfObject number, float reference) {
                return number instanceof PdfNumber && Math.abs(reference - ((PdfNumber)number).floatValue()) < 0.01f;
            }
    
            final String SET_FILL_RGB = "rg";
            final String SET_STROKE_RGB = "RG";
        };
        for (int i = 1; i <= pdfDocument.getNumberOfPages(); i++)
        {
            editor.editPage(pdfDocument, i);
        }
    }
    

    (ChangeColor tests testChangeRgbBlueToBlackControlOfNitrosamineImpuritiesInSartansRev and testChangeRgbBlueToBlackEdqmReportsIssuesOfNonComplianceWithToothMac)
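
    Whether a given document draws its underlines as filled rectangles or as stroked lines can be checked by looking at which operators its (decompressed) content stream uses. The following iText-free sketch counts a few relevant operators in a content stream given as text. The class name, the selection of operators and the sample stream are assumptions for illustration, and whitespace splitting would miscount in real streams that contain string literals or inline images.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class ColorOperatorCounter {
    // rg / RG: set fill / stroke RGB color; f: fill the path; S: stroke the path.
    static final Set<String> WATCHED = new HashSet<>(Arrays.asList("rg", "RG", "f", "S"));

    // Counts how often each watched operator occurs in the given stream text.
    static Map<String, Integer> count(String contentStream) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String token : contentStream.trim().split("\\s+")) {
            if (WATCHED.contains(token)) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical underline: blue stroke color, line width, then a stroked line.
        String strokedUnderline = "0 0 1 RG 0.5 w 72 700 m 192 700 l S";
        System.out.println(count(strokedUnderline));
    }
}
```

    A stream that reports RG and S but no rg and f near the underline is a hint that only the stroke color handling will have an effect there.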

    The results: [two screenshots]

    Replacing different shades of blue from other RGB'ish color spaces

    Testing the code above, you again found documents in which the blue colors were not changed. As it turned out, these blues were not specified in the standard DeviceRGB color space but in ICCBased color spaces, profiled RGB color spaces to be more exact. Accordingly, different color-setting operators were used than before: sc / scn instead of rg. Furthermore, one document used not a pure blue 0 0 1 but a .17255 .3098 .63529 blue.

    If we assume that sc and scn instructions with three numeric arguments set some flavor of RGB color, as they do here (in general this is an oversimplification, since Lab and other color spaces also have three components, but your documents seem RGB oriented), and if we are less strict in recognizing blue, we can generalize the code above as follows:

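    import java.util.Arrays;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;
    import com.itextpdf.kernel.pdf.PdfLiteral;
    import com.itextpdf.kernel.pdf.PdfNumber;
    import com.itextpdf.kernel.pdf.PdfObject;
    import com.itextpdf.kernel.pdf.canvas.parser.PdfCanvasProcessor;
    // PdfCanvasEditor is the helper class this answer builds on; it is not part of the iText distribution.

    class AllRgbBlueToBlackConverter extends PdfCanvasEditor {
        @Override
        protected void write(PdfCanvasProcessor processor, PdfLiteral operator, List<PdfObject> operands)
        {
            String operatorString = operator.toString();
    
            // Three color components plus the operator itself give size 4
            if (RGB_SETTER_CANDIDATES.contains(operatorString) && operands.size() == 4) {
                if (isBlue(operands.get(0), operands.get(1), operands.get(2))) {
                    // Keep the operator (and thus the color space), only set all components to 0
                    PdfNumber number0 = new PdfNumber(0);
                    operands.set(0, number0);
                    operands.set(1, number0);
                    operands.set(2, number0);
                }
            }
    
            super.write(processor, operator, operands);
        }
    
        // Heuristic: the blue component is dominant and reasonably bright
        boolean isBlue(PdfObject red, PdfObject green, PdfObject blue) {
            if (red instanceof PdfNumber && green instanceof PdfNumber && blue instanceof PdfNumber) {
                float r = ((PdfNumber)red).floatValue();
                float g = ((PdfNumber)green).floatValue();
                float b = ((PdfNumber)blue).floatValue();
                return b > .5f && r < .9f*b && g < .9f*b;
            }
            return false;
        }
    
        // Fill and stroke variants of the operators that may set a three-component color
        final Set<String> RGB_SETTER_CANDIDATES = new HashSet<>(Arrays.asList("rg", "RG", "sc", "SC", "scn", "SCN"));
    }
    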
    

    (ChangeColor helper class)
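
    To get a feeling for what the isBlue heuristic accepts, the same rule can be tried on plain float values without iText. The class name BlueHeuristicDemo is made up for illustration, and the sample colors other than the two blues mentioned above are arbitrary picks.

```java
class BlueHeuristicDemo {
    // Same rule as isBlue in AllRgbBlueToBlackConverter, on plain floats.
    static boolean isBlue(float r, float g, float b) {
        return b > .5f && r < .9f * b && g < .9f * b;
    }

    public static void main(String[] args) {
        System.out.println(isBlue(0f, 0f, 1f));               // pure blue
        System.out.println(isBlue(.17255f, .3098f, .63529f)); // the profiled blue from one of the documents
        System.out.println(isBlue(1f, 1f, 1f));               // white
        System.out.println(isBlue(0f, 0f, .4f));              // dark blue
        System.out.println(isBlue(.5f, 0f, 1f));              // violet
    }
}
```

    Both blues from the documents are accepted and white is rejected. The rule also rejects dark blues whose blue component is at most .5 and accepts a violet like .5 0 1, so the thresholds may need tuning for other documents.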

    Used like this

    try (   PdfReader pdfReader = new PdfReader(INPUT);
            PdfWriter pdfWriter = new PdfWriter(OUTPUT);
            PdfDocument pdfDocument = new PdfDocument(pdfReader, pdfWriter) ) {
        PdfCanvasEditor editor = new AllRgbBlueToBlackConverter();
        for (int i = 1; i <= pdfDocument.getNumberOfPages(); i++)
        {
            editor.editPage(pdfDocument, i);
        }
    }
    

    we get [two screenshots]
