I am using below code to remove blue colors from pdf text. It is working fine. But it is not changing underlines color, but changing text color correctly.
original fi
(The example code here uses iText 7 for Java. You mentioned neither the iText version nor your programming environment in tags or question text but your example code appears to indicate that this is your combination of choice.)
The test you based your original code on attempts explicitly only to change text color. The "underline" in your document, though, is (as far as PDF drawing is concerned) not part of the text but instead drawn as a simple path. Thus, the underline explicitly is not touched by the original code and it has to be adapted for your task.
But actually your task, changing everything blue to black, is easier to implement than only changing the blue text, e.g.
try ( PdfReader pdfReader = new PdfReader(SOURCE_PDF);
PdfWriter pdfWriter = new PdfWriter(RESULT_PDF);
PdfDocument pdfDocument = new PdfDocument(pdfReader, pdfWriter) )
{
PdfCanvasEditor editor = new PdfCanvasEditor()
{
@Override
protected void write(PdfCanvasProcessor processor, PdfLiteral operator, List operands)
{
String operatorString = operator.toString();
if (SET_FILL_RGB.equals(operatorString) && operands.size() == 4) {
if (isApproximatelyEqual(operands.get(0), 0) &&
isApproximatelyEqual(operands.get(1), 0) &&
isApproximatelyEqual(operands.get(2), 1)) {
super.write(processor, new PdfLiteral("g"), Arrays.asList(new PdfNumber(0), new PdfLiteral("g")));
return;
}
}
super.write(processor, operator, operands);
}
boolean isApproximatelyEqual(PdfObject number, float reference) {
return number instanceof PdfNumber && Math.abs(reference - ((PdfNumber)number).floatValue()) < 0.01f;
}
final String SET_FILL_RGB = "rg";
};
for (int i = 1; i <= pdfDocument.getNumberOfPages(); i++)
{
editor.editPage(pdfDocument, i);
}
}
(ChangeColor test testChangeFillRgbBlueToBlack
)
Beware, this is merely a proof-of-concept, not a final and complete solution. In particular:
PdfCanvasEditor
only inspects and edits the content stream of the page itself, not the content streams of displayed form XObjects or patterns; thus, some content may not be found. It can be generalized fairly easily.The result:
Testing the code above you soon found documents in which the underlines were not changed. As it turned out, these underlines are actually drawn as stroked lines, not as filled rectangle as above.
To also properly edit such documents, therefore, you must not only edit the fill colors but also the stroke colors, e.g. like this:
try ( PdfReader pdfReader = new PdfReader(SOURCE_PDF);
PdfWriter pdfWriter = new PdfWriter(RESULT_PDF);
PdfDocument pdfDocument = new PdfDocument(pdfReader, pdfWriter) )
{
PdfCanvasEditor editor = new PdfCanvasEditor()
{
@Override
protected void write(PdfCanvasProcessor processor, PdfLiteral operator, List operands)
{
String operatorString = operator.toString();
if (SET_FILL_RGB.equals(operatorString) && operands.size() == 4) {
if (isApproximatelyEqual(operands.get(0), 0) &&
isApproximatelyEqual(operands.get(1), 0) &&
isApproximatelyEqual(operands.get(2), 1)) {
super.write(processor, new PdfLiteral("g"), Arrays.asList(new PdfNumber(0), new PdfLiteral("g")));
return;
}
}
if (SET_STROKE_RGB.equals(operatorString) && operands.size() == 4) {
if (isApproximatelyEqual(operands.get(0), 0) &&
isApproximatelyEqual(operands.get(1), 0) &&
isApproximatelyEqual(operands.get(2), 1)) {
super.write(processor, new PdfLiteral("G"), Arrays.asList(new PdfNumber(0), new PdfLiteral("G")));
return;
}
}
super.write(processor, operator, operands);
}
boolean isApproximatelyEqual(PdfObject number, float reference) {
return number instanceof PdfNumber && Math.abs(reference - ((PdfNumber)number).floatValue()) < 0.01f;
}
final String SET_FILL_RGB = "rg";
final String SET_STROKE_RGB = "RG";
};
for (int i = 1; i <= pdfDocument.getNumberOfPages(); i++)
{
editor.editPage(pdfDocument, i);
}
}
(ChangeColor tests testChangeRgbBlueToBlackControlOfNitrosamineImpuritiesInSartansRev
and testChangeRgbBlueToBlackEdqmReportsIssuesOfNonComplianceWithToothMac
)
The results:
and
Testing the code above you again found documents in which the blue colors were not changed. As it turned out, these blue colors were not from the DeviceRGB standard RGB but instead from ICCBased colorspaces, profiled RGB color spaces to be more exact. In particular other color setting operators were used than before, sc / scn instead of rg. Furthermore, in one document not a pure blue 0 0 1
but instead a .17255 .3098 .63529
blue was used
If we assume that sc and scn instructions with three numeric arguments set some flavor of RGB colors as here (in general this is an oversimplification, Lab and other color spaces can also come with 4 components, but your documents seem RGB oriented) and are less strict in recognizing the blue color, we can generalize the code above as follows:
class AllRgbBlueToBlackConverter extends PdfCanvasEditor {
@Override
protected void write(PdfCanvasProcessor processor, PdfLiteral operator, List operands)
{
String operatorString = operator.toString();
if (RGB_SETTER_CANDIDATES.contains(operatorString) && operands.size() == 4) {
if (isBlue(operands.get(0), operands.get(1), operands.get(2))) {
PdfNumber number0 = new PdfNumber(0);
operands.set(0, number0);
operands.set(1, number0);
operands.set(2, number0);
}
}
super.write(processor, operator, operands);
}
boolean isBlue(PdfObject red, PdfObject green, PdfObject blue) {
if (red instanceof PdfNumber && green instanceof PdfNumber && blue instanceof PdfNumber) {
float r = ((PdfNumber)red).floatValue();
float g = ((PdfNumber)green).floatValue();
float b = ((PdfNumber)blue).floatValue();
return b > .5f && r < .9f*b && g < .9f*b;
}
return false;
}
final Set RGB_SETTER_CANDIDATES = new HashSet<>(Arrays.asList("rg", "RG", "sc", "SC", "scn", "SCN"));
}
(ChangeColor helper class)
Used like this
try ( PdfReader pdfReader = new PdfReader(INPUT);
PdfWriter pdfWriter = new PdfWriter(OUTPUT);
PdfDocument pdfDocument = new PdfDocument(pdfReader, pdfWriter) ) {
PdfCanvasEditor editor = new AllRgbBlueToBlackConverter();
for (int i = 1; i <= pdfDocument.getNumberOfPages(); i++)
{
editor.editPage(pdfDocument, i);
}
}
we get
and