Apache PDFBox: Remove Spaces Between Characters
Question

We are using PDFBox to extract text from PDFs. For some PDFs, the text can't be extracted correctly. The following image shows part of one such PDF:

After text extraction we get the following text:

    3, 8 5 EU R 1 Netto 38,50 EUR 4,00

(Spaces are added between ',' and '8'.)

Here is our code:

    PDDocument pdf = PDDocument.load(reuseableInputStream);
    PDFTextStripper pdfStripper = new PDFTextStripper();
    pdfStripper.setSortByPosition(true);
    String text = pdfStripper.getText(pdf);

We tried to play with