PDF Parsing with Text and Coordinates
Question: I am currently using PDFBox to parse a PDF, and I am trying to figure out how to retrieve data about the text, such as the font (bold, size, etc.) and the location of the text. Any suggestions?

Answer 1: After poking around the (hard to find) PDFBox docs, I found this little gem. Apparently one of the examples shows exactly how to do everything you asked. Basically, you subclass PDFTextStripper and override the processTextPosition method. There, you query the TextPosition for whatever information
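To make the subclassing approach concrete, here is a minimal sketch, assuming PDFBox 2.x (where PDFTextStripper and TextPosition live in org.apache.pdfbox.text, and documents are opened with PDDocument.load). The class name GlyphInfoStripper and the looksBold helper are hypothetical names made up for this illustration. PDFBox does not hand you a simple "bold" flag on TextPosition, so the helper just guesses from the font name, which is a heuristic and will miss fonts that are bold without saying so in their name.

```java
import java.io.File;
import java.io.IOException;
import java.util.Locale;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.font.PDFont;
import org.apache.pdfbox.text.PDFTextStripper;
import org.apache.pdfbox.text.TextPosition;

// Hypothetical example class: prints font and position data for each glyph.
public class GlyphInfoStripper extends PDFTextStripper {

    public GlyphInfoStripper() throws IOException {
        super();
    }

    // Heuristic (an assumption, not a PDFBox feature): treat a font as bold
    // when its name contains "bold", e.g. "Helvetica-Bold".
    static boolean looksBold(String fontName) {
        return fontName != null
                && fontName.toLowerCase(Locale.ROOT).contains("bold");
    }

    // Called by PDFBox for each piece of text it encounters on a page.
    @Override
    protected void processTextPosition(TextPosition text) {
        PDFont font = text.getFont();
        String fontName = (font == null) ? null : font.getName();

        System.out.printf("'%s' x=%.1f y=%.1f size=%.1fpt font=%s bold?=%b%n",
                text.getUnicode(),       // the character(s) at this position
                text.getXDirAdj(),       // direction-adjusted x coordinate
                text.getYDirAdj(),       // direction-adjusted y coordinate
                text.getFontSizeInPt(),  // font size in points
                fontName,
                looksBold(fontName));

        // Keep the default behaviour so getText() still assembles the text.
        super.processTextPosition(text);
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to the PDF to inspect.
        try (PDDocument document = PDDocument.load(new File(args[0]))) {
            new GlyphInfoStripper().getText(document);
        }
    }
}
```

Running it as `java GlyphInfoStripper some.pdf` (with the PDFBox jar on the classpath) should print one line per glyph. On other PDFBox versions the package names and the way a document is loaded differ, so the imports and the PDDocument.load call would need adjusting.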