I have been battling with Google and the limited documentation of PDFMiner for the last several hours, and although I feel close, I'm just not getting what I need. I've worke
I was able to find my way around pdfminer thanks to some code by Denis Papathanasiou. The code is discussed in his blog, and you can find the source here: layout_scanner.py
In particular, take a look at the function parse_lt_objs(). In its final loop, k should be a pair containing the coordinates of that piece of text; the code currently discards it. I don't have a working coordinate extractor to post here (I wasn't interested in the coordinates myself), but it sounds like you'll have no trouble finding your way from there.
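If it helps as a starting point, here is a rough sketch of pulling the coordinates out alongside the text. It is not taken from layout_scanner.py. It assumes the pdfminer.six fork, where `extract_pages` yields page layouts and each layout object exposes a `bbox` of `(x0, y0, x1, y1)`. The names `iter_text_boxes` and `text_with_coordinates` are made up for this example.

```python
def iter_text_boxes(layout_objs):
    """Yield (bbox, text) for each layout object that carries text.

    Relies on duck typing: text objects have both get_text() and bbox;
    other containers (figures, for example) are walked recursively.
    """
    for obj in layout_objs:
        if hasattr(obj, "get_text") and hasattr(obj, "bbox"):
            # bbox is (x0, y0, x1, y1) in PDF points
            yield tuple(obj.bbox), obj.get_text()
        elif hasattr(obj, "__iter__"):
            # Container without text of its own: look at its children
            yield from iter_text_boxes(obj)


def text_with_coordinates(pdf_path):
    """Return a list of (page_number, bbox, text) for a PDF file.

    Assumption: pdfminer.six is installed. The import is done here so
    that iter_text_boxes can be used without it.
    """
    from pdfminer.high_level import extract_pages

    results = []
    for page_number, page_layout in enumerate(extract_pages(pdf_path), start=1):
        for bbox, text in iter_text_boxes(page_layout):
            results.append((page_number, bbox, text))
    return results
```

Calling `text_with_coordinates("some.pdf")` (any path you have at hand) should give one entry per text box. Keep in mind that the coordinates use the PDF convention, with the origin at the bottom-left corner of the page, so y grows upward.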
Good luck with it!