Camelot is a fantastic Python library to extract tables from a PDF file as a data frame. However, I'm looking for a solution that also returns the table description text wr
You can create the Lattice parser directly:
from camelot.parsers import Lattice

tables = []
parser = Lattice(**kwargs)
for p in pages:
    t = parser.extract_tables(p, suppress_stdout=suppress_stdout,
                              layout_kwargs=layout_kwargs)
    tables.extend(t)
Then you have access to parser.layout, which contains all the components on the page. These components all have a bbox (x0, y0, x1, y1), and the extracted tables also have a bbox object. To get the description, find the component closest to the table that sits above it and extract its text.
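The "closest component above the table" search can be sketched in plain Python. This is only an illustration, not part of Camelot: `closest_text_above` is a hypothetical helper, the coordinates are made up, and it assumes PDF-style coordinates where y grows upward, so a component is above the table when its bottom edge (y0) is at or above the table's top edge (y1).

```python
from typing import Iterable, Optional, Tuple

# (x0, y0, x1, y1); assumed PDF-style coordinates, y grows upward
BBox = Tuple[float, float, float, float]


def closest_text_above(table_bbox: BBox,
                       components: Iterable[Tuple[BBox, str]]) -> Optional[str]:
    """Hypothetical helper: text of the nearest component above the table, or None."""
    tx0, _, tx1, ty1 = table_bbox
    best_text, best_gap = None, float("inf")
    for (x0, y0, x1, _), text in components:
        # Component must share some horizontal span with the table.
        overlaps = x0 < tx1 and x1 > tx0
        # Vertical distance from the table's top edge to the component's bottom edge.
        gap = y0 - ty1
        if overlaps and 0 <= gap < best_gap and text.strip():
            best_text, best_gap = text.strip(), gap
    return best_text


# Made-up coordinates, for illustration only.
table = (50.0, 300.0, 500.0, 500.0)
page_components = [
    ((50.0, 700.0, 500.0, 720.0), "Page header"),
    ((50.0, 510.0, 400.0, 525.0), "Table 1: Quarterly revenue\n"),
    ((50.0, 100.0, 500.0, 290.0), "Paragraph below the table"),
]
caption = closest_text_above(table, page_components)
print(caption)  # Table 1: Quarterly revenue
```

With Camelot, the (bbox, text) pairs would have to be built from parser.layout. Assuming its items are pdfminer layout objects, something like `[(obj.bbox, obj.get_text()) for obj in parser.layout if hasattr(obj, "get_text")]` should work, but check this against your installed versions. Since the parser is reused across pages, parser.layout most likely reflects only the page parsed last, so do the lookup inside the loop.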