If you're scraping a table that has an explicit "thead" and "tbody", such as:
<table>
<thead>
<tr>
<th>Total</th>
<th>Finished</th>
<th>Unfinished</th>
</tr>
</thead>
<tbody>
<tr> <td>63</td> <td>33</td> <td>2</td> </tr>
<tr> <td>69</td> <td>29</td> <td>3</td> </tr>
<tr> <td>57</td> <td>28</td> <td>1</td> </tr>
</tbody>
</table>
You can use the following:
headers = [header.text_content() for header in table.cssselect("thead tr th")]
results = [
    {headers[i]: cell.text_content() for i, cell in enumerate(row.cssselect("td"))}
    for row in table.cssselect("tbody tr")
]
This will produce:
[
{"Total": "63", "Finished": "33", "Unfinished": "2"},
{"Total": "69", "Finished": "29", "Unfinished": "3"},
{"Total": "57", "Finished": "28", "Unfinished": "1"}
]
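P.S. This is using lxml.html (the ".cssselect" method also needs the cssselect package installed). If you are using BeautifulSoup, replace ".text_content()" with ".get_text()" and ".cssselect" with ".select". Note that ".findAll" takes tag names rather than CSS selectors such as "thead tr th", and ".string" can return None for cells with more than one child node.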
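A BeautifulSoup version might look like the sketch below. It assumes BeautifulSoup 4 (the bs4 package) with the built-in "html.parser" backend, and it uses select(), which accepts CSS selectors, together with get_text(). The HTML constant is illustrative.

```python
from bs4 import BeautifulSoup  # third-party: the beautifulsoup4 package

# Shortened sample markup with the same thead/tbody layout (illustrative).
HTML = """
<table>
  <thead>
    <tr><th>Total</th><th>Finished</th><th>Unfinished</th></tr>
  </thead>
  <tbody>
    <tr><td>63</td><td>33</td><td>2</td></tr>
    <tr><td>69</td><td>29</td><td>3</td></tr>
  </tbody>
</table>
"""

soup = BeautifulSoup(HTML, "html.parser")
table = soup.select_one("table")  # first <table> in the document

# select() takes a CSS selector and searches the descendants of "table".
headers = [th.get_text(strip=True) for th in table.select("thead tr th")]
results = [
    {headers[i]: td.get_text(strip=True) for i, td in enumerate(row.select("td"))}
    for row in table.select("tbody tr")
]
print(results)
```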