I'm trying to convert a table I have extracted via BeautifulSoup into JSON.
So far I've managed to isolate all the rows, though I'm not sure how to work with the
Probably your data is something like:
html_data = """
<table>
<tr>
<td>Card balance</td>
<td>$18.30</td>
</tr>
<tr>
<td>Card name</td>
<td>NAMEn</td>
</tr>
<tr>
<td>Account holder</td>
<td>NAME</td>
</tr>
<tr>
<td>Card number</td>
<td>1234</td>
</tr>
<tr>
<td>Status</td>
<td>Active</td>
</tr>
</table>
"""
From which we can get your result as a list using this code:
from bs4 import BeautifulSoup
# Name the parser explicitly so bs4 does not have to guess one (and warn about it).
table_data = [[cell.text for cell in row("td")]
              for row in BeautifulSoup(html_data, "html.parser")("tr")]
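Real tables often have header rows or stray whitespace inside cells. A slightly more defensive variant is sketched below. The helper name table_to_pairs and the sample markup are made up for illustration. It assumes you only want rows with exactly two td cells.

```python
from bs4 import BeautifulSoup

def table_to_pairs(html):
    """Return [label, value] pairs for every row that has exactly two <td> cells."""
    soup = BeautifulSoup(html, "html.parser")
    pairs = []
    for row in soup.find_all("tr"):
        # get_text(strip=True) trims leading/trailing whitespace in each cell
        cells = [cell.get_text(strip=True) for cell in row.find_all("td")]
        if len(cells) == 2:  # skips header (<th>) rows and rows with merged cells
            pairs.append(cells)
    return pairs

# Hypothetical sample with a header row and padded cell text
sample = ("<table><tr><th>Field</th><th>Value</th></tr>"
          "<tr><td>Status</td><td> Active </td></tr></table>")
print(table_to_pairs(sample))
```

Skipping rows that do not have two cells also keeps dict() from raising on them later.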
To convert the result to JSON, if you don't care about the order:
import json
print(json.dumps(dict(table_data)))  # print() form works on Python 2 and 3
Result:
{
"Status": "Active",
"Card name": "NAMEn",
"Account holder": "NAME",
"Card number": "1234",
"Card balance": "$18.30"
}
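If you need the keys in the same order as the table rows, use an OrderedDict (on Python 3.7 and later a plain dict already preserves insertion order, so this mainly matters on older versions):
from collections import OrderedDict
import json
print(json.dumps(OrderedDict(table_data)))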
Which gives you:
{
"Card balance": "$18.30",
"Card name": "NAMEn",
"Account holder": "NAME",
"Card number": "1234",
"Status": "Active"
}
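Note that json.dumps writes everything on one line by default. The results above are laid out over several lines only for readability. If you want that layout in the actual output, the indent parameter should do it. A minimal sketch follows, with table_data hardcoded to a few of the pairs from above so the snippet runs on its own:

```python
import json
from collections import OrderedDict

# Assumed to match the pairs built from the table earlier; hardcoded here
# only so this snippet is self-contained.
table_data = [
    ["Card balance", "$18.30"],
    ["Card number", "1234"],
    ["Status", "Active"],
]

# indent=2 puts each key on its own line, indented by two spaces
pretty = json.dumps(OrderedDict(table_data), indent=2)
print(pretty)
```

If the JSON is only going to be read by another program, the default single-line form is just as valid and a bit smaller.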