I am trying to scrape a simple table using Beautiful Soup. Here is my code:
import requests
from bs4 import BeautifulSoup
url = 'https://gist.githubusercon
Iterate over table and call row.find_all('td') on each element:
for row in table:
    col = row.find_all('td')
The table variable contains an array (a ResultSet), not a single element. You would need to call find_all on its members (even though you know it's an array with only one member), not on the entire thing.
>>> type(table)
<class 'bs4.element.ResultSet'>
>>> type(table[0])
<class 'bs4.element.Tag'>
>>> len(table[0].find_all('tr'))
6
>>>
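To make the difference concrete, here is a minimal sketch. The HTML string is a made-up stand-in for the page in the question (the real URL is truncated above), and the built-in "html.parser" backend is an assumption, not necessarily what the question used.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the scraped page: one table with class "dataframe".
HTML = """
<table class="dataframe">
  <tr><th>name</th><th>value</th></tr>
  <tr><td>a</td><td>1</td></tr>
  <tr><td>b</td><td>2</td></tr>
</table>
"""

soup = BeautifulSoup(HTML, "html.parser")
table = soup.find_all(class_="dataframe")  # a ResultSet (list-like), not a Tag

# Calling find_all on the ResultSet itself fails with AttributeError.
try:
    table.find_all("tr")
    failed = False
except AttributeError:
    failed = True

# Calling it on a member of the ResultSet works.
rows = table[0].find_all("tr")
print(failed, len(rows))
```

Indexing with table[0] only makes sense when you are sure there is at least one match; an empty ResultSet would raise IndexError.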
table = soup.find_all(class_='dataframe')
This gives you a result set, i.e. all the elements that match the class. You can either iterate over them or, if you know you only have one dataframe element, you can use find instead. From your code it seems the latter is what you need, to deal with the immediate problem:
table = soup.find(class_='dataframe')
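As a rough illustration of what find gives you, assuming a tiny made-up document rather than the real page: it returns a single Tag, or None when nothing matches, so a check for None is worth having before calling methods on the result.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Hypothetical one-row table standing in for the real page.
HTML = '<table class="dataframe"><tr><td>a</td><td>1</td></tr></table>'
soup = BeautifulSoup(HTML, "html.parser")

table = soup.find(class_="dataframe")        # first matching element
missing = soup.find(class_="no-such-class")  # no match at all

is_tag = isinstance(table, Tag)
n_rows = len(table.find_all("tr")) if table is not None else 0
print(is_tag, missing, n_rows)
```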
However, that is not all:
for row in table.find_all('tr'):
    col = table.find_all('td')
You probably want to iterate over the tds in the row here, rather than the whole table. (Otherwise every iteration searches the entire table again, so you'll just see the first row's cells over and over.)
for row in table.find_all('tr'):
    for col in row.find_all('td'):
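Put together, the row-then-cell iteration might look like the sketch below. The sample HTML, the parsed_rows name, and the choice to skip rows without td cells (such as a header row of th cells) are assumptions for illustration, not part of the original code.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the table in the question.
HTML = """
<table class="dataframe">
  <tr><th>name</th><th>value</th></tr>
  <tr><td>a</td><td>1</td></tr>
  <tr><td>b</td><td>2</td></tr>
</table>
"""

soup = BeautifulSoup(HTML, "html.parser")
table = soup.find(class_="dataframe")

parsed_rows = []
for row in table.find_all("tr"):
    # Search within this row only, not the whole table.
    cells = [col.get_text(strip=True) for col in row.find_all("td")]
    if cells:  # the header row has only <th> cells, so it is skipped
        parsed_rows.append(cells)

print(parsed_rows)
```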