beautifulsoup

BS4 Searching by Class_ Returning Empty

Submitted by 吃可爱长大的小学妹 on 2021-02-11 18:16:12
Question: I am currently scraping the data I need by chaining bs4 .contents calls after a find_all('div'), but that seems inherently fragile. I'd like to go directly to the tag I need by class, but my class_= search returns None. I ran the following code on the HTML below, which returns None:
soup = BeautifulSoup(text)  # this works fine
tag = soup.find(class_="loan-section-content")  # this returns None
I also tried soup.find('div', class_="loan-section-content") - also
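
A minimal sketch of what usually resolves this: when find(class_=...) returns None even though the class shows up in the browser, the element is typically injected by JavaScript or lives inside an iframe, so it never reaches the parsed tree. The markup below is a stand-in, since the question's HTML is truncated; only the class name loan-section-content is taken from the question.

from bs4 import BeautifulSoup

# Stand-in markup; the real page's HTML is not shown in full in the question.
html = '<div class="loan-section"><div class="loan-section-content">$1,000</div></div>'

soup = BeautifulSoup(html, "html.parser")

# Either form returns the tag when the class is actually present in the
# markup that was parsed.
tag = soup.find("div", class_="loan-section-content")
tag_css = soup.select_one("div.loan-section-content")
print(tag.get_text(strip=True))   # -> $1,000

# If this still yields None on the real page, print(soup.prettify()) to see
# what was actually parsed; JavaScript-rendered content and iframe contents do
# not appear in the HTML handed to BeautifulSoup.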

Is it possible to just get the tags without a class or id with BeautifulSoup?

Submitted by 元气小坏坏 on 2021-02-11 18:16:01
Question: I have several thousand HTML pages and I am trying to filter the text from them. I am doing this with Beautiful Soup. get_text() gives me too much unnecessary information from these pages, so I wrote a loop:
l = []
for line in text5:
    soup = bs(line, 'html.parser')
    p_text = ' '.join(p.text for p in soup.find_all('p'))
    k = p_text.replace('\n', '')
    l.append(k)
But this loop gives me everything that was in a tag that starts with <p. For example: I want everything between two plain
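
One way to keep only the bare paragraphs is to filter on the tag's attributes inside the loop. A sketch, under the assumption that "plain" paragraphs are exactly those <p> tags carrying no class and no id:

from bs4 import BeautifulSoup

html = '<p>plain text</p><p class="ad">ad copy</p><p id="nav">navigation</p>'
soup = BeautifulSoup(html, "html.parser")

# Keep only <p> tags that have neither a class nor an id attribute.
plain = [
    p.get_text(strip=True)
    for p in soup.find_all("p")
    if not p.get("class") and not p.get("id")
]
print(plain)   # -> ['plain text']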

how to use selenium to go from one url tab to another before scraping?

Submitted by 夙愿已清 on 2021-02-11 17:01:34
Question: I have created the following code hoping to open a new tab with a few parameters and then scrape the data table that is on the new tab.
# Open Webpage
url = "https://www.website.com"
driver = webdriver.Chrome(executable_path=r"C:\mypathto\chromedriver.exe")
driver.get(url)
# Click Necessary Parameters
driver.find_element_by_partial_link_text('Output').click()
driver.find_element_by_xpath('//*[@id="flexOpt"]/table/tbody/tr/td[2]/input[3]').click()
driver.find_element_by_xpath('//*[@id=
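
The piece that is usually missing is driver.switch_to.window(): clicking a link that opens a new tab does not move the driver to that tab. A sketch, keeping the URL and link text as placeholders from the question and using the modern find_element(By, ...) call in place of the deprecated find_element_by_* helpers:

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://www.website.com")          # placeholder URL from the question

original = driver.current_window_handle
driver.find_element(By.PARTIAL_LINK_TEXT, "Output").click()   # this opens the new tab

# Move the driver to whichever window handle is not the original one.
for handle in driver.window_handles:
    if handle != original:
        driver.switch_to.window(handle)
        break

# From here on, page_source and find_element refer to the new tab, so the
# data table can be handed to BeautifulSoup or pandas.read_html for scraping.
table_html = driver.page_source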

python requests not getting full page

Submitted by 巧了我就是萌 on 2021-02-11 16:52:08
Question: This is my code:
import requests
from bs4 import BeautifulSoup
import random
from selenium import webdriver

url = "http://www.yopmail.com/en/?smith"
request = requests.get(url)
soup = BeautifulSoup(request.text, 'html5lib')
print(soup)
It returns this output:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"><head> <meta content="text/html; charset=utf-8" http-equiv=

AttributeError: ResultSet object has no attribute 'get_text'. You're probably treating a list of elements like a single element

Submitted by 删除回忆录丶 on 2021-02-11 16:41:30
Question: I got the following list of lists from parsing with Bs4 through the snippet:
details = [i.find_all('span', {'class':re.compile('item')}) for i in cars]
[[<span class="item">Red <small>col.</small></span>, <span class="item">120 <small>cc.</small></span>, <span class="item">Available <small>in four days</small></span>, <span class="item"><small class="txt-highlight-red">15 min</small></span>], [<span class="item">Blue <small>col.</small></span>, <span class="item">200 <small>cc.</small></span>
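
find_all() returns a ResultSet, which behaves like a list, so get_text() has to be called on each tag inside it rather than on the list itself. A self-contained sketch; the markup and the `cars` containers are made up to mirror the spans shown in the question:

import re
from bs4 import BeautifulSoup

# Made-up markup mirroring the question's spans; the real car containers
# are not shown there.
html = (
    '<div class="car"><span class="item">Red <small>col.</small></span>'
    '<span class="item">120 <small>cc.</small></span></div>'
    '<div class="car"><span class="item">Blue <small>col.</small></span>'
    '<span class="item">200 <small>cc.</small></span></div>'
)
soup = BeautifulSoup(html, "html.parser")
cars = soup.find_all("div", class_="car")

details = [i.find_all("span", {"class": re.compile("item")}) for i in cars]

# Iterate over each inner ResultSet and call get_text() on the individual tags.
texts = [[span.get_text(" ", strip=True) for span in row] for row in details]
print(texts)   # -> [['Red col.', '120 cc.'], ['Blue col.', '200 cc.']]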

get financial data using Python

Submitted by 点点圈 on 2021-02-11 16:31:43
Question: I have managed to write some Python code with Selenium that navigates to a webpage containing financial data in some tables. I want to be able to extract the data and put it into Excel. The tables seem to be HTML-based; code below:
<tr> <td class="bc2T bc2gt">Last update</td> <td class="bc2V bc2D">03/15/2018</td><td class="bc2V bc2D">03/14/2019</td><td class="bc2V bc2D">03/12/2020</td><td class="bc2V bc2D" style="background-color:#DEFEFE;">05/22/2020</td><td class="bc2V bc2D"
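
Since the data sits in ordinary HTML <table> rows (the bc2* classes are just styling), one approach is to hand the Selenium-rendered page to pandas, which parses every table into a DataFrame and writes Excel directly. A sketch, assuming pandas plus an Excel writer such as openpyxl are installed; the URL and the table index are placeholders, not values from the question:

import pandas as pd
from selenium import webdriver

driver = webdriver.Chrome()
driver.get("https://example.com/financials")   # placeholder for the real page

# read_html parses every <table> in the rendered HTML into a DataFrame.
tables = pd.read_html(driver.page_source)
financials = tables[0]                         # pick the table you need by index
financials.to_excel("financials.xlsx", index=False)
driver.quit()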