web-scraping

BeautifulSoup does not read 'full' HTML obtained by requests

只谈情不闲聊 submitted on 2021-02-11 02:54:25
Question: I am trying to scrape URLs from a website presented as HTML using the BeautifulSoup and requests libraries. I am running both of them on Python 3.5. It seems I am successfully getting the HTML from requests, because when I display r.content the full HTML of the website I am trying to scrape is displayed. However, when I pass this to BeautifulSoup, it drops the bulk of the HTML, including the URL I am trying to scrape.
    from bs4 import BeautifulSoup
    import requests
    page = requests
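One way to narrow this kind of problem down is to count the links in the raw response with a parser that does not go through BeautifulSoup at all. The sketch below uses only the standard library's html.parser; the names LinkCollector, extract_links and sample_html are hypothetical, and sample_html merely stands in for the text of the response. If the links show up here but not in the soup, the parser passed to BeautifulSoup is a likely suspect, since its supported parsers repair malformed markup differently.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html_text):
    """Return all href values found in html_text (hypothetical helper)."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links


# Hypothetical sample standing in for the response text; note the unclosed <p>.
sample_html = (
    '<html><body><p>intro<a href="https://example.com/a">A</a>'
    '<div><a href="/b">B</a></div></body></html>'
)
links = extract_links(sample_html)
print(links)
```

Comparing len(links) with the number of anchors BeautifulSoup reports for the same text shows whether anything was lost during parsing.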

Not able to get element by xpath inside div with ::before

假装没事ソ submitted on 2021-02-11 02:11:14
Question: I need to get a list of web elements using the WebDriver call findElements(By.xpath("")). I get the list with the XPath //*[@class=\"providers-list clearfix\"]. However, I get an error whenever I try to fetch an element inside:
    <div class="providers-list clearfix">::before
      <div class="data-container">..</div>
    </div>
The XPath // [@class=\"data-container\"] gives me the error: no such element: Unable to locate element: {"method":"xpath","selector":"// [@class="data-container"]"}
Answer 1:
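A point worth keeping in mind here: ::before is a CSS pseudo-element that browser developer tools display, but it is not a node in the DOM, so an XPath expression neither matches it nor is blocked by it. The sketch below illustrates this with Python's standard-library ElementTree instead of Selenium, which is a swapped-in tool for illustration only. ElementTree supports just a subset of XPath, and the markup string is a hypothetical reconstruction of the snippet in the question with the pseudo-element left out, as it would be in the actual document tree.

```python
import xml.etree.ElementTree as ET

# Hypothetical reconstruction of the markup from the question.
# "::before" is omitted because it is generated by CSS, not stored in the DOM.
markup = """
<div class="providers-list clearfix">
  <div class="data-container">..</div>
</div>
""".strip()

root = ET.fromstring(markup)

# ".//*" means "any descendant element"; the predicate filters on the class attribute.
matches = root.findall(".//*[@class='data-container']")

for element in matches:
    print(element.tag, element.get("class"))
```

If the equivalent lookup fails in a real browser session, the usual things to check are whether the element exists yet at the time of the call and whether it lives inside a frame; neither can be confirmed from the excerpt.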

Scraping facebook likes, comments and shares with Beautiful Soup

半世苍凉 submitted on 2021-02-10 20:38:45
Question: I want to scrape the number of likes, comments and shares with Beautiful Soup and Python. I have written some code, but it returns an empty list and I do not know why. This is the code:
    from bs4 import BeautifulSoup
    import requests
    website = "https://www.facebook.com/nike"
    soup = requests.get(website).text
    my_html = BeautifulSoup(soup, 'lxml')
    list_of_likes = my_html.find_all('span', class_='_81hb')
    print(list_of_likes)
    for i in list_of_likes:
        print(i)
The same happens with comments and likes. What should
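An empty result from find_all means that no matching element was present in the HTML that was actually parsed. A common cause is markup that the page adds with JavaScript after loading, which requests does not execute; whether that applies to this particular page is an assumption, not something the excerpt confirms. The sketch below is a standard-library check for whether a tag with a given class appears in a static HTML string at all. ClassCounter, count_static_matches and both sample strings are hypothetical.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Count start tags of one kind that carry a given CSS class."""

    def __init__(self, tag, class_name):
        super().__init__()
        self.tag = tag
        self.class_name = class_name
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag != self.tag:
            return
        # The class attribute may hold several space-separated names, or be absent.
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self.count += 1


def count_static_matches(html_text, tag, class_name):
    """Return how many <tag> elements with class_name occur in html_text."""
    counter = ClassCounter(tag, class_name)
    counter.feed(html_text)
    counter.close()
    return counter.count


# Hypothetical samples: one with the span in the markup, one that relies on a script.
static_html = '<div><span class="_81hb other">12</span><span class="x">3</span></div>'
script_only_html = '<div id="root"></div><script>render()</script>'

print(count_static_matches(static_html, "span", "_81hb"))
print(count_static_matches(script_only_html, "span", "_81hb"))
```

Running the same check on the downloaded text tells you whether the class is missing from the response itself or lost later in parsing.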

Inserting NA in blank values from web scraping

无人久伴 submitted on 2021-02-10 20:11:25
Question: I am scraping some data into a data frame and am getting some empty fields where I would prefer to have NA. I have tried na.strings, but either I am putting it in the wrong place or it just isn't working. I also tried to gsub anything that was whitespace from the beginning of the line to the end, but that didn't work either.
    htmlpage <- read_html("http://www.gourmetsleuth.com/features/wine-cheese-pairing-guide")
    sugPairings <- html_nodes(htmlpage, ".meta-wrapper")
    suggestions <- html_text
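The underlying idea, treating a scraped string that is empty or whitespace-only as missing, can be sketched independently of R. The example below is written in Python rather than the question's R, with None standing in for NA; blank_to_missing and the scraped list are hypothetical names and data, not taken from the page in the question.

```python
def blank_to_missing(values):
    """Strip each scraped string and replace empty results with None."""
    cleaned = []
    for value in values:
        # Treat an already-missing value like an empty string.
        text = value.strip() if value is not None else ""
        cleaned.append(text if text else None)
    return cleaned


# Hypothetical scraped cell texts, including empty and whitespace-only entries.
scraped = ["Brie", "", "  \n\t", "Merlot ", None]
result = blank_to_missing(scraped)
print(result)
```

Normalising right after the text is extracted, before the values go into a data frame, keeps the missing-value handling in one place.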
