web-scraping

Can't scrape Google search results with BeautifulSoup

北慕城南 submitted on 2021-02-10 14:55:51
Question: I want to scrape Google search results, but whenever I try, the program returns an empty list:

    from bs4 import BeautifulSoup
    import requests

    keyWord = input("Input Your KeyWord :")
    url = f'https://www.google.com/search?q={keyWord}'
    src = requests.get(url).text
    soup = BeautifulSoup(src, 'lxml')
    container = soup.findAll('div', class_='g')
    print(container)

Answer 1: To get the correct result page from Google, specify a User-Agent HTTP header. For English-only results, add the hl=en parameter to the URL:
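A minimal sketch of what this answer describes, assuming a browser-like User-Agent string (the exact value below is only an example) and the same 'g' class the question uses, which may not match Google's current markup. The helper names build_search_url and fetch_results are hypothetical.

```python
from urllib.parse import urlencode

# Assumed browser-like User-Agent; any current browser string could be used instead.
HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/88.0 Safari/537.36"
    )
}


def build_search_url(keyword, lang="en"):
    """Return a search URL with the keyword URL-encoded and hl set to lang."""
    return "https://www.google.com/search?" + urlencode({"q": keyword, "hl": lang})


def fetch_results(keyword):
    """Fetch the result page and return the div elements with class 'g'."""
    # Third-party imports live here so build_search_url works without them.
    import requests
    from bs4 import BeautifulSoup

    response = requests.get(build_search_url(keyword), headers=HEADERS, timeout=10)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "lxml")
    return soup.find_all("div", class_="g")


if __name__ == "__main__":
    print(fetch_results(input("Input Your KeyWord :")))
```

Building the query with urlencode also escapes spaces and special characters in the keyword, which the f-string in the question does not.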

How to count the elements from a class with VBA + SELENIUM in chrome?

一世执手 submitted on 2021-02-10 12:26:08
Question: I want to count the elements so that I can get all their names and store them in an array. The names are highlighted in this image. The names are stored in "js-list list-wrapper", as shown in the image. My code:

    Public Sub seleniumtutorial()
        Dim bot As New SeleniumWrapper.WebDriver
        Dim a As WebElement
        Dim b As WebElement
        Dim x() As Integer
        bot.Start "chrome", "https://trello.com/login"
        bot.get "/"
        bot.Type "name=user", "biaverly@id.uff.br"
        bot.Type "name=password", "carambola69"
        bot.clickAndWait

How to scrape a page if it is redirected to another before

£可爱£侵袭症+ submitted on 2021-02-10 12:18:30
Question: I am trying to scrape some text from https://www.memrise.com/course/2021573/french-1-145/garden/speed_review/?source_element=ms_mode&source_screen=eos_ms , but when the link loads through the web driver it automatically redirects to a login page. After I log in, it goes straight to the page I want to scrape, but Beautiful Soup keeps scraping the login page. How do I make Beautiful Soup scrape the page I want and not the login page? I have already

Scrapy only returns first result

南笙酒味 submitted on 2021-02-10 09:35:51
Question: I'm trying to scrape the preformatted HTML seen here, but my code only returns 1 price instead of all 10 prices. Code seen here:

    class MySpider(BaseSpider):
        name = "working1"
        allowed_domains = ["steamcommunity.com"]
        start_urls = ["http://steamcommunity.com/market/search/render/?query=&appid=440"]

        def parse(self, response):
            sel = Selector(response)
            price = sel.xpath("//text()[contains(.,'$')]").extract()[0].replace('\\r\\n\\t\\t\\t\\r\\n\\t\\t\\t','')
            print price

I'm super new to scrapy/xpath so I
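One likely cause, visible in the code itself, is the [0] index after extract(): it keeps only the first matched text node. A minimal sketch of cleaning every match instead, assuming the extracted strings contain literal backslash escapes as the question's replace() call suggests. The helper name clean_prices and the sample values are hypothetical.

```python
def clean_prices(text_nodes):
    """Return the cleaned text of every extracted node that contains a price."""
    prices = []
    for text in text_nodes:
        # Remove the literal "\r", "\n" and "\t" sequences, then trim whitespace.
        cleaned = (
            text.replace("\\r", "").replace("\\n", "").replace("\\t", "").strip()
        )
        if "$" in cleaned:
            prices.append(cleaned)
    return prices


# Made-up strings standing in for the list that .extract() returns.
sample = ["\\r\\n\\t\\t\\t\\r\\n\\t\\t\\t$0.03", "$1.25 USD", "\\r\\n"]
prices = clean_prices(sample)
for price in prices:
    print(price)
```

Inside the spider's parse method this could be called as clean_prices(sel.xpath("//text()[contains(.,'$')]").extract()), passing the whole list rather than its first element.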

BeautifulSoup can't find required div

烈酒焚心 submitted on 2021-02-10 07:01:22
Question: I have been trying to get at a nested div and its contents but am not able to. I want to access the div with class 'box coursebox'.

    response = res.read()
    soup = BeautifulSoup(response, "html.parser")
    div = soup.find_all('div', attrs={'class':'box coursebox'})

The above code returns 0 elements when there should be 8. The find_all calls before this line work perfectly. Thanks for helping!

Answer 1: In the case of attributes having more than one value, Beautiful Soup puts all the values into a
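A short sketch of the behavior this answer refers to, using made-up HTML rather than the asker's page. It shows that the class attribute is stored as a list of values, and that a CSS selector matches each class independently of order or extra classes. Whether this was the actual cause on the asker's page is an assumption.

```python
from bs4 import BeautifulSoup

# Made-up markup: the two course divs carry the classes in different forms.
HTML = """
<div class="box coursebox clearfix"><h3>Course A</h3></div>
<div class="coursebox box"><h3>Course B</h3></div>
<div class="box"><h3>Not a course</h3></div>
"""

soup = BeautifulSoup(HTML, "html.parser")

# A multi-valued attribute such as class is exposed as a list of strings.
first_classes = soup.find("div")["class"]
print(first_classes)

# Passing 'box coursebox' as one string is compared against the attribute
# value as a whole, so extra classes or a different order prevent a match.
exact = soup.find_all("div", attrs={"class": "box coursebox"})

# A CSS selector requires both classes but ignores order and extra classes.
by_selector = soup.select("div.box.coursebox")
names = [div.h3.get_text() for div in by_selector]
print(names)
```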
