I have a problem scraping an e-commerce site using BeautifulSoup. I did some Googling but I still can't solve the problem.
Please refer to the
Welcome to StackOverflow! You can inspect where the AJAX request is being sent and replicate it.
In this case the request goes to this API URL. You can then use requests
to perform a similar request. Note, however, that this API endpoint requires a valid User-Agent header. You can use a package like fake-useragent or just hardcode a string for the agent.
import requests

# Option 1: generate a User-Agent string with fake-useragent
from fake_useragent import UserAgent
user_agent = UserAgent().chrome

# Option 2: hardcode a User-Agent string (this overrides option 1)
user_agent = 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1468.0 Safari/537.36'

url = 'https://shopee.com.my/api/v2/search_items/?by=relevancy&keyword=h370m&limit=50&newest=0&order=desc&page_type=search'

# The endpoint rejects requests without a browser-like User-Agent header
resp = requests.get(url, headers={
    'User-Agent': user_agent
})

data = resp.json()
products = data.get('items')
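If you just want a quick look at what comes back, you can loop over products. The field names below (name, price) are assumptions about the JSON shape rather than documented keys, so inspect resp.json() yourself and adjust:
# Sketch only: 'name' and 'price' are assumed keys, check the actual JSON first
if products:
    for item in products:
        print(item.get('name'), item.get('price'))
else:
    print('No items returned:', data)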
Welcome to StackOverflow! :)
As an alternative, you can check out Selenium.
See the example usage from its documentation:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

driver = webdriver.Firefox()
driver.get("http://www.python.org")
assert "Python" in driver.title

# Find the search box, type a query and submit it
elem = driver.find_element(By.NAME, "q")
elem.clear()
elem.send_keys("pycon")
elem.send_keys(Keys.RETURN)
assert "No results found." not in driver.page_source

driver.close()
When you use requests (or a library like Scrapy), JavaScript is usually not executed, so dynamically loaded content never makes it into the HTML you receive. As @dmitrybelyakov mentioned, you can replay those AJAX calls directly or imitate normal user interaction using Selenium.
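Since the question mentions BeautifulSoup, here is a rough sketch of combining the two: let Selenium render the page, then parse driver.page_source with BeautifulSoup. The search URL, the sleep-based wait, and the CSS selector are assumptions (Shopee's markup changes often), so inspect the rendered page yourself:
import time
from bs4 import BeautifulSoup
from selenium import webdriver

driver = webdriver.Firefox()
# Assumed search URL format; adjust to the page you are actually scraping
driver.get("https://shopee.com.my/search?keyword=h370m")
time.sleep(5)  # crude wait for the JavaScript to render; WebDriverWait is nicer

# Hand the fully rendered HTML over to BeautifulSoup
soup = BeautifulSoup(driver.page_source, "html.parser")
driver.close()

# The class name below is a placeholder; inspect the live page for real selectors
for title in soup.select("div.item-card-title"):
    print(title.get_text(strip=True))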