My friend asked if I could write a web scraping script to collect Pokémon data from a specific website.
I've written the following code to render the JavaScript and
The data is actually present in the page source; see view-source:https://www.smogon.com/dex/ss/pokemon/
(It sits inside a script tag, assigned to a JavaScript variable.)
import requests
import re
import json
response = requests.get('https://www.smogon.com/dex/ss/pokemon/')
# This regex captures the JSON text assigned to dexSettings in the response body
data = "".join(re.findall(r'dexSettings = (\{.*\})', response.text))
# The regex only gives us a string; json.loads() parses it into
# Python dicts and lists that we can work with
data = json.loads(data)
# Now we can query the parsed data like any other dict
data = data.get('injectRpcs', [])[1][1].get('items', [])
for row in data:
    print(row.get('name', ''))
    print(row.get('description', ''))
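If you want something a little more defensive, here is a minimal sketch of the same idea wrapped in a function that fails loudly when the page layout changes. The function name `extract_items` and the `SAMPLE_HTML` string are made up for illustration (the sample only imitates the structure the code above assumes, it is not real site data), so you can try the parsing without hitting the network.

```python
import json
import re

# Hypothetical sample page: it only mimics the layout assumed above
# (a dexSettings variable whose injectRpcs[1][1] holds an "items" list).
SAMPLE_HTML = """
<html><body>
<script type="text/javascript">
    dexSettings = {"injectRpcs": [["first", {}], ["second", {"items": [{"name": "Bulbasaur", "description": "Made-up sample text."}, {"name": "Ivysaur"}]}]]}
</script>
</body></html>
"""


def extract_items(html):
    """Return the list of item dicts embedded in the dexSettings variable."""
    # re.search returns the first match (or None) instead of a list of matches
    match = re.search(r"dexSettings = (\{.*\})", html)
    if match is None:
        raise ValueError("dexSettings assignment not found in page source")

    settings = json.loads(match.group(1))

    # Check the nesting before indexing, so a layout change gives a clear error
    rpcs = settings.get("injectRpcs", [])
    if len(rpcs) < 2 or len(rpcs[1]) < 2:
        raise ValueError("unexpected injectRpcs layout")
    return rpcs[1][1].get("items", [])


if __name__ == "__main__":
    for row in extract_items(SAMPLE_HTML):
        print(row.get("name", ""), "-", row.get("description", ""))
```

To use it on the live page you would pass `requests.get(...).text` instead of `SAMPLE_HTML`. Keep in mind the regex still relies on the whole JSON object sitting on a single line, same as in the original snippet.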