Question
As a proof of concept, I'm trying to download a TV episode of Bob's Burgers from https://www.watchcartoononline.com/bobs-burgers-season-9-episode-3-tweentrepreneurs.
I cannot figure out how to extract the video URL from this website. Using the Chrome and Firefox developer tools, I found that the video sits in an iframe, but when I search for iframes with BeautifulSoup and extract their src URLs, I get links that have nothing to do with the video. Where are the references to the mp4 or flv files? I can see them in Developer Tools, although opening them directly is forbidden.
Any insight into how to scrape videos with BeautifulSoup and requests would be appreciated.
Here is some code if needed. A lot of tutorials say to look for 'a' tags, but I didn't get any 'a' tags back.
import requests
from bs4 import BeautifulSoup

r = requests.get("https://www.watchcartoononline.com/bobs-burgers-season-9-episode-5-live-and-let-fly")
soup = BeautifulSoup(r.content, 'html.parser')
links = soup.find_all('iframe')
for link in links:
    print(link['src'])
Answer 1:
import requests

url = "https://disk19.cizgifilmlerizle.com/cizgi/bobs.burgers.s09e03.mp4?st=_EEVz36ktZOv7ZxlTaXZfg&e=1541637622"

def download_file(url, filename):
    # stream=True fetches the body in chunks instead of loading it all into memory
    r = requests.get(url, stream=True)
    # Fail early on an HTTP error instead of saving an error page as the video
    r.raise_for_status()
    with open(filename, 'wb') as f:
        for chunk in r.iter_content(chunk_size=1024):
            if chunk:  # filter out keep-alive chunks
                f.write(chunk)
    return filename

download_file(url, "bobs.burgers.s09e03.mp4")
This code downloads this particular episode to your computer. The video URL is in the <source> tag nested inside the <video> tag.
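To show how that lookup could be automated, here is a minimal sketch that collects the src of every <source> tag nested inside a <video> tag. It uses the standard library's html.parser rather than BeautifulSoup so it runs without extra installs. The names VideoSourceParser, extract_video_sources and SAMPLE_HTML are made up for illustration, and the sample markup is an assumption about what the player page looks like, not a copy of the real site.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical markup standing in for the player page; the real page may differ.
SAMPLE_HTML = """
<html><body>
<iframe src="https://example.com/ads"></iframe>
<video id="player" controls>
  <source src="https://example.com/media/episode.mp4?st=TOKEN" type="video/mp4">
</video>
</body></html>
"""


class VideoSourceParser(HTMLParser):
    """Collect src attributes of <source> tags that sit inside a <video> tag."""

    def __init__(self, base_url=""):
        super().__init__()
        self.base_url = base_url
        self.video_depth = 0  # > 0 while the parser is inside a <video> element
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "video":
            self.video_depth += 1
        elif tag == "source" and self.video_depth > 0:
            src = dict(attrs).get("src")
            if src:
                # Resolve relative URLs against the page URL, if one was given
                self.sources.append(urljoin(self.base_url, src))

    def handle_endtag(self, tag):
        if tag == "video" and self.video_depth > 0:
            self.video_depth -= 1


def extract_video_sources(html, base_url=""):
    """Return the video URLs found in <video><source src=...> elements."""
    parser = VideoSourceParser(base_url)
    parser.feed(html)
    parser.close()
    return parser.sources


if __name__ == "__main__":
    for video_url in extract_video_sources(SAMPLE_HTML):
        print(video_url)
```

On a real page you would pass r.text from requests instead of SAMPLE_HTML. If the list comes back empty, the <video> tag is probably added by JavaScript or lives in the page that the iframe loads, so it is not in the HTML that requests receives. The st and e query parameters in the URL above also look like a signed, expiring token, so a link copied earlier may stop working.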
Source: https://stackoverflow.com/questions/53196594/web-scraping-videos