Question
I want to write Python code that downloads the 'main' image from pages that contain images.
My data (text files) contains URLs like these:
- http://t.co/fd9F0Gp1P1
points to a Facebook image
- http://t.co/0Ldy6j26fb
points to a Twitter image
However, their expanded URLs do not end in .jpg or .png. Instead, they lead to a page that contains the desired image.
How do I download images from these urls?
Answer 1:
Here is an example of how I downloaded the plane image from the Facebook page. You can adapt it to work for your Twitter page:
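from urllib.request import urlopen, urlretrieve

from bs4 import BeautifulSoup

# Fetch the HTML of the photo page (the expanded URL, not the t.co link).
html = urlopen('https://www.facebook.com/photo.php?fbid=10152055005350906').read()
bs = BeautifulSoup(html, 'html.parser')
# Find the main photo by its class attribute; the page markup may change over time.
imgUrl = bs.find('img', attrs={'class': 'fbPhotoImage img'}).get('src')
# Save the image to disk.
urlretrieve(imgUrl, "plane.jpg")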
EDIT
I decided to help with the Twitter one as well. Here is an example that downloads the image from the link you gave:
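from urllib.request import urlopen, urlretrieve

from bs4 import BeautifulSoup

# Fetch the HTML of the tweet's photo page.
html = urlopen('https://twitter.com/USABillOfRights/status/468852515409502210/photo/1').read()
bs = BeautifulSoup(html, 'html.parser')
# Find the embedded image by its alt text; the page markup may change over time.
imgUrl = bs.find('img', attrs={'alt': 'Embedded image permalink'}).get('src')
# Save the image to disk.
urlretrieve(imgUrl, "cnn.jpg")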
And here is the web reference for BeautifulSoup.
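Both examples follow the same pattern: fetch the page, find one img tag by its attributes, and save its src. Below is a rough sketch of that pattern as a reusable helper. It is not part of the original answer: the names ImgFinder, find_image_url and download_main_image are made up for illustration, and it uses the standard library's html.parser instead of BeautifulSoup so that it runs without extra packages. It also returns None instead of raising when no matching tag is found.

```python
from html.parser import HTMLParser
from urllib.request import urlopen, urlretrieve


class ImgFinder(HTMLParser):
    """Hypothetical helper: records the src of the first <img> matching `wanted`."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted  # attribute name -> exact value required
        self.src = None

    def handle_starttag(self, tag, attrs):
        # Only look at <img> tags, and stop after the first match.
        if tag != "img" or self.src is not None:
            return
        attrs = dict(attrs)
        if all(attrs.get(k) == v for k, v in self.wanted.items()):
            self.src = attrs.get("src")


def find_image_url(html, wanted):
    """Return the src of the first matching <img>, or None if there is none."""
    finder = ImgFinder(wanted)
    finder.feed(html)
    return finder.src


def download_main_image(page_url, wanted, filename):
    """Fetch page_url, then save the matching image; return False if none is found."""
    # urlopen follows HTTP redirects, so a short link that redirects
    # at the HTTP level ends up at the expanded page.
    with urlopen(page_url) as response:
        html = response.read().decode("utf-8", errors="replace")
    img_url = find_image_url(html, wanted)
    if img_url is None:
        return False
    urlretrieve(img_url, filename)
    return True
```

Under these assumptions, a call such as download_main_image('http://t.co/0Ldy6j26fb', {'alt': 'Embedded image permalink'}, 'cnn.jpg') would cover the Twitter case. Whether the short link redirects at the HTTP level, and whether the page still uses that alt text, is something to check against the live page.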
Source: https://stackoverflow.com/questions/24741430/getting-actual-facebook-and-twitter-image-urls-using-python