Question
I have this data structure:
<photo id="123" owner="12345" secret="xx" server="12" farm="4" title="109L_0195"
ispublic="1" isfriend="0" isfamily="0" views="0" tags="military czechrepublic kmk koně
humpolec všestrannost humpoec vysocinaregion" latitude="49.550933" longitude="15.36652"
accuracy="16" context="0" place_id="tg5cqdpWW7q18rE" woeid="790349" geo_is_family="0"
geo_is_friend="0" geo_is_contact="0" geo_is_public="1">
<description>
Kvalifikační kolo KMK - všestrannost 18.7.2014 - Humpolec
</description>
</photo>
<photo id="123" owner="06" secret="xx" server="12" farm="4"
title="Ytterligare en bild ifrån inspelningen av Johan Stjerquist's video: Nudist
Javisst." ispublic="1" isfriend="0" isfamily="0" views="0" tags="square squareformat
iphoneography instagramapp uploaded:by=instagram" latitude="56.171184"
longitude="14.741144" accuracy="16" context="0" place_id="u4MzsN9ZW7KnPWo"
woeid="898740" geo_is_family="0" geo_is_friend="0" geo_is_contact="0" geo_is_public="1">
<description/>
</photo>
It's a piece of information about a photo, accessed through the Flickr API. I want to extract the following information: id, title, tags, longitude, latitude, which I tried to accomplish with this:
url = "https://api.flickr.com/services/rest/?method=flickr.photos.search&api_key=5....b&per_page=250&accuracy=1&has_geo=1&extras=geo,tags,views,description"
soup = BeautifulSoup(urlopen(url))
for data in soup.find_all('photo'):
    print(data.attrs['id', 'title', 'tags', 'latitude', 'longitude', 'accuracy'])
That did not work: attrs accepts only one key at a time. Looking at the BeautifulSoup documentation (http://www.crummy.com/software/BeautifulSoup/bs4/doc/), I could not find another tool that would give me all of this information at once, or am I mistaken? I also tried replacing attrs with p, but that did not work either.
Any ideas which command I could use?
Answer 1:
Since attrs is a dictionary, you can keep only specific keys using a dictionary comprehension:
keys = {'id', 'title', 'tags', 'latitude', 'longitude'}
for photo in soup.find_all('photo'):
    print({key: value for key, value in photo.attrs.iteritems() if key in keys})
Note that on Python 3.x you should use items() instead of iteritems().
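For a version that runs on Python 3 without any third-party packages, here is a minimal sketch of the same filtering idea. It makes several assumptions: it parses with the standard library's xml.etree.ElementTree instead of BeautifulSoup, the sample is a trimmed copy of the data in the question wrapped in a hypothetical <photos> root element, and pick_attrs and WANTED are illustrative names, not part of any library.

```python
import xml.etree.ElementTree as ET

# Trimmed sample modeled on the question's data.
# The <photos> wrapper element is an assumption added so the snippet parses.
SAMPLE = """<photos>
<photo id="123" owner="12345" title="109L_0195" tags="military czechrepublic"
       latitude="49.550933" longitude="15.36652" accuracy="16"/>
<photo id="123" owner="06" title="Nudist Javisst." tags="square squareformat"
       latitude="56.171184" longitude="14.741144" accuracy="16"/>
</photos>"""

# Attribute names to keep (hypothetical constant for this sketch).
WANTED = ("id", "title", "tags", "latitude", "longitude")


def pick_attrs(attrs, keys=WANTED):
    """Return a new dict with only the requested keys; missing keys are skipped."""
    return {key: attrs[key] for key in keys if key in attrs}


root = ET.fromstring(SAMPLE)
# Element.attrib is a plain dict, so the same comprehension applies.
rows = [pick_attrs(photo.attrib) for photo in root.iter("photo")]

for row in rows:
    print(row)
```

With BeautifulSoup the helper should work the same way, since attrs is a dictionary there as well: call pick_attrs(photo.attrs) inside the loop over soup.find_all('photo').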
Source: https://stackoverflow.com/questions/24875004/beautiful-soup-parsing-xml