Extracting contents from specific meta tags that are not closed using BeautifulSoup

孤街浪徒 2020-12-28 09:34

I'm trying to parse out content from specific meta tags. Here's the structure of the meta tags. The first two are closed with a backslash, but the rest don't have any clo

6 answers

时光说笑 (original poster)

2020-12-28 10:29

The value of the name attribute is matched case-sensitively, so we need to look for both 'Description' and 'description'.

Case 1: 'Description' on Flipkart.com

Case 2: 'description' on Snapdeal.com

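from bs4 import BeautifulSoup
import requests

url = 'https://www.flipkart.com'
page3 = requests.get(url)
# Name the parser explicitly so the result does not depend on which parsers are installed
soup3 = BeautifulSoup(page3.text, 'html.parser')
# Attribute values are compared case-sensitively, so try both spellings
desc = soup3.find(attrs={'name': 'Description'})
if desc is None:
    desc = soup3.find(attrs={'name': 'description'})
try:
    print(desc['content'])
except (TypeError, KeyError) as e:
    # TypeError: no tag was found (desc is None); KeyError: the tag has no content attribute
    print('%s (%s)' % (e, type(e)))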
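
Instead of testing each spelling in turn, the lookup can probably be made case-insensitive in one step by passing a compiled regular expression as the attribute filter. The following is only a minimal sketch: the SAMPLE_HTML markup and the get_meta_description helper are made up for illustration and are not from the question, and the sample deliberately mixes one self-closed meta tag with unclosed ones.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical sample markup: the first meta tag is self-closed, the others are not.
SAMPLE_HTML = """
<html><head>
<meta name="keywords" content="shoes, shirts" />
<meta name="Description" content="Online shopping site">
<meta property="og:title" content="Example">
</head><body></body></html>
"""


def get_meta_description(html):
    """Return the content of the description meta tag, or None if it is absent."""
    soup = BeautifulSoup(html, "html.parser")
    # A compiled regex works as an attribute filter; re.I ignores letter case.
    tag = soup.find("meta", attrs={"name": re.compile(r"^description$", re.I)})
    # .get() returns None instead of raising if the tag has no content attribute.
    return tag.get("content") if tag is not None else None


print(get_meta_description(SAMPLE_HTML))
```

Because meta is a void element, the parser does not need a closing slash to know where the tag ends, so the unclosed tags in the sample are found the same way as the closed one.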
