Extracting contents from specific meta tags that are not closed using BeautifulSoup

孤街浪徒 2020-12-28 09:34

I'm trying to parse out content from specific meta tags. Here's the structure of the meta tags. The first two are closed with a backslash, but the rest don't have any clo

6 Answers
  •  时光说笑
    2020-12-28 10:29

    The value of the name attribute is matched case-sensitively, so we need to look for both 'Description' and 'description'.

    Case 1: 'Description' on Flipkart.com

    Case 2: 'description' on Snapdeal.com

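    from bs4 import BeautifulSoup
    import requests

    url = 'https://www.flipkart.com'
    page3 = requests.get(url)
    # Name the parser explicitly so results do not depend on what is installed.
    soup3 = BeautifulSoup(page3.text, 'html.parser')
    # Attribute values are matched case-sensitively, so try both spellings.
    desc = soup3.find(attrs={'name': 'Description'})
    if desc is None:
        desc = soup3.find(attrs={'name': 'description'})
    try:
        print(desc['content'])
    except (TypeError, KeyError) as e:
        # TypeError: no matching tag (desc is None); KeyError: no content attribute.
        print('%s (%s)' % (e, type(e)))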
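
Trying each spelling in turn can also be folded into a single lookup. The following is only a minimal sketch, assuming Python 3 and the built-in html.parser backend; SAMPLE_HTML and get_meta_description are hypothetical names made up for illustration, and the markup is invented rather than taken from either site. It passes a compiled, case-insensitive regular expression as the attribute filter, and it works on meta tags whether or not they are self-closed.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical sample markup: the first meta tag is self-closed, the second is not.
SAMPLE_HTML = """
<html><head>
<meta name="keywords" content="shoes, shirts" />
<meta name="Description" content="Online shopping site">
</head><body></body></html>
"""


def get_meta_description(html):
    """Return the content of the first description meta tag, or None."""
    soup = BeautifulSoup(html, "html.parser")
    # A compiled regex as the filter matches the name value in any letter case.
    pattern = re.compile(r"^description$", re.IGNORECASE)
    tag = soup.find("meta", attrs={"name": pattern})
    # .get() returns None if the tag exists but has no content attribute.
    return tag.get("content") if tag is not None else None


print(get_meta_description(SAMPLE_HTML))  # prints: Online shopping site
```

Returning None instead of raising keeps the caller's handling simple: one check covers both a missing tag and a missing content attribute.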
    
