Scrape Multiple URLs using Beautiful Soup

南笙 2021-02-02 02:56

I'm trying to extract elements with specific classes from multiple URLs. The tags and classes are the same on every page, but I need my Python program to scrape all of them when I just input my links.

Here'

2 Answers
  •  孤独总比滥情好
    2021-02-02 03:17

    Put the URLs in a list and iterate over it.

    from bs4 import BeautifulSoup
    import requests
    
    # Each URL needs a scheme (http:// or https://); without one, requests.get() raises MissingSchema.
    urls = ['https://www.website1.com', 'https://www.website2.com', 'https://www.website3.com']  # ..... (more URLs)
    # scrape elements
    for url in urls:
        response = requests.get(url)
        soup = BeautifulSoup(response.content, "html.parser")
    
        # print titles only
        h1 = soup.find("h1", class_="class-headline")
        if h1 is not None:  # find() returns None when no matching tag exists
            print(h1.get_text())
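
A slightly more defensive variant, offered only as a sketch: it keeps parsing separate from fetching so the parsing step can be checked on a plain HTML string, adds a request timeout, and skips URLs that fail instead of stopping. The names `extract_headline` and `scrape_headlines` and the 10-second timeout are assumptions made for this example; the `class-headline` class is taken from the snippet above.

```python
import requests
from bs4 import BeautifulSoup


def extract_headline(html, css_class="class-headline"):
    """Return the text of the first <h1> with the given class, or None if absent."""
    soup = BeautifulSoup(html, "html.parser")
    h1 = soup.find("h1", class_=css_class)
    # find() returns None when nothing matches, so guard before reading text.
    return h1.get_text(strip=True) if h1 is not None else None


def scrape_headlines(urls, timeout=10):
    """Map each URL to its headline; URLs whose request fails map to None."""
    results = {}
    for url in urls:
        try:
            response = requests.get(url, timeout=timeout)
            response.raise_for_status()  # treat 4xx/5xx responses as failures
        except requests.RequestException as exc:
            print(f"Skipping {url}: {exc}")
            results[url] = None
            continue
        results[url] = extract_headline(response.content)
    return results
```

Calling `scrape_headlines(urls)` with your own list returns a dictionary, which is easier to reuse later than printed output.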
    

    If you would rather prompt the user for each URL, it can be done this way:

    from bs4 import BeautifulSoup
    import requests
    
    # scrape elements
    msg = 'Enter URL (to exit, type q and hit enter): '
    url = input(msg)
    while url != 'q':
        response = requests.get(url)
        soup = BeautifulSoup(response.content, "html.parser")
    
        # print titles only
        h1 = soup.find("h1", class_="class-headline")
        if h1 is not None:  # skip pages without the headline
            print(h1.get_text())
        url = input(msg)  # read the next URL; without this assignment the loop never ends
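
Typed input is easy to get wrong: `requests.get` rejects a URL with no scheme, such as `www.website1.com`. One way to handle that, sketched here with the standard library only, is to normalize each entry before fetching it. `normalize_url` and `read_urls` are hypothetical helpers, and defaulting to `https` is an assumption.

```python
from urllib.parse import urlparse


def normalize_url(raw, default_scheme="https"):
    """Return a fetchable http(s) URL, or None if the input is unusable."""
    url = raw.strip()
    if not url:
        return None
    if "://" not in url:
        # Assumption: bare hosts such as www.example.com are meant as https.
        url = f"{default_scheme}://{url}"
    parsed = urlparse(url)
    # Require a web scheme and a host before passing the URL on.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return None
    return url


def read_urls(read_line=input, prompt="Enter URL (q to quit): "):
    """Yield normalized URLs until the user types 'q'."""
    while True:
        raw = read_line(prompt)
        if raw.strip().lower() == "q":
            return
        url = normalize_url(raw)
        if url is None:
            print(f"Skipping invalid URL: {raw!r}")
            continue
        yield url
```

The prompt loop then becomes `for url in read_urls(): ...`, with the fetching and parsing inside the loop body. Passing a different `read_line` callable lets the loop be exercised without a keyboard.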
    
