Scrape Multiple URLs using Beautiful Soup


南笙 2021-02-02 02:56

I'm trying to extract elements with specific classes from multiple URLs. The tags and classes stay the same across pages, but I need my Python program to scrape each page when I just supply its link.

Here'

2 answers

孤独总比滥情好 (original poster)

2021-02-02 03:17

Keep the URLs in a list and iterate over it.

from bs4 import BeautifulSoup
import requests

# Each URL needs a scheme (https://), otherwise requests raises MissingSchema.
urls = ['https://www.website1.com', 'https://www.website2.com', 'https://www.website3.com']  # ..... add the rest of your URLs

# Scrape the same element from every page
for url in urls:
    response = requests.get(url)
    soup = BeautifulSoup(response.content, "html.parser")

    # Print titles only; find() returns None when the page has no match
    h1 = soup.find("h1", class_="class-headline")
    if h1 is not None:
        print(h1.get_text())
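
The loop above stops at the first URL that fails to load. A minimal sketch of one way around that, assuming you want to skip failed pages and collect the headlines in a dict, is shown here. The function names `extract_headline` and `scrape_headlines`, the 10-second timeout, and the choice to store None for failures are illustrative assumptions, not part of the original answer.

```python
from typing import Optional

import requests
from bs4 import BeautifulSoup


def extract_headline(html: str, class_name: str = "class-headline") -> Optional[str]:
    """Return the text of the first <h1> with the given class, or None if absent."""
    soup = BeautifulSoup(html, "html.parser")
    h1 = soup.find("h1", class_=class_name)
    return h1.get_text(strip=True) if h1 is not None else None


def scrape_headlines(urls, timeout: float = 10.0) -> dict:
    """Map each URL to its headline; URLs that fail to load map to None."""
    results = {}
    for url in urls:
        try:
            response = requests.get(url, timeout=timeout)
            response.raise_for_status()  # treat 4xx/5xx responses as failures
        except requests.RequestException as exc:
            # Hypothetical policy: report the problem and move on to the next URL
            print(f"Skipping {url}: {exc}")
            results[url] = None
            continue
        results[url] = extract_headline(response.text)
    return results
```

Keeping the parsing in its own function means it can be tried on a plain HTML string without any network access, for example `extract_headline('<h1 class="class-headline">Hi</h1>')`.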

If you would rather prompt the user for each URL, it can be done this way:

from bs4 import BeautifulSoup
import requests

msg = 'Enter URL (type q and hit enter to exit): '
url = input(msg)
while url != 'q':
    response = requests.get(url)
    soup = BeautifulSoup(response.content, "html.parser")

    # Print titles only; find() returns None when the page has no match
    h1 = soup.find("h1", class_="class-headline")
    if h1 is not None:
        print(h1.get_text())

    # Read the next URL; without this assignment the loop never ends
    url = input(msg)
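
URLs typed at a prompt often lack a scheme (as in 'www.website1.com'), and requests rejects those. The following is a small sketch of cleaning the input before the request. The helper name `normalize_url` and the choice of https as the default scheme are assumptions made for illustration.

```python
import re

# Matches a leading scheme such as "http://" or "https://"
_SCHEME = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*://")


def normalize_url(raw: str, default_scheme: str = "https") -> str:
    """Strip surrounding whitespace and prepend a scheme if none was given."""
    url = raw.strip()
    if url and not _SCHEME.match(url):
        url = f"{default_scheme}://{url}"
    return url
```

In the prompt loop this would be applied as `url = normalize_url(input(msg))`, with the comparison against 'q' done before normalizing so the exit command is not turned into a URL.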
