Trying to scrape email address from website

不思量自难忘° 2021-01-23 13:02

I was trying to scrape this website:

www.united-church.ca/search/locator/all?keyw=&mission_units_ucc_ministry_type_advanced=10&locll=

I did scrape it using

2 Answers
  •  再見小時候
    2021-01-23 13:51

    Using Beautiful Soup

    A simple way to get the email addresses is to find each div with class='field-name-field-mu-email' and replace the obfuscated ' [at] ' in its text with '@' to restore a proper email format.

    For instance:

    import requests
    from bs4 import BeautifulSoup

    url = 'https://www.united-church.ca/search/locator/all?keyw=&mission_units_ucc_ministry_type_advanced=10&locll='

    # Download the page and parse the HTML
    r = requests.get(url)
    soup = BeautifulSoup(r.text, 'html.parser')

    # Each email sits in a <span> inside a div with this class
    for div in soup.find_all('div', class_='field-name-field-mu-email'):
        print(div.find('span').text.replace(' [at] ', '@'))
    
    Out[1]:
    alpcharge@sasktel.net
    guc-eug@bellnet.ca
    pioneerpastoralcharge@gmail.com
    acmeunitedchurch@gmail.com
    cmcphers@lakeheadu.ca
    mbm@kos.net
    tommaclaren@gmail.com
    agassizunited@shaw.ca
    buchurch@xplornet.com
    dmitchell008@yahoo.ca
    karen.charlie62@gmail.com
    trinityucbdn@westman.wave.ca
    gepc.ucc.mail@gmail.com
    monacampbell181@gmail.com
    herbklaehn@gmail.com
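If the spacing or capitalisation around "[at]" varies between entries, the plain replace above could miss some of them. Here is a minimal sketch of a more tolerant cleanup step, using only the standard library. The helper names deobfuscate_email and unique_emails are made up for illustration, and the sample strings are assumed inputs modelled on the output above, not text fetched from the site.

```python
import re

# Matches "[at]" with optional surrounding whitespace, in any letter case.
_AT_PATTERN = re.compile(r"\s*\[at\]\s*", re.IGNORECASE)


def deobfuscate_email(text):
    """Turn 'name [at] example.com' into 'name@example.com' (hypothetical helper)."""
    return _AT_PATTERN.sub("@", text.strip())


def unique_emails(raw_values):
    """Clean each value and drop duplicates, keeping first-seen order."""
    seen = set()
    result = []
    for raw in raw_values:
        email = deobfuscate_email(raw)
        if email and email not in seen:
            seen.add(email)
            result.append(email)
    return result


if __name__ == "__main__":
    # Assumed sample input in the same obfuscated style as the page.
    sample = [
        "alpcharge [at] sasktel.net",
        " guc-eug [AT] bellnet.ca ",
        "alpcharge [at] sasktel.net",
    ]
    print(unique_emails(sample))
```

In the loop above, you could collect div.find('span').text for each div into a list and pass that list to unique_emails instead of printing each value directly.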
    
    
