How to crawl a website and extract data into a database with Python?

心在旅途 2021-01-31 00:22

I'd like to build a webapp to help other students at my university create their schedules. To do that I need to crawl the master schedules (one huge HTML page) as well as a lin

4 answers
  •  轻奢々 (original poster)
    2021-01-31 01:09

    I liked using BeautifulSoup for extracting HTML data.

    It's as easy as this:

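    from urllib.request import urlopen

    from bs4 import BeautifulSoup  # Python 3: pip install beautifulsoup4 lxml

    # Fetch the RSS feed and parse it as XML (the "xml" parser needs lxml)
    with urlopen("http://pragprog.com/podcasts/feed.rss") as ur:
        soup = BeautifulSoup(ur.read(), "xml")
    items = soup.find_all("item")

    # Collect the enclosure URL of every feed item that has one
    urls = [item.enclosure["url"] for item in items if item.enclosure]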
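
Since the question also asks about getting the data into a database, here is a minimal sketch of one way to store a list of extracted URLs using the standard-library sqlite3 module. The table name `enclosures`, the helper `save_urls`, and the example.com URLs are all hypothetical placeholders, not part of the original answer; a real schedule app would likely need a richer schema.

```python
import sqlite3


def save_urls(conn, urls):
    """Store each URL once in a hypothetical `enclosures` table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS enclosures (url TEXT PRIMARY KEY)"
    )
    # INSERT OR IGNORE skips URLs that are already stored, so re-running
    # the crawler does not create duplicate rows.
    conn.executemany(
        "INSERT OR IGNORE INTO enclosures (url) VALUES (?)",
        [(u,) for u in urls],
    )
    conn.commit()


# Placeholder data standing in for the `urls` list built by the scraper.
sample_urls = [
    "http://example.com/a.mp3",
    "http://example.com/b.mp3",
    "http://example.com/a.mp3",  # duplicate, ignored on insert
]

conn = sqlite3.connect(":memory:")  # pass a file path instead to persist
save_urls(conn, sample_urls)
stored = [
    row[0] for row in conn.execute("SELECT url FROM enclosures ORDER BY url")
]
print(stored)
```

Using the URL as the primary key is just one simple way to make repeated crawls idempotent; whether that fits depends on what uniquely identifies a record in your data.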
    
