What's the proper syntax to follow a link using BeautifulSoup and requests in a Django app?

长情又很酷 2021-02-06 19:34

I asked a question that I don't think I was clear on. I have already successfully scraped posts from a site's home page. The next step is to follow the link from the post to its

1 answer
  •  广开言路
    2021-02-06 19:42

    This will work:

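    import requests
    from urllib.parse import urljoin
    from bs4 import BeautifulSoup

    # Assumption: the original relies on a `headers` dict defined elsewhere.
    headers = {'User-Agent': 'Mozilla/5.0'}

    def make_soup(url):
        # Fetch the page a post links to and return its <p> tags.
        the_comments_page = requests.get(url, headers=headers)
        soupdata = BeautifulSoup(the_comments_page.text, 'html5lib')
        comment = soupdata.find('div', {'class': 'article-body'})
        # Return an empty list when the page has no article body.
        return comment.find_all('p') if comment else []

    def sprinkle():
        url_two = 'http://www.vladtv.com'
        html = requests.get(url_two, headers=headers)
        soup = BeautifulSoup(html.text, 'html5lib')
        # Slice first so only the six kept posts trigger a second request.
        titles = soup.find_all('div', {'class': 'entry-pos-1'})[:6]

        # urljoin handles both relative and absolute href/src values.
        entries = [{'href': urljoin(url_two, div.a.get('href')),
                    'src': urljoin(url_two, div.a.img.get('data-original')),
                    'text': div.find('p', 'entry-title').text,
                    'comments': make_soup(urljoin(url_two, div.a.get('href')))
                    } for div in titles]

        return entries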
    

    However, with the way I solved it, the square brackets still show.
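
One possible reason for the square brackets, assuming they come from rendering the list that `find_all('p')` returns directly in the template: a list of tags prints as its repr, brackets and markup included. A minimal sketch of one way around that is to join the paragraph text into a single string before returning it. The helper name `paragraphs_to_text` is hypothetical and not part of the original answer.

```python
def paragraphs_to_text(paragraphs, separator="\n\n"):
    """Join the text of BeautifulSoup <p> tags into one plain string.

    `paragraphs` is whatever find_all('p') returned: a list-like of Tag
    objects. Rendering that list as-is shows its repr, brackets included.
    """
    # get_text(" ", strip=True) drops the markup, trims each text piece,
    # and puts a space between pieces so inline tags do not glue words.
    texts = (p.get_text(" ", strip=True) for p in paragraphs)
    # Skip empty paragraphs so the result has no blank entries.
    return separator.join(t for t in texts if t)
```

With this in place, `make_soup` could end with `return paragraphs_to_text(para)` instead of `return para`, so each entry's `'comments'` value is a plain string. Whether to join or to loop over the paragraphs in the template is a design choice; joining is simply the smaller change here.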
