What's the proper syntax to follow a link using BeautifulSoup and requests in a Django app?

I asked a question that I don't think I was clear on. I have already successfully scraped posts from a site's home page. The next step is to follow the link from the post to its

1 answer

广开言路

2021-02-06 19:42

This will work:

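import requests
from bs4 import BeautifulSoup

# Assumption: the original relies on a `headers` dict defined elsewhere;
# this User-Agent value is only a placeholder.
headers = {'User-Agent': 'Mozilla/5.0'}

def sprinkle():
    url_two = 'http://www.vladtv.com'
    html = requests.get(url_two, headers=headers)
    soup = BeautifulSoup(html.text, 'html5lib')
    # Each post on the home page sits in a div with this class.
    # Slice here so only the first six post pages are fetched.
    titles = soup.find_all('div', {'class': 'entry-pos-1'})[:6]

    def make_soup(url):
        # Follow the post's link and return the <p> tags of its article body.
        the_comments_page = requests.get(url, headers=headers)
        soupdata = BeautifulSoup(the_comments_page.text, 'html5lib')
        comment = soupdata.find('div', {'class': 'article-body'})
        if comment is None:
            return []  # the page has no article body
        para = comment.find_all('p')
        return para

    entries = [{'href': url_two + div.a.get('href'),
                'src': url_two + div.a.img.get('data-original'),
                'text': div.find('p', 'entry-title').text,
                'comments': make_soup(url_two + div.a.get('href'))
                } for div in titles]

    return entries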

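However, with the way I solved it the square brackets still show, most likely because make_soup returns a list of <p> tags and the list itself is what gets rendered.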
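One possible way to get rid of the brackets is to return a single string instead of the list of tags. The sketch below is not part of the answer above: the helper name paragraphs_to_text and the sample HTML are made up for illustration, and it uses the built-in 'html.parser' only so it runs without html5lib. It assumes the article text lives in a div with class article-body, as in the code above.

```python
from bs4 import BeautifulSoup


def paragraphs_to_text(article_html):
    """Return the paragraphs of the article body as one plain string."""
    soup = BeautifulSoup(article_html, 'html.parser')
    body = soup.find('div', {'class': 'article-body'})
    if body is None:
        return ''  # no article body on this page
    # get_text() drops the <p> markup; join() replaces the list with a string,
    # so nothing bracketed is left to render.
    return '\n\n'.join(p.get_text(strip=True) for p in body.find_all('p'))


# Hypothetical sample standing in for a fetched post page.
sample = '<div class="article-body"><p>First.</p><p>Second.</p></div>'
print(paragraphs_to_text(sample))
```

Inside make_soup the same idea would mean returning the joined string rather than para. If the paragraphs need to stay separate, looping over the list in the template should also avoid the brackets.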
