Webscraping Instagram follower count BeautifulSoup

暖寄归人 2021-01-18 21:12

I'm just starting to learn how to web scrape using BeautifulSoup and want to write a simple program that will get the follower count for a given Instagram

5 Answers
  •  生来不讨喜
    2021-01-18 21:59

    Here is my approach (the page's HTML source contains a JSON object that holds all of the profile data):

    import json
    import urllib.request
    from bs4 import BeautifulSoup

    # Placeholder: put the profile's account name in place of <username>
    myurl = 'https://www.instagram.com/<username>/'

    req = urllib.request.Request(myurl)
    req.add_header('User-Agent', 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.110 Safari/537.36')
    html = urllib.request.urlopen(req).read()
    response = BeautifulSoup(html, 'html.parser')

    # The first <script> in <body> holds "window._sharedData = {...};"
    script = response.select("body > script:nth-of-type(1)")[0].text
    # Strip only the assignment prefix and the trailing semicolon, so that
    # semicolons inside the JSON (e.g. in a biography) are left alone
    jsonObject = script.replace('window._sharedData =', '', 1).strip().rstrip(';')
    data = json.loads(jsonObject)
    user = data['entry_data']['ProfilePage'][0]['graphql']['user']
    following = user['edge_follow']['count']
    followed = user['edge_followed_by']['count']
    posts = user['edge_owner_to_timeline_media']['count']
    # Read the name from the user object; going through the first post fails when there are no posts
    username = user['username']
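
A minimal offline sketch of the same idea, in case it helps to test the parsing without hitting Instagram. It assumes the page still embeds `window._sharedData` the way the answer above describes, which may no longer hold since the markup changes often. `SAMPLE_HTML` and `extract_profile_counts` are hypothetical names, and the sample data is made up. Instead of the CSS selector and string replacement, it uses the standard library's `json.JSONDecoder.raw_decode`, which parses one JSON value starting at a given offset and ignores whatever follows it (such as the trailing `;`).

```python
import json

# Made-up page fragment that mimics the layout the answer relies on.
# Note the semicolon inside "biography" and the empty list of posts.
SAMPLE_HTML = """
<html><body>
<script type="text/javascript">window._sharedData = {"entry_data": {"ProfilePage": [{"graphql": {"user": {"username": "example_user", "biography": "coffee; code", "edge_follow": {"count": 10}, "edge_followed_by": {"count": 250}, "edge_owner_to_timeline_media": {"count": 0, "edges": []}}}}]}};</script>
</body></html>
"""


def extract_profile_counts(html):
    """Return username and counts from an embedded window._sharedData blob."""
    marker = "window._sharedData"
    start = html.find(marker)
    if start == -1:
        raise ValueError("window._sharedData not found; the page layout may differ")

    # The JSON object begins at the first "{" after the marker.
    brace = html.find("{", start)
    if brace == -1:
        raise ValueError("no JSON object found after window._sharedData")

    # raw_decode parses exactly one JSON value beginning at `brace`
    # and returns it together with the index where parsing stopped.
    data, _end = json.JSONDecoder().raw_decode(html, brace)

    user = data["entry_data"]["ProfilePage"][0]["graphql"]["user"]
    return {
        "username": user["username"],
        "following": user["edge_follow"]["count"],
        "followers": user["edge_followed_by"]["count"],
        "posts": user["edge_owner_to_timeline_media"]["count"],
    }


counts = extract_profile_counts(SAMPLE_HTML)
print(counts)
```

To use it on a live page, pass the decoded HTML (for example `html.decode('utf-8')` from the request above) to `extract_profile_counts`. If the marker is missing, the function raises `ValueError` rather than failing with an `IndexError` deep in the lookup.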
    
