How to use GitHub API to get a repository's dependents information in GitHub?

Submitted by 江枫思渺然 on 2020-12-02 07:12:58

Question


When using the GitHub API v4, I can easily get a repository's dependencies through repository.dependencyGraphManifests. But I can't find any way to get the dependents through API v4, even though I can see them under Insights -> Dependency Graph -> Dependents. Is there any way to get the dependents of a GitHub repository, whether through the GitHub API or something else?
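
For reference, the dependencies query mentioned above might be built roughly as follows. This is only a sketch: the field names (filename, packageName, requirements) and the preview Accept header reflect my understanding of the GraphQL schema at the time and should be checked against the current documentation. The helper name build_dependencies_request is hypothetical, and nothing is sent over the network here.

```python
import json

# Assumption: field names as exposed by GitHub's GraphQL schema; verify before use.
DEPENDENCIES_QUERY = """
query($owner: String!, $name: String!) {
  repository(owner: $owner, name: $name) {
    dependencyGraphManifests(first: 10) {
      nodes {
        filename
        dependencies(first: 50) {
          nodes { packageName requirements }
        }
      }
    }
  }
}
"""


def build_dependencies_request(owner, name, token):
    """Return (url, headers, body) for a GraphQL POST without sending it."""
    headers = {
        "Authorization": "bearer " + token,
        # Assumption: preview media type this field required; it may no longer be needed.
        "Accept": "application/vnd.github.hawkgirl-preview+json",
    }
    body = json.dumps({
        "query": DEPENDENCIES_QUERY,
        "variables": {"owner": owner, "name": name},
    })
    return "https://api.github.com/graphql", headers, body
```

The result can be passed to requests.post(url, headers=headers, data=body). There is no matching field for dependents in this query, which is the gap the question is about.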


Answer 1:


I don't think you can get the dependent projects through the GitHub API (REST or GraphQL). One workaround is to scrape the dependents page, as in the following Python script:

import requests
from bs4 import BeautifulSoup

repo = "expressjs/express"
page_num = 3  # maximum number of pages to fetch
url = 'https://github.com/{}/network/dependents'.format(repo)

for i in range(page_num):
    print("GET " + url)
    r = requests.get(url)
    soup = BeautifulSoup(r.content, "html.parser")

    # Each dependent is a "Box-row" holding an owner link and a repository link.
    data = [
        "{}/{}".format(
            t.find('a', {"data-repository-hovercards-enabled":""}).text,
            t.find('a', {"data-hovercard-type":"repository"}).text
        )
        for t in soup.find_all("div", {"class": "Box-row"})
    ]

    print(data)
    print(len(data))
    # Follow the first pagination link; stop if the container or link is missing.
    # Note: the first link is only "Next" on page 1 (see Answer 2).
    container = soup.find("div", {"class":"paginate-container"})
    paginationContainer = container.find('a') if container else None
    if paginationContainer:
        url = paginationContainer["href"]
    else:
        break

Answer 2:


Based on Bertrand Martel's answer (@bertrand-martel), add the following code so that you do not get stuck between the 1st and 2nd pages. Otherwise the script goes forward and then backward: the first page has only one <a> tag, whereas the next page has two, so the script picks the first one ("Previous") and returns to the previous page.

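Code:

...
    # Collect every link in the pagination container instead of only the first.
    paginationContainer = soup.find("div", {"class":"paginate-container"}).find_all('a')
    if len(paginationContainer) > 1:
        # Two links: "Previous" comes first, "Next" second.
        paginationContainer = paginationContainer[1]
    elif paginationContainer:
        # One link: take the only one available.
        paginationContainer = paginationContainer[0]
    else:
        # No links at all: let the following check end the loop.
        paginationContainer = None
...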
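
Picking the link by position can still go backward if a page carries a single link that happens to be "Previous", which may be the case on the last page. A possible alternative is to select the link by its label instead. The sketch below assumes the forward link's text is "Next"; that label and the helper name pick_next_href are assumptions, not something confirmed by the page markup.

```python
def pick_next_href(links):
    """Return the href of the link labeled "Next", or None if there is none.

    `links` is a list of (text, href) pairs taken from the pagination container.
    """
    for text, href in links:
        # Assumption: the forward link is labeled "Next" (case-insensitive).
        if text.strip().lower() == "next" and href:
            return href
    return None
```

With BeautifulSoup, the pairs could be built as [(a.get_text(), a.get("href")) for a in container.find_all("a")], where container is the paginate-container div, and the loop would stop when the function returns None.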


Source: https://stackoverflow.com/questions/58734176/how-to-use-github-api-to-get-a-repositorys-dependents-information-in-github
