I am developing a web application which needs to send a lot of HTTP requests to GitHub. After some number of successful requests, I get HTTP 403: Forbidden
with a message about the API rate limit being exceeded.
I have observed this error during multibranch pipeline configuration in Jenkins.
I had selected the source as GitHub. After changing it to Git and passing the GitHub repo details, it worked. (I have the Git executable path configured in Jenkins and a credential set for authentication to GitHub.)
In order to increase the API rate limit you can either
authenticate yourself at GitHub via your OAuth2 token, or
use a key/secret to increase the unauthenticated rate limit.
There are multiple ways of doing this (a Python sketch that combines token authentication with a rate-limit check follows the curl examples):
Basic Auth + OAuth2Token
curl -u <token>:x-oauth-basic https://api.github.com/user
Set and Send OAuth2Token in Header
curl -H "Authorization: token OAUTH-TOKEN" https://api.github.com
Set and Send OAuth2Token as URL Parameter
curl https://api.github.com/?access_token=OAUTH-TOKEN
Set Key & Secret for Server-2-Server communication
curl 'https://api.github.com/users/whatever?client_id=xxxx&client_secret=yyyy'
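The same token-in-header approach (which is the variant GitHub currently recommends) works from application code as well. Here is a minimal Python sketch, assuming the requests library is available; the token value and the /user endpoint are just placeholders. It sends the Authorization header and prints GitHub's X-RateLimit-* response headers so you can see how much of your quota is left:
import requests

GITHUB_TOKEN = "YOUR_ACCESS_TOKEN"  # placeholder: a Personal Access Token or OAuth2 token

def github_get(url):
    """GET a GitHub API URL with token auth and report the remaining quota."""
    response = requests.get(
        url,
        headers={
            "Authorization": f"token {GITHUB_TOKEN}",
            "Accept": "application/vnd.github+json",
        },
    )
    # GitHub reports the current quota in these response headers.
    print("rate limit:",
          response.headers.get("X-RateLimit-Remaining"), "of",
          response.headers.get("X-RateLimit-Limit"), "requests remaining")
    response.raise_for_status()
    return response.json()

print(github_get("https://api.github.com/user")["login"])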
Just create a new "Personal Access Token" here and use a simple fetch call (if you are coding in JS, of course :D), replacing YOUR_ACCESS_TOKEN with your token.
The best way to test it is to use Postman.
async function fetchGH() {
  const response = await fetch('https://api.github.com/repos/facebook/react/issues', {
    headers: {
      'Authorization': 'token YOUR_ACCESS_TOKEN',
    },
  })
  return await response.json()
}
Solution: Add authentication details or the client ID and secret (generated when you register your application on GitHub).
Found details here and here
"If you need to make unauthenticated calls but need to use a higher rate limit associated with your OAuth application, you can send over your client ID and secret in the query string"
This is only a partial solution, because the limit is still 5,000 API calls per hour, i.e. roughly 80 calls per minute, which is really not that much.
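If you have to live within that limit, one pragmatic option is to let the client pace itself. Below is a minimal Python sketch, again assuming the requests library and a placeholder token; it sleeps until GitHub's advertised reset time whenever the quota is exhausted:
import time
import requests

session = requests.Session()
session.headers["Authorization"] = "token YOUR_ACCESS_TOKEN"  # placeholder token

def get_with_backoff(url):
    """GET url; if the quota is exhausted, sleep until GitHub's reset time and retry."""
    while True:
        response = session.get(url)
        if (response.status_code == 403
                and response.headers.get("X-RateLimit-Remaining") == "0"):
            # X-RateLimit-Reset is a Unix timestamp marking the start of the next window.
            reset_at = int(response.headers["X-RateLimit-Reset"])
            time.sleep(max(reset_at - time.time(), 0) + 1)
            continue
        response.raise_for_status()
        return response.json()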
I am writing a tool to compare over 350 repositories in an organization and to find their correlations. The tool uses Python for Git/GitHub access, but I think that is not the relevant point here.
After some initial success, I found out that the capabilities of the GitHub API are too limited in the number of calls and also in bandwidth, if you really want to ask the repos a lot of deep questions.
Therefore, I switched the concept, using a different approach:
Instead of doing everything with the GitHub API, I wrote a parallel Python mirror script that is able to mirror all of those repos in less than 15 minutes via pygit2.
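In simplified form, the core idea looks roughly like the following sketch (not my actual script): the organization name, token, and worker count are placeholders, and it assumes pygit2 and requests are installed and that the clone URLs are reachable with the given credentials.
import concurrent.futures
import pathlib
import pygit2
import requests

ORG = "my-org"                         # placeholder organization name
MIRROR_DIR = pathlib.Path("mirrors")   # where the bare clones go

def list_repo_urls(org):
    """Page through the GitHub API and collect clone URLs for all repos of an org."""
    urls, page = [], 1
    while True:
        resp = requests.get(
            f"https://api.github.com/orgs/{org}/repos",
            params={"per_page": 100, "page": page},
            headers={"Authorization": "token YOUR_ACCESS_TOKEN"},  # placeholder token
        )
        resp.raise_for_status()
        batch = resp.json()
        if not batch:
            return urls
        urls.extend(repo["clone_url"] for repo in batch)
        page += 1

def mirror(url):
    """Create (or skip) a bare local clone of one repository."""
    target = MIRROR_DIR / url.rstrip("/").split("/")[-1]
    if not target.exists():
        pygit2.clone_repository(url, str(target), bare=True)
    return target

MIRROR_DIR.mkdir(exist_ok=True)
with concurrent.futures.ThreadPoolExecutor(max_workers=8) as pool:
    list(pool.map(mirror, list_repo_urls(ORG)))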
Then, I wrote everything possible using the local repositories and pygit2. This solution became faster by a factor of 100 or more, because there was neither an API nor a bandwidth bottleneck.
Of course, this did cost extra effort, because the pygit2 API is quite a bit different from github3.py, which I preferred for the GitHub API part.
And that is actually my conclusion/advice: The most efficient way to work with lots of Git data is:
clone all repos you are interested in, locally
write everything possible using pygit2, locally (a small example follows below)
write other things, like public/private info, pull requests, access to wiki pages, issues, etc., using the github3.py API or whatever you prefer.
This way, you can maximize your throughput, while your limitation is now the quality of your program (which is also non-trivial).
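As a taste of the local analysis step, here is a tiny pygit2 example of the kind of question that becomes cheap once the repos are mirrored; the repository path is a placeholder for one of the bare clones created above, and counting commits per author is just an illustrative query.
import collections
import pygit2

repo = pygit2.Repository("mirrors/some-repo.git")  # placeholder path to a mirrored repo

# Walk the full history from HEAD, newest first, without any network traffic.
commits_per_author = collections.Counter(
    commit.author.name
    for commit in repo.walk(repo.head.target, pygit2.GIT_SORT_TIME)
)

for author, count in commits_per_author.most_common(10):
    print(f"{count:6d}  {author}")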