I would like to put a Git project on GitHub, but it contains certain files with sensitive data (usernames and passwords, like /config/deploy.rb for Capistrano).
I kno
If you pushed to GitHub, force pushing is not enough: delete the repository or contact support
Even if you force push one second afterwards, it is not enough as explained below.
The only valid courses of action are:
Is what leaked a changeable credential like a password?
  Yes: change your passwords immediately, and consider using more OAuth and API keys!
  No (e.g. naked pics): do you care if all issues in the repository get nuked?
    No: delete the repository.
    Yes: contact GitHub support and ask them to delete the dangling commits.
Force pushing a second later is not enough because:
GitHub keeps dangling commits for a long time.
GitHub staff does have the power to delete such dangling commits if you contact them however.
I experienced this first hand: when I uploaded all GitHub commit emails to a repo, they asked me to take it down, so I did, and they did a gc. Pull requests that contain the data have to be deleted however: that repo data remained accessible up to one year after initial takedown due to this.
Dangling commits can be seen either through the commit web UI or through the API.
One convenient way to get the source at that commit then is to use the download zip method, which can accept any reference, e.g.: https://github.com/cirosantilli/myrepo/archive/SHA.zip
It is possible to fetch the missing SHAs, for example by listing API events with "type": "PushEvent". E.g. mine: https://api.github.com/users/cirosantilli/events/public (Wayback machine)
There are scrapers like http://ghtorrent.org/ and https://www.githubarchive.org/ that regularly poll GitHub data and store it elsewhere.
I could not find out whether they scrape the actual commit diffs. That is unlikely because there would be too much data, but it is technically possible, and the NSA and friends likely have filters to archive only stuff linked to people or commits of interest.
If you delete the repository instead of just force pushing however, commits do disappear even from the API immediately and give 404, e.g. https://api.github.com/repos/cirosantilli/test-dangling-delete/commits/8c08448b5fbf0f891696819f3b2b2d653f7a3824 This works even if you recreate another repository with the same name.
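As a rough sketch of how to check whether a given commit is still served, the snippet below builds the commit API URL in the format shown above and defines a helper that prints only the HTTP status code. The function names commit_api_url and http_status are made up for illustration, and the network call is left commented out since it needs curl and network access, and unauthenticated API requests are rate limited.

```shell
# Hypothetical helper: build the commit API URL (/repos/OWNER/REPO/commits/SHA).
commit_api_url() {
  printf 'https://api.github.com/repos/%s/%s/commits/%s\n' "$1" "$2" "$3"
}

# Hypothetical helper: print only the HTTP status code for a URL.
# Needs curl and network access.
http_status() {
  curl -s -o /dev/null -w '%{http_code}' "$1"
}

url=$(commit_api_url cirosantilli test-dangling-delete 8c08448b5fbf0f891696819f3b2b2d653f7a3824)
echo "$url"

# Uncomment to query GitHub; a 404 means the commit is no longer served:
# http_status "$url"
```

The same check can be repeated after a force push and after a repository deletion to compare the two behaviors.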
To test this out, I have created a repo: https://github.com/cirosantilli/test-dangling and did:
git init
git remote add origin git@github.com:cirosantilli/test-dangling.git
touch a
git add .
git commit -m 0
git push -u origin HEAD
touch b
git add .
git commit -m 1
git push
touch c
git rm b
git add .
git commit --amend --no-edit
git push -f
See also: How to remove a dangling commit from GitHub?
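The same sequence of commands can be tried offline to see why a force push alone does not remove anything. This is only a sketch: a local bare repository stands in for GitHub (an assumption, GitHub's own retention happens on their servers), and the temporary paths and the user name and email are placeholders. After the amend and force push, the old commit is no longer on any branch, but its object is still stored on the "remote".

```shell
set -eu
work=$(mktemp -d)

# A local bare repository stands in for GitHub (assumption for this sketch).
git init -q --bare "$work/remote.git"
git init -q "$work/local"
cd "$work/local"
git config user.email "you@example.com"   # placeholder identity
git config user.name "Example"
git remote add origin "$work/remote.git"

touch a
git add .
git commit -q -m 0
git push -q -u origin HEAD

touch b                                   # b plays the role of the sensitive file
git add .
git commit -q -m 1
git push -q
leaked=$(git rev-parse HEAD)              # SHA of the commit that contains b

touch c
git rm -q b
git add .
git commit -q --amend --no-edit
git push -q -f

# The old commit is now dangling, but its object still exists on the remote.
if git --git-dir="$work/remote.git" cat-file -e "$leaked^{commit}"; then
  still_there=yes
else
  still_there=no
fi
echo "leaked commit still stored on remote: $still_there"
```

On a plain Git remote such objects only go away once garbage collection prunes them, which is why the gc on GitHub's side matters.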
git filter-repo is now officially recommended over git filter-branch
This is mentioned in the manpage of git filter-branch itself, since Git 2.24.
With git filter-repo, you could either remove certain files (see also: Remove folder and its contents from git/GitHub's history) with:
pip install git-filter-repo
git filter-repo --path path/to/remove1 --path path/to/remove2 --invert-paths
This automatically removes empty commits.
Or you can replace certain strings (see also: How to replace a string in a whole Git history?) with:
git filter-repo --replace-text <(echo 'my_password==>xxxxxxxx')
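After a rewrite it can be worth checking that the string is really gone from every ref. The sketch below uses Git's pickaxe search (git log -S) for that; the helper name commits_touching, the throwaway repository and the file contents are made up for illustration. It shows the "before" situation: scrubbing the password in the latest commit only still leaves it in history.

```shell
# Hypothetical helper: list commits on any ref whose diff adds or removes the string.
commits_touching() {
  git -C "$1" log --all --oneline -S"$2"
}

# Throwaway repository with a password committed and then edited out.
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" config user.email "you@example.com"   # placeholder identity
git -C "$repo" config user.name "Example"

printf 'password = my_password\n' > "$repo/deploy.rb"
git -C "$repo" add .
git -C "$repo" commit -q -m "add config"

printf 'password = xxxxxxxx\n' > "$repo/deploy.rb"
git -C "$repo" commit -q -a -m "scrub config in the latest commit only"

# Count the commits whose diff still involves the string.
count=$(commits_touching "$repo" my_password | wc -l | tr -d ' ')
echo "commits still touching the string: $count"
```

Once the history has been rewritten with --replace-text, the same search on the rewritten repository should come back empty.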