I have a project versioned with Git that I\'d like to make open source, but it has some private information in it that is specific to the environment in which it was originally
I'd recommend using the BFG Repo-Cleaner, a simpler, faster alternative to git-filter-branch
specifically designed for removing private data from Git repos.
The usage instructions give the steps in more detail, but the core bit is just: download the BFG's jar (needs Java 6 or above) and run this command:
$ java -jar bfg.jar --replace-text replacements.txt my-repo.git
The replacements.txt
file should contain all the substitutions you want to do, in a format like this (one entry per line - note the comments shouldn't be included):
PASSWORD1 # Replace literal string 'PASSWORD1' with '***REMOVED***' (default)
PASSWORD2==>examplePass # replace with 'examplePass' instead
PASSWORD3==> # replace with the empty string
regex:password=\w+==>password= # Replace, using a regex
Your entire repository history will be scanned, and all non-binary files (under 1MB in size) will have the substitutions performed: any matching string (that isn't in your latest commit) will be replaced.
Full disclosure: I'm the author of the BFG Repo-Cleaner.
I wrote a script for this a little while ago. You can find it here: https://gist.github.com/dound/76ea685c05c4a7895247457eb676fe69
(original writeup viewable from archive.org: https://web.archive.org/web/20160208235904/http://dound.com:80/2009/04/git-forever-remove-files-or-folders-from-history/)
The script builds on the git-filter-branch tool which comes with git. If you're curious, you can read more about removing files from a git repo here, but using the script from the link above should be easy and all you really need to accomplish removing that private information.