How can I list all versions of all files in a git repository?
(For example for listing all files that ever contained a certain string)
This list could be use
As I understand it from the manual, the following lists all objects and their info:
git cat-file --batch-all-objects --batch-check
First of all, there's very little chance you want to do this by listing blobs. A blob is just raw data; it doesn't know what file it's part of. The true answer depends a little bit on what exactly you're trying to accomplish. For example, do you need to search blobs that belong to commits which aren't even reachable from the commit history? If you don't, here are a couple of thoughts.
Perhaps the pickaxe search of git-log would do what you want:
-S<string>
Look for differences that introduce or remove an instance of <string>. Note that this is different than the string simply appearing in diff output; see the pickaxe entry in gitdiffcore(7) for more details.
Depending on your end goal, this might be way better than what you suggested - you'll actually see how the string was added or removed. You can of course use the information you get to cat the entire file, if you so desire.
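As a rough sketch of the pickaxe approach (the function name commits_changing_string is made up for illustration, and --all is my assumption that you want every ref searched rather than just the current branch):

```shell
# Hypothetical helper: list commits on any ref whose diff adds or removes
# an occurrence of the given string, plus the paths where that happened.
# With -S, git log only shows the files that matched, so --name-only
# prints just the affected paths under each commit.
commits_changing_string() {
    git log --all --oneline --name-only -S"$1"
}

# Usage (run inside a repository):
#   commits_changing_string 'needle'
```

From the commit IDs and paths this prints you can then look at a specific version with git show <commit>:<path>.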
Or maybe you want to list revisions with git-log and use git-grep on the trees (commits) it provides?
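One way that could look, as a sketch only: it uses git rev-list instead of git-log to get bare commit IDs, treats the search term as a fixed string, and the function name paths_ever_containing is invented for this example.

```shell
# Hypothetical helper: grep every commit reachable from any ref for a fixed
# string and print each path that contained it in at least one commit.
# git grep -l with revisions prints lines of the form <commit>:<path>,
# so cut strips the commit ID and sort -u removes the repeats.
paths_ever_containing() {
    git rev-list --all |
        xargs git grep -l -F -e "$1" |
        cut -d: -f2- |
        sort -u
}

# Usage (run inside a repository):
#   paths_ever_containing 'needle'
```

This only sees commits reachable from some ref, and since git grep works line by line it will not find a string that spans a line break. On a big history it can be slow, because the same unchanged file gets searched once per commit.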
This is how I get a list of SHAs and filenames for all the blobs in a repository:
$ git rev-list --objects --all | git cat-file --batch-check='%(objectname) %(objecttype) %(rest)' | grep '^[^ ]* blob' | cut -d" " -f1,3-
Notes:
The %(rest) atom in the format string appends the rest of the input line after the object's SHA to the output. In this case, this rest happens to be the path name (for tree and blob objects).
The grep pattern is intended to match only actual blobs, not tree objects which just happen to have the string blob somewhere in their path name.
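To get from that list to "all files that ever contained a certain string", one option is to feed each blob through grep. This is a sketch under a few assumptions: the search term is a fixed string, path names contain no newlines, and the function name blobs_containing is made up here.

```shell
# Hypothetical helper: print "<sha> <path>" for every reachable blob whose
# content contains the given fixed string.
# The first four stages are the blob-listing pipeline from above; the loop
# then dumps each blob with git cat-file and checks it with grep.
blobs_containing() {
    git rev-list --objects --all |
        git cat-file --batch-check='%(objectname) %(objecttype) %(rest)' |
        grep '^[^ ]* blob' |
        cut -d" " -f1,3- |
        while read -r sha path; do
            if git cat-file blob "$sha" | grep -q -F -e "$1"; then
                printf '%s %s\n' "$sha" "$path"
            fi
        done
}

# Usage (run inside a repository):
#   blobs_containing 'needle'
```

Each distinct blob is checked only once, which avoids re-searching unchanged files. As far as I know, git rev-list --objects reports every object once, together with the first path it was found under, so a blob that existed under several names shows up with just one of them.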