Say I have a file "HelloWorld.pm" in multiple subdirectories within a Git repository.
I would like to issue a command to find the full paths of all those files.
Try:
git ls-tree -r HEAD | grep HelloWorld.pm
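A variant worth knowing: git ls-tree also accepts --name-only, which prints just the paths (without the mode, type, and hash columns), so grep can only ever match file names. A minimal sketch, using a throwaway repository for demonstration (the file layout here is made up to mirror the question):

```shell
# Build a disposable repo with HelloWorld.pm in two subdirectories.
tmp=$(mktemp -d)
cd "$tmp"
git init -q
mkdir -p lib/deep
echo 'package HelloWorld;' > lib/HelloWorld.pm
echo 'package HelloWorld;' > lib/deep/HelloWorld.pm
git add .
git -c user.email=demo@example.com -c user.name=demo commit -qm init

# --name-only restricts the output to paths, so grep matches only names.
git ls-tree -r --name-only HEAD | grep HelloWorld.pm
```

This prints one path per matching file, e.g. lib/HelloWorld.pm and lib/deep/HelloWorld.pm.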
The script by Uwe Geuder (@uwe-geuder) is great, but there really is no need to dump each ls-tree output, unfiltered, into its own file.
It is much faster, and uses less storage, to run the grep on the output and store only the result, as shown in this gist.
Hmm, the original question was about the repository. A repository contains more than one commit (in the general case, at least), but the answers given so far search only a single commit.
Because I could not find an answer that really searches the whole commit history, I wrote a quick brute-force script, git-find-by-name, that takes (nearly) all commits into consideration.
#! /bin/sh
tmpdir=$(mktemp -td git-find.XXXX)
trap 'rm -r "$tmpdir"' EXIT INT TERM
allrevs=$(git rev-list --all)
# Well, nearly all revs: we could still check the log if we have
# dangling commits, and we could include the index to be perfect...
for rev in $allrevs
do
    # Dump each revision's full tree listing into its own file.
    git ls-tree --full-tree -r "$rev" >"$tmpdir/$rev"
done
cd "$tmpdir" || exit
grep "$1" *
Maybe there is a more elegant way.
Please note that the parameter is passed to grep as-is, so it will also match partial file names. If that is not desired, anchor your search expression and/or add suitable grep options.
For deep histories the output might be too noisy. I thought about a script that converts a list of revisions into a range, i.e. the opposite of what git rev-list does, but so far it has remained a thought.
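For example, anchoring the pattern makes it match only exact file names rather than substrings (the sample paths below are made up for illustration):

```shell
# "(^|/)HelloWorld\.pm$" matches the exact file name at the end of a
# path; a plain "HelloWorld.pm" would also match MyHelloWorld.pm2.
printf '%s\n' 'lib/HelloWorld.pm' 'lib/MyHelloWorld.pm2' |
  grep -E '(^|/)HelloWorld\.pm$'
```

This prints only lib/HelloWorld.pm.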
[It's a bit of comment abuse, I admit, but I can't comment yet and thought I would improve @uwe-geuder's answer.]
#!/bin/bash
#
# Find a file name anywhere in a repository's commit history.
#
# I'm using a fixed string here, not a regular expression, but you can easily
# use a regular expression by altering the call to grep below.
name="$1"

# Verify usage.
if [[ -z "$name" ]]
then
    echo "Usage: $(basename "$0") <file name>" 1>&2
    exit 100
fi

# Search all revisions; get unique results.
while IFS= read -r rev
do
    # Find $name in $rev's tree and keep only the path column.
    # ls-tree separates the path from the other columns with a tab,
    # so splitting on the tab keeps paths containing spaces intact.
    grep -F -- "$name" \
        <(git ls-tree --full-tree -r "$rev" | awk -F'\t' '{ print $2 }')
done < \
    <(git rev-list --all) \
    | sort -u
Again, +1 to @uwe-geuder for a great answer.
If you're interested in the Bash itself:
Unless you're guaranteed safe word-splitting in a for loop (as when iterating over an array: for item in "${array[@]}"), I highly recommend using while IFS= read -r var ; do ... ; done < <(command) whenever the command output you're looping over is separated by newlines (or read -d '' when the output is separated by the null byte $'\0'). While git rev-list --all is guaranteed to produce 40-character hexadecimal strings (without spaces), I never like to take chances, and I can now easily swap git rev-list --all for any other command that produces lines.
I also recommend using built-in Bash mechanisms to inject input and filter output instead of temporary files.
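A minimal sketch of that read loop, with made-up file names standing in for real command output (requires bash for the < <(...) process substitution):

```shell
# IFS= prevents trimming of leading/trailing whitespace and -r stops
# backslash interpretation, so each line reaches $line untouched.
while IFS= read -r line
do
  printf 'got: %s\n' "$line"
done < <(printf '%s\n' 'a file with spaces.pm' 'HelloWorld.pm')
```

Unlike command | while ..., the process substitution keeps the loop in the current shell, so variables set inside it survive after the loop ends.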
git ls-files will give you a listing of all files in the current state of the repository (the cache, or index). You can pass a pattern in to get only the files matching that pattern.
git ls-files HelloWorld.pm '**/HelloWorld.pm'
If you would like to find a set of files and grep through their contents, you can do that with git grep:
git grep some-string -- HelloWorld.pm '**/HelloWorld.pm'
git ls-files | grep -i HelloWorld.pm
The -i flag makes grep case-insensitive.