Calling git in pre-commit hook

Question

I am getting weird results from my git pre-commit hook. For example, git diff --name-only seems to give a different result when I run it in the terminal than when it is executed in .git/hooks/pre-commit.

So my questions are:

1. Am I allowed to call git inside git hooks?
2. If 1. is OK: when exactly is the pre-commit hook called if I do git commit -am"bla"? In particular, does git do the staging first and then call the pre-commit hook, or not?

I ask because I tried the following 2 or 3 times: I modify a file and run the script manually, and it prints:

#! /bin/sh -xv
files=$(git diff --name-only)
+ git diff --name-only
+ files=path/to/file.h
echo $files
+ echo path/to/file.h
path/to/file.h
...

When I do git commit -am"eh", the output is different:

#! /bin/sh -xv
files=$(git diff --name-only)
+ git diff --name-only
+ files=
echo $files
+ echo

Answer 1:

Am I allowed to call git inside git hooks?

Yes, but you must exercise caution, as there are a number of things set in the environment and you're working with something that is in the middle of being done:

GIT_DIR is set to the path to the Git directory.
GIT_WORK_TREE may be set to the path to the work-tree (from git --work-tree).
Other Git variables, such as GIT_NO_REPLACE_OBJECTS, may be set from the command line as well.

(You should leave these set if you're continuing to work with the current repository, but clear them out if you're working with a different repository.)
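
As a rough sketch of the "clear them out" case: a hook that needs to run Git against some other repository could unset the repository-local variables in a subshell first. The helper name git_in_other_repo is hypothetical; git rev-parse --local-env-vars is the stock command that lists the variable names involved.

```shell
# Hypothetical helper: run a Git command against a *different* repository
# from inside a hook, without inheriting this repository's GIT_* settings.
git_in_other_repo() {
    other_repo=$1
    shift
    (
        # --local-env-vars prints the names of the repository-local
        # variables (GIT_DIR, GIT_WORK_TREE, GIT_INDEX_FILE, ...).
        # The expansion is deliberately unquoted so that each name
        # becomes its own argument to unset.
        unset $(git rev-parse --local-env-vars)
        git -C "$other_repo" "$@"
    )
}
```

Because the unset happens inside a subshell, the hook's own environment is untouched, so later Git commands in the hook still operate on the repository being committed to.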

If 1. is OK: when exactly is the pre-commit hook called if I do git commit -am"bla"? In particular, does git do the staging first and then call the pre-commit hook, or not?

This is complicated.

There are three "modes" that git commit uses internally. (There are no promises about this, but that's how things have been implemented for many years now so this three-modes thing seems pretty stable.) The modes are:

1. git commit without -a, --include, --only, or any command-line-specified file names. This is the default or normal mode. The underlying implementation details do not show through.
2. git commit with -a or with command-line-specified file names. This divides into two sub-modes:
   - such a commit with --include (git commit -a behaves this way too), or
   - such a commit with --only (the default when file names are given without either option).
   At this point, the underlying implementation shows through.

The underlying implementation details here involve the thing that Git calls, variously, the index, the staging area, and (rarely now) the cache, which is normally implemented as a file named $GIT_DIR/index (where $GIT_DIR is the environment variable from the note about point 1). Normally, there is only one of these: the index. It holds the content that you intend to commit.¹ When you run git commit, Git will package up whatever is in the index as the next commit.

But, during the operation of git commit, there may be up to three index files. For the normal git commit there's just the one index, and your pre-commit hook can use it and can even update it. (I advise against updating it, for reasons we'll see in a moment.)

But, if you do a git commit -a, or git commit --include file.ext, now there are two index files. There's the content that's ready to be committed—the regular index—and one extra index, which is the original index plus the result of doing a git add on file.ext or on all files (the equivalent of git add -u). So now there are two index files.

In this mode, Git leaves the regular index file as the regular index file. This file is in $GIT_DIR/index as usual. The second index file, with the extra added stuff, is in $GIT_DIR/index.lock and the environment variable GIT_INDEX_FILE is set to hold that path. If the commit fails, Git will remove the index.lock file and everything will be as if you had not run git commit at all. If the commit succeeds, Git will rename index.lock to index, releasing the lock and updating the (standard, regular) index all in one motion.

Finally, there's the third mode, which you get when you run git commit --only file.ext for instance. Here, there are now three index files:

$GIT_DIR/index: The standard index, which holds what it usually does.
$GIT_DIR/index.lock: A copy of the standard index to which file.ext has been git add-ed.
$GIT_DIR/index<suffix>: A copy of the HEAD commit² to which file.ext has been git add-ed.

The environment variable GIT_INDEX_FILE points to this third index. If the commit succeeds, Git will rename the index.lock file to index, so that it becomes the index. If the commit fails, Git will remove the index.lock file, so that the index goes back to the state it had before you started. (And in either case, Git removes the third index, which has now served its purpose.)

Note that from a pre-commit hook, you can detect whether git commit is a standard commit (GIT_INDEX_FILE is unset or set to $GIT_DIR/index) or one of the two special modes. In standard mode, if you want to update the index, you can do so as usual. In the two special modes, you can use git add to modify the file that GIT_INDEX_FILE names, which will modify what goes into the commit; and if you're in the --include style commit, this also modifies what will become the standard index on success. But if you're in the --only mode, modifying the proposed commit doesn't affect the standard index, nor the index.lock that will become the standard index.
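
The detection described above might be sketched like this. The function name commit_index_kind is hypothetical, and the classification leans entirely on the file naming described in this answer (index, index.lock, anything else), which is an implementation detail rather than a documented interface.

```shell
# Sketch: classify the index file this hook was handed.
# Assumption: a path whose last component is "index" is the standard index,
# "index.lock" is the -a / --include style, and any other name is the
# temporary index used for the --only style.
commit_index_kind() {
    case "${GIT_INDEX_FILE-}" in
        '' | index | */index)
            echo standard
            ;;
        index.lock | */index.lock)
            echo include
            ;;
        *)
            echo only
            ;;
    esac
}
```

A hook could then branch on the result, for example skipping any index-modifying step unless commit_index_kind prints standard.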

To consider a concrete example, suppose the user did:

git add file1 file2

so that the standard index matches HEAD except for file1 and file2. Then the user runs:

git commit --only file3

so that the proposed commit is a copy of HEAD with file3 added, and, if this commit succeeds, Git will replace the standard index with one in which file1, file2, and file3 are all added (but since file3 will match the new HEAD commit, only files 1 and 2 will be modified in the new index).

Now suppose your commit hook runs git add file4 and the process as a whole succeeds (the new commit is made successfully). The git add step will copy the work-tree version of file4 into the temporary index, so that the commit will have both file3 and file4 updated. Then Git will rename the index.lock file, so that file3 will match the new HEAD commit. But file4 in the index.lock was never updated, so it won't match the HEAD commit. It will appear to the user that somehow, file4 got reverted! A git status will show a pending change to it, staged for commit, and git diff --cached will show that the difference between HEAD and the index is that file4 has been changed back to match the file4 in HEAD~1.

You could have your pre-commit hook test for this mode and refuse to git add files when in this mode, to avoid the problem. (Or, you could even sneakily add file4 to index.lock, with a second git add command!) But it's generally better to have your hook just reject the commit, with advice to the user to do any git adds themselves, so that you don't have to know all of these implementation secrets about git commit in the first place.
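
A minimal sketch of the "refuse to git add in this mode" idea follows. The name stage_or_reject is hypothetical, and the guard is deliberately more conservative than strictly necessary: it refuses in both special modes, not only the --only one, so that the hook does not depend on telling them apart.

```shell
# Hypothetical guard for a hook that would otherwise stage files itself.
# On the standard index it runs "git add" on its arguments; on any other
# index it prints advice and returns 1 so the hook can abort the commit.
stage_or_reject() {
    case "${GIT_INDEX_FILE-}" in
        '' | index | */index)
            git add -- "$@"
            return
            ;;
    esac
    echo "pre-commit: not staging files: this commit uses a temporary index." >&2
    echo "pre-commit: run 'git add' yourself, then commit again." >&2
    return 1
}
```

In the hook itself this would typically be called as stage_or_reject some/file || exit 1, so that a refusal stops the commit with the advice visible to the user.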

¹The index holds some extra information as well: cache data about the work-tree. That's why it's sometimes called the cache. These extra copies that I describe here are made by copying the original index, so the extra copies also have the same cache data, except if and when they get updated via git add.

²It's not specified whether Git makes this copy via the internal equivalent of:

TMP=$GIT_DIR/index<digits>
cp $GIT_DIR/index $TMP
GIT_INDEX_FILE=$TMP git reset
GIT_INDEX_FILE=$TMP git add file3

or some other means (e.g., the internal equivalent of git read-tree), but since this particular copy is always just removed at the end of the process, it doesn't matter: any cache information for the work-tree becomes irrelevant.

Answer 2:

Yes, the changes seem to be already staged by the time the hook runs. Use git diff --cached --name-only to list the files about to be committed.
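
For instance, the hook from the question could be adapted along these lines. This is only a sketch, and the function name staged_files is made up for illustration.

```shell
# Sketch for .git/hooks/pre-commit: list what is about to be committed.
# "git diff --cached" compares HEAD with the index the hook was given, so it
# still reports files when "git commit -a" has already staged everything,
# whereas plain "git diff" (index vs. work-tree) prints nothing in that case.
staged_files() {
    git diff --cached --name-only
}

# Example use inside the hook:
#   staged_files | while IFS= read -r path; do
#       echo "about to commit: $path"
#   done
```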

Source: https://stackoverflow.com/questions/59686461/calling-git-in-pre-commit-hook

Tags

git

pre-commit-hook