Question
I have a patch that looks like this:
diff --git a/tools/python/xen/lowlevel/xc/xc.c b/tools/python/xen/lowlevel/xc/xc.c
15 index e220f68..e611b24 100644
16 --- a/tools/python/xen/lowlevel/xc/xc.c
17 +++ b/tools/python/xen/lowlevel/xc/xc.c
18 @@ -228,6 +228,7 @@ static PyObject *pyxc_vcpu_setaffinity(XcObject *self,
19 int vcpu = 0, i;
20 xc_cpumap_t cpumap;
21 PyObject *cpulist = NULL;
And I want to know which commit generates the patch, and how to parse 15 index e220f68..e611b24 100644 in the patch?
Answer 1:
Let's take a look at the output from git show. (This is actual output from a real repo, although I'll snip most bits.)
$ git show d362e62
commit d362e62490dd7f59c170a0a050a203fa0eda9f5a
[snip]
diff --git a/fmt.py b/fmt.py
index c44c267..ba772ee 100755
[snip]
Here, d362e62 is the "short version" of the true name of the commit, i.e., its SHA-1. The "long" form is the full 40-character version, which appears on the first line of the git show output.
Besides the commit text, the commit itself contains a "tree" (and zero or more "parents"). We can see this with git cat-file -p:
$ git cat-file -p d362e62
tree 0b9bebfee8890b242875af0df209fd9f335bf14d
parent 41f3a6bcba1f5f7059133f862727809f49ff4657
[snip author, committer, and commit text]
We can look at the "tree" as well. I could use the "true name" SHA-1 above, but here I use a bit of git syntax: a commit identifier followed by ^{tree} tells git to extract the tree ID from the commit ID.
$ git cat-file -p d362e62^{tree}
[snip]
120000 blob 7417b50d02819bbebeac0f4104850549935f7089 fmt
100755 blob ba772eeb6139de5a724d67d18ce01bfccaf57590 fmt.py
[snip]
I left in the line for fmt as it is a symlink to fmt.py. The symlink has mode 120000, which tells git that the blob data is actually the target of the symlink. The file, fmt.py, has mode 100755, which tells git that it's an ordinary file and that it is executable (it's a Python script). This is the source of the 100644 or 100755 you see in the index line.
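The mode values above can be decoded mechanically. This is a minimal sketch, assuming the modes follow the usual Unix file-type bits (which is what Python's stat module tests); the function name describe_git_mode is made up for illustration and is not part of git.

```python
import stat


def describe_git_mode(mode_text):
    """Describe a git mode string such as '100755' (octal digits)."""
    mode = int(mode_text, 8)
    if stat.S_ISLNK(mode):
        return "symlink"
    if stat.S_ISDIR(mode):
        return "directory (sub-tree)"
    if stat.S_ISREG(mode):
        # Any execute bit set means git records the file as executable.
        return "executable file" if mode & 0o111 else "regular file"
    return "other"


for text in ("100644", "100755", "120000"):
    print(text, "->", describe_git_mode(text))
```

The three modes printed are the ones that appear in this answer: 100644 and 100755 for ordinary files, 120000 for the symlink.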
The "true name" of the blob (file object) in the git repo is that 40-character SHA-1. The 7-character abbreviated version for fmt.py is ba772ee. This is the second of the two numbers separated by .. on the index line.
The first number on that line is the "true name" in the git repo of the previous version of the file, i.e., the version of fmt.py that was in the repo before I created commit d362e62.
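Putting the pieces together, an index line can be split into its three fields. This is a rough sketch rather than a full diff parser: the helper name parse_index_line is hypothetical, and the mode is treated as optional because git leaves it off the index line when the file's mode changes.

```python
import re

# index <old blob>..<new blob> [<mode>]
INDEX_LINE = re.compile(
    r"^index (?P<old>[0-9a-f]+)\.\.(?P<new>[0-9a-f]+)(?: (?P<mode>[0-7]{6}))?$"
)


def parse_index_line(line):
    """Split a diff 'index' line into old blob ID, new blob ID, and mode."""
    match = INDEX_LINE.match(line.strip())
    if match is None:
        raise ValueError("not an index line: %r" % line)
    return match.groupdict()


print(parse_index_line("index c44c267..ba772ee 100755"))
print(parse_index_line("index e220f68..e611b24 100644"))
```

The second call uses the line from the question with the added line number 15 removed; the sketch does not try to strip such prefixes itself.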
We can use another bit of special git syntax to see these as well.[1] As documented in gitrevisions, following a commit-specifier with a hat character (circumflex, up-arrow, whatever you like to call it), ^, tells git to find the first parent of that commit. So:
$ git rev-parse d362e62^
41f3a6bcba1f5f7059133f862727809f49ff4657
tells us that the commit before the commit I gave to git show is the one whose name starts with 41f3a6b. And, sure enough, if we git cat-file -p that, we get another commit with another tree, and if we git cat-file -p that tree-ID and look for fmt.py we will find another blob with another SHA-1:
$ git cat-file -p 41f3a6b
tree cbfb63beec96eebf0c73ba6a501cc8151adfec8a
parent 80eeb496ea3f538aa14acdc6b0815024a5e99c7e
[snip]
$ git cat-file -p cbfb63beec96eebf0c73ba6a501cc8151adfec8a | grep fmt.py
100755 blob c44c267c4603838ac7a54aa450b33d0dd7a8bebc fmt.py
$
And there it is: c44c267 is the abbreviated form of the "true name" of the file stored in the previous commit. This is the first number in the index line.
I wrote this all out in long form to illustrate how git gets from "point A" to "point B". But, just as with the short-hand syntax d362e62^{tree}, there is a very easy way to get the blob SHA-1 values using git rev-parse:
$ git rev-parse d362e62:fmt.py
ba772eeb6139de5a724d67d18ce01bfccaf57590
$ git rev-parse d362e62^:fmt.py
c44c267c4603838ac7a54aa450b33d0dd7a8bebc
If you want the shortened versions, use git rev-parse --short to truncate the SHA-1 values to (normally) 7 characters.
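For the curious, the blob "true name" is not arbitrary. In a SHA-1 repository it is the SHA-1 of a short header ("blob", the content length, and a NUL byte) followed by the file's bytes. Here is a minimal sketch; the function names are made up for illustration, and since the contents of fmt.py are not shown here, the example hashes a tiny stand-in file instead.

```python
import hashlib


def git_blob_id(content):
    """Compute the SHA-1 blob ID git assigns to these file bytes."""
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def abbreviate(object_id, length=7):
    """Shorten an ID by truncation (the effect described for --short)."""
    return object_id[:length]


full_id = git_blob_id(b"hello\n")
print(full_id)
print(abbreviate(full_id))
print(abbreviate("ba772eeb6139de5a724d67d18ce01bfccaf57590"))
```

Because the ID depends only on the contents, two commits that store an identical file share one blob, and a single-character change produces a completely different ID.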
So:
And I want to know which commit generates the patch, and how to parse 15 index e220f68..e611b24 100644 in the patch?
The 15 is a line number you (or someone somewhere) added, and now you know what the rest of the values on the index line are. But finding the commit, well, that's the hard part. The commit is what finds the other values. There is no link from "other values" back to "commit": the "arrows", as it were, only point from commits to trees, and then from trees to blobs. There are no pointers from blobs to trees, nor from trees to commits.
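Since the arrows only point one way, the only way to find the commit is to search: walk the commits, compare each one's tree against its parent's tree, and look for the blob pair from the index line. Below is a toy, in-memory sketch of that search using the IDs shown earlier in this answer. The dictionaries and the function name find_commits are assumptions for illustration; real git stores these objects quite differently, and each tree here lists only fmt.py.

```python
# Commit -> tree and parents, as printed by git cat-file -p above.
COMMITS = {
    "d362e62490dd7f59c170a0a050a203fa0eda9f5a": {
        "tree": "0b9bebfee8890b242875af0df209fd9f335bf14d",
        "parents": ["41f3a6bcba1f5f7059133f862727809f49ff4657"],
    },
    "41f3a6bcba1f5f7059133f862727809f49ff4657": {
        "tree": "cbfb63beec96eebf0c73ba6a501cc8151adfec8a",
        # This parent is not loaded into the toy model.
        "parents": ["80eeb496ea3f538aa14acdc6b0815024a5e99c7e"],
    },
}
# Tree -> {path: blob ID}.
TREES = {
    "0b9bebfee8890b242875af0df209fd9f335bf14d": {
        "fmt.py": "ba772eeb6139de5a724d67d18ce01bfccaf57590",
    },
    "cbfb63beec96eebf0c73ba6a501cc8151adfec8a": {
        "fmt.py": "c44c267c4603838ac7a54aa450b33d0dd7a8bebc",
    },
}


def find_commits(start, old_prefix, new_prefix):
    """Find (commit, path) pairs that changed a blob from old to new."""
    matches, seen, pending = [], set(), [start]
    while pending:
        commit_id = pending.pop()
        if commit_id in seen or commit_id not in COMMITS:
            continue
        seen.add(commit_id)
        commit = COMMITS[commit_id]
        new_tree = TREES[commit["tree"]]
        for parent_id in commit["parents"]:
            pending.append(parent_id)
            parent = COMMITS.get(parent_id)
            if parent is None:
                continue
            old_tree = TREES[parent["tree"]]
            for path, new_blob in new_tree.items():
                old_blob = old_tree.get(path, "")
                if old_blob.startswith(old_prefix) and new_blob.startswith(new_prefix):
                    matches.append((commit_id, path))
    return matches


print(find_commits("d362e62490dd7f59c170a0a050a203fa0eda9f5a", "c44c267", "ba772ee"))
```

In a real repository you would let git do the walking, for instance by searching the output of git log --raw --no-abbrev for the two blob IDs, but the idea is the same: start from commits and work down to blobs.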
Git always starts with some sort of externally specified name. Usually this is a branch name or tag, or a "symbolic reference" (as HEAD normally is, when you don't have a "detached head"). The reference locates a commit.[2] If the reference is a branch name, that commit is the "tip" of that branch.[3] If it's a tag, it still finds a commit. If it's HEAD, and HEAD is the name of a branch like master, git just turns HEAD into master and then turns master into a commit. In other words, the commit is where you start, usually by going from name to commit-ID, but you can almost always specify a "raw" SHA-1 ID here.
Once git has a commit-ID, that commit identifies more commits (its parents) and a tree. The tree identifies sub-trees if needed, and the tree and its sub-trees identify blobs. Starting from all the commits that have "external names", git eventually finds all trees and all blobs. Any trees or blobs in the repository that are not found this way are eligible for garbage-collection when you run git gc (or when git gc runs automatically). (This is how deleted branches, and any number of special temporary files that git creates internally, are cleaned up later.)
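The "find everything from the names" walk described above is an ordinary graph traversal. The following is a toy sketch with made-up object names (C1, T1, B1 and so on are placeholders, not real IDs), meant only to show why an object with no named commit leading to it ends up eligible for collection.

```python
# Hypothetical repository: C9 belonged to a branch that was deleted.
commits = {
    "C1": {"tree": "T1", "parents": []},
    "C2": {"tree": "T2", "parents": ["C1"]},
    "C9": {"tree": "T9", "parents": ["C1"]},
}
trees = {
    "T1": {"a.txt": "B1"},
    "T2": {"a.txt": "B1", "sub": "T3"},  # "sub" is a sub-tree
    "T3": {"b.txt": "B2"},
    "T9": {"a.txt": "B9"},
}
refs = {"master": "C2"}


def reachable_objects(refs, commits, trees):
    """Collect every object reachable from the named references."""
    reachable, pending = set(), list(refs.values())
    while pending:
        object_id = pending.pop()
        if object_id in reachable:
            continue
        reachable.add(object_id)
        if object_id in commits:
            pending.append(commits[object_id]["tree"])
            pending.extend(commits[object_id]["parents"])
        elif object_id in trees:
            pending.extend(trees[object_id].values())
    return reachable


all_objects = set(commits) | set(trees)
all_objects |= {entry for tree in trees.values() for entry in tree.values()}
unreachable = sorted(all_objects - reachable_objects(refs, commits, trees))
print(unreachable)
```

Only the commit, tree, and blob that hang off the deleted branch are left over; blob B1 survives because the reachable trees still point to it.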
[1] Git has a lot of special syntax. The most useful ones to memorize, in my opinion:
- hat after thing = parent: master^ = "parent of master"
- tilde and number N after thing = back up N parents: master~2 = "grandparent of master"
- X..Y = "all revisions selected by Y, excluding all revisions selected by X": git log master..devel = "log all commits on branch devel that are not on master"
The .. syntax is also used in git diff, but here instead of "stuff on Y that's not on X", you get a direct comparison of the version associated with X against the version associated with Y.
[2] I'm deliberately skipping over "annotated tags", which are also stored as objects in the repository. In some cases git will access the tag object itself, and in others (when it needs a commit, tree, and/or blob) git will automatically follow the annotated tag. Internally, an annotated tag looks very similar to a commit, except that instead of a tree and parents, it has a reference to another git repository object: usually directly to a commit, but sometimes to another tag, and in theory you can make an annotated tag for a tree or a blob, skipping over the commit part entirely.
[3] A branch name always points to the tip of its own branch, but that branch may be just a part of another branch. For instance, suppose you have a nice linear sequence of commits:
...<-- C3 <-- C4 <-- C5 <-- C6 <-- C7
where C7 has C6 as its parent, C6 has C5, and so on. If branch label X is a reference to commit C5, then branch X ends at C5. If branch label Y points to C7, branch Y ends at C7. In this case branch Y "contains" branch X, but not vice versa.
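The linear history above is small enough to model directly, which makes the hat, tilde, and two-dot rules concrete. This is only a sketch: the history before C3 is cut off (C3 is given no parents, which is an assumption), and the helper names are invented for illustration.

```python
# Each commit lists its parents; the first entry is the "first parent".
parents = {"C7": ["C6"], "C6": ["C5"], "C5": ["C4"], "C4": ["C3"], "C3": []}
branches = {"X": "C5", "Y": "C7"}


def history(commit_id):
    """Return the commit plus every commit reachable through parents."""
    seen, pending = set(), [commit_id]
    while pending:
        current = pending.pop()
        if current not in seen:
            seen.add(current)
            pending.extend(parents.get(current, []))
    return seen


def nth_first_parent(commit_id, n):
    """Model commit~n: follow the first parent n times (n=1 is commit^)."""
    for _ in range(n):
        commit_id = parents[commit_id][0]
    return commit_id


def two_dot_range(x, y):
    """Model X..Y: commits selected by Y, excluding those selected by X."""
    return history(y) - history(x)


print(nth_first_parent(branches["Y"], 2))
print(history(branches["X"]) <= history(branches["Y"]))
print(sorted(two_dot_range(branches["X"], branches["Y"])))
```

Y "contains" X because every commit in X's history is also in Y's, and X..Y is exactly the part of Y that X lacks.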
Source: https://stackoverflow.com/questions/20212649/how-to-read-index-diff-git-output