I read this question, and now I have a doubt about how `git pull` works with a refspec:
Step 1 : I am on branchA.
Step 2 : I do `git pull origin branchB:branchC`
Well, after reading @torek-ans-1 and @torek-ans-2 [these are a must-read to understand how git fetch/pull works], I want to post a complete answer to my question for those who want to get it quickly.
First, the steps in the question are wrong. These are the correct steps:
Step 1 : I am on branchA.
Step 2 : I do `git pull origin branchB:branchC` .
Step 3 : I notice :
a) Commits from `branchB` on the remote arrive and update `refs/heads/branchC`.
b) Then `remote.origin.fetch` is used to try to update `remotes/origin/branchB` in our local repository.
[Notice that no attempt is made to update `remotes/origin/branchC`.]
c) `branchC` is merged into `branchA`.
[The order might vary from one Git version to another.]
In step a) + step b), there is no merge. This is called a fast-forward update. There is also something called a fast-forward merge, which behaves the same way, but we say fast-forward merge only when `git merge` behaves like a fast-forward update.
Here, in step a) + step b), `git merge` is not called. Hence we call it a fast-forward update and not a fast-forward merge.
Step c) is where `git merge` is called.
In short: `git pull origin branchB:branchC` = `git fetch origin branchB:branchC` ((a) + (b)) + `git merge branchC` (c)
Now, my question was: why are 2 merges called?
There are not 2 merges. There is only 1 merge, in step c). There are, however, 2 fast-forward updates, and `git fetch` does them.
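Steps a) and b) can be observed in a throwaway setup. The following is only a sketch: the `remote` and `work` repositories, the `commit` helper, and the commit messages are made up for the demonstration, and it assumes a reasonably recent Git (one that does the opportunistic remote-tracking update).

```shell
#!/bin/sh
# Sketch only: "remote" and "work" are hypothetical throwaway repositories.
set -eu
tmp=$(mktemp -d)
remote="$tmp/remote"
work="$tmp/work"

# Hypothetical helper: make an empty commit in repository $1 with message $2.
commit() {
    git -C "$1" -c user.name=demo -c user.email=demo@example.com \
        commit -q --allow-empty -m "$2"
}

# A stand-in for origin, with one commit on branchB.
git init -q "$remote"
git -C "$remote" symbolic-ref HEAD refs/heads/branchB
commit "$remote" "B1"

# Clone it, then create and check out branchA locally.
git clone -q "$remote" "$work"
git -C "$work" checkout -q -b branchA

# The remote's branchB gains a commit after the clone.
commit "$remote" "B2"

# The fetch half of `git pull origin branchB:branchC`.
git -C "$work" fetch -q origin branchB:branchC

# List local branches and remote-tracking names with the commits they point to.
git -C "$work" for-each-ref --format='%(refname) %(objectname:short)' \
    refs/heads refs/remotes
```

In the listing, `refs/heads/branchC` and `refs/remotes/origin/branchB` should both point to the new commit, `refs/heads/branchA` should be unchanged, and there should be no `refs/remotes/origin/branchC`.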
Step 2 is not a true merge, it's a fast-forward merge. Fast-forwarding is the only kind of merge possible for a non-current (i.e., not currently checked out) branch. If fast-forwarding is not possible, `git` would abort the fetch/pull; in that case you could either do a true merge (check out `branchC` and run `git pull origin branchB`) or do a forced update (`git fetch origin +branchB:branchC`), thus losing your local commits at the head of `branchC`.
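The rejected update and the forced update can both be reproduced in a scratch setup. This is a sketch under assumptions: the repositories, the `commit` helper, and the commit messages are invented for illustration.

```shell
#!/bin/sh
# Sketch only: "remote" and "work" are hypothetical throwaway repositories.
set -eu
tmp=$(mktemp -d)
remote="$tmp/remote"
work="$tmp/work"

# Hypothetical helper: make an empty commit in repository $1 with message $2.
commit() {
    git -C "$1" -c user.name=demo -c user.email=demo@example.com \
        commit -q --allow-empty -m "$2"
}

git init -q "$remote"
git -C "$remote" symbolic-ref HEAD refs/heads/branchB
commit "$remote" "B1"
git clone -q "$remote" "$work"

# Local branchC gets a commit that the remote does not have.
git -C "$work" checkout -q -b branchC
commit "$work" "C1 (local only)"
local_only=$(git -C "$work" rev-parse branchC)

# Move to branchA so that branchC is not the current branch.
git -C "$work" checkout -q -b branchA

# Meanwhile the remote's branchB moves on, so the histories diverge.
commit "$remote" "B2"

# Without a leading plus sign, the non-fast-forward update is refused.
if git -C "$work" fetch -q origin branchB:branchC 2>/dev/null; then
    plain_result=updated
else
    plain_result=rejected
fi
echo "plain refspec: $plain_result"

# With the leading plus sign the update is forced; C1 drops off branchC.
git -C "$work" fetch -q origin +branchB:branchC
echo "branchC is now at $(git -C "$work" rev-parse --short branchC)"
```

After the forced fetch, the local-only commit is no longer reachable from `branchC`, which is the loss the answer above warns about.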
phd's answer is correct. Break the `git pull` command into its two components:
1. `git fetch origin branchB:branchC`. Run this on the same setup, i.e., with `branchC` set to point to the commit it pointed to before your `git pull` command.
2. `git merge <hash-id>`. The actual hash ID is taken from `.git/FETCH_HEAD`, where `git fetch` leaves it. Run this on the same setup, with `branchA` set to point to the commit it pointed to before your `git pull` command.
Note that step 2, the `git merge`, has no effect on the reference `branchC`. It does have some effect on the current branch name, i.e., `refs/heads/branchA`. Since it runs `git merge`, it can do a fast-forward merge, or a true merge, or nothing at all.
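The two components can be run by hand to see which names move. The following is a minimal sketch, assuming a scratch pair of repositories; the `remote` and `work` paths, the `commit` helper, and the commit messages are hypothetical.

```shell
#!/bin/sh
# Sketch only: "remote" and "work" are hypothetical throwaway repositories.
set -eu
tmp=$(mktemp -d)
remote="$tmp/remote"
work="$tmp/work"

# Hypothetical helper: make an empty commit in repository $1 with message $2.
commit() {
    git -C "$1" -c user.name=demo -c user.email=demo@example.com \
        commit -q --allow-empty -m "$2"
}

git init -q "$remote"
git -C "$remote" symbolic-ref HEAD refs/heads/branchB
commit "$remote" "B1"
git clone -q "$remote" "$work"

# branchA and the remote's branchB each gain a commit, so they diverge.
git -C "$work" checkout -q -b branchA
commit "$work" "A1"
before_a=$(git -C "$work" rev-parse branchA)
commit "$remote" "B2"

# Component 1: fetch, which updates branchC and records the hash in FETCH_HEAD.
git -C "$work" fetch -q origin branchB:branchC
fetched=$(git -C "$work" rev-parse FETCH_HEAD)

# Component 2: merge that hash into the current branch (branchA).
git -C "$work" -c user.name=demo -c user.email=demo@example.com \
    merge -q --no-edit "$fetched"

# branchC still points at the fetched commit; only branchA has moved.
git -C "$work" log --oneline --graph branchA
```

Because the two branches diverged in this setup, the second step makes a true merge commit on `branchA`; with a `branchA` that had no commits of its own, the same command would fast-forward instead.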
Let's delve more into the `fetch` step, which is really the more interesting, or at least challenging, one.
git ls-remote
Before running `git fetch origin branchB:branchC`, run `git ls-remote origin`. Here's what I get running it on a Git repository for Git (with a lot of bits snipped):
$ git ls-remote origin
e144d126d74f5d2702870ca9423743102eec6fcd HEAD
468165c1d8a442994a825f3684528361727cd8c0 refs/heads/maint
e144d126d74f5d2702870ca9423743102eec6fcd refs/heads/master
093e983b058373aa293997e097afdae7373d7d53 refs/heads/next
005c16f6a19af11b7251a538cd47037bd1500664 refs/heads/pu
7a516be37f6880caa6a4ed8fe2fe4e8ed51e8cd0 refs/heads/todo
d5aef6e4d58cfe1549adef5b436f3ace984e8c86 refs/tags/gitgui-0.10.0
3d654be48f65545c4d3e35f5d3bbed5489820930 refs/tags/gitgui-0.10.0^{}
...
dcba104ffdcf2f27bc5058d8321e7a6c2fe8f27e refs/tags/v2.9.5
4d4165b80d6b91a255e2847583bd4df98b5d54e1 refs/tags/v2.9.5^{}
You can see that their Git offers, to my Git, a long list of reference names and hash IDs.
My Git can pick through these and choose which name(s) and/or ID(s) it likes, and then go to the next phase of `git fetch`: ask them what hash IDs they can give me that go with, e.g., commit `e144d126d74f5d2702870ca9423743102eec6fcd` (the hash ID for their `master`). My Git would do this if I told it to bring over their `master` or their `refs/heads/master` as the left hand side of a refspec, since those name-strings match their `refs/heads/master`.
(With no refspecs, my Git will ask for all branches. The tags are trickier: `--tags` has my Git take all, `--no-tags` has my Git take none, but in between, there's some horribly twisty code inside `git fetch`.)
In any case, they offer some hashes, my Git says whether it wants or has some other hashes, and their Git uses their `git rev-list` to construct a set of hash IDs for commits, trees, blobs, and/or annotated tag objects to put into a so-called thin pack. During this phase of `git fetch` you see messages about the remote counting and compressing objects.
git fetch origin
Let me run an actual `git fetch` now:
$ git fetch origin
remote: Counting objects: 2146, done.
remote: Compressing objects: 100% (774/774), done.
remote: Total 2146 (delta 1850), reused 1649 (delta 1372)
Eventually, their Git finishes packing all the objects they will send, and sends those objects. My Git receives them:
Receiving objects: 100% (2146/2146), 691.50 KiB | 3.88 MiB/s, done.
My Git fixes up the thin pack (`git index-pack --fix-thin`) to make it a viable normal pack that can live in my `.git/objects/pack` directory:
Resolving deltas: 100% (1850/1850), completed with 339 local objects.
Finally, the most interesting-to-us parts of the fetch happen:
From [url]
ccdcbd54c..e144d126d master -> origin/master
1526ddbba..093e983b0 next -> origin/next
+ 8b97ca562...005c16f6a pu -> origin/pu (forced update)
7ae8ee0ce..7a516be37 todo -> origin/todo
The names on the left of the `->` arrows are their names; the names on the right are my Git's names. Since I ran only `git fetch origin` (with no refspecs), my Git used my default refspecs:
$ git config --get remote.origin.fetch
+refs/heads/*:refs/remotes/origin/*
so it's as if I wrote:
$ git fetch origin '+refs/heads/*:refs/remotes/origin/*'
which uses fully-qualified refspecs, rather than partial names like `branchB:branchC`. This particular syntax also uses glob-pattern-like `*` characters. Technically these aren't quite globs, as these are just strings and not file names, and there is a `*` on the right, but the principle is similar: I ask my Git to match every name starting with `refs/heads/`, and copy those to my own repository under names starting with `refs/remotes/origin/`.
The `refs/heads/` name-space is where all of my Git's branch names reside. The `refs/remotes/` name-space is where all of my Git's remote-tracking names reside, and `refs/remotes/origin/` is where my Git and I have placed the remote-tracking names that correspond to branch names we found in the Git at `origin`. The leading plus sign `+` in front sets the force flag, as if I had run `git fetch --force`.
The next step requires that we look at the commit graph: the Directed Acyclic Graph or DAG of all commits found in my Git repository. In this case, since the new pack file has been integrated, this includes all the new objects I've just added via `git fetch`, so that I have new commits (and any trees and blobs necessary to go with them) obtained from their Git.
Each object has a unique hash ID, but these are too unwieldy to use directly. I like to draw my graphs left-to-right in text on StackOverflow, and use round `o`s or single uppercase letters (or both) to denote particular commits. Earlier commits go towards the left, with later commits towards the right, and a branch name points to the tip commit of that branch:
    ...--o--o--A <-- master
                \
                 o--B <-- develop
Note that in this view of the Git object database, we pay no attention at all to the index / staging-area, and no attention at all to the work-tree. We are concerned only with the commits and their labels.
Since I actually obtained my commits from the Git at `origin`, my Git has `origin/*` names as well, so let's draw those in:
    ...--o--o--A <-- master, origin/master
                \
                 o--B <-- develop, origin/develop
Now, suppose that I run `git fetch` and it brings in two new commits that I will label `C` and `D`. `C`'s parent is `A`, and `D`'s is the node just before `B`:
                 C
                /
    ...--o--o--A <-- master
                \
                 o--B <-- develop
                  \
                   D
For my Git to retain these commits, my Git must have some name or names by which it can reach these commits. The name that reaches `C` is going to be `origin/master`, and the name that reaches `D` is going to be `origin/develop`. Those names used to point to commits `A` and `B` respectively, but `git fetch origin +refs/heads/*:refs/remotes/origin/*` tells my Git to replace them, giving:
                 C <-- origin/master
                /
    ...--o--o--A <-- master
                \
                 o--B <-- develop
                  \
                   D <-- origin/develop
The output from this `git fetch` will list this as:
aaaaaaa..ccccccc master -> origin/master
+ bbbbbbb...ddddddd develop -> origin/develop (forced update)
Note the `+` and the three dots in the output here. That's because while moving `origin/master` from commit `A` (hash ID `aaaaaaa`) to commit `C` was a fast-forward operation, moving `origin/develop` from commit `B` to commit `D` was not. This required the force flag.
If you run `git fetch origin br1:br2`, you instruct your Git to:
- call up the Git at `origin` (really `remote.origin.url`);
- use their `br1` (probably `refs/heads/br1`) to update your `br2`, most likely your `refs/heads/br2`, bringing over whatever objects are necessary to make this happen.
This update phase, updating your `br2` based on their `br1`, does not have a force flag set on it. This means that your Git will permit the change if and only if the operation is a fast-forward.
(Meanwhile, your Git will also update your `origin/br1`, because Git does this kind of opportunistic update based on `remote.origin.fetch`. Note that this update does have the force flag set, assuming a standard `remote.origin.fetch` configuration.)
We (and Git) talk about doing a fast-forward merge, but this is a misnomer, for two reasons. The first and most important is that fast-forward is a property of a label's motion. Given some existing reference label (branch, tag, or whatever) R that points to some commit `C1`, we tell Git: move R to point to commit `C2` instead. Assuming both hash IDs are valid and point to commits, when we examine the commit DAG, we will find that one of these holds:
- `C1` is an ancestor of `C2`. This change to R is a fast-forward.
- `C1` is not an ancestor of `C2`. This change to R is a non-fast-forward.
The special property of a fast-forward operation is that now that R points to `C2`, if we start at `C2` and work backwards as Git always does, we will eventually come across `C1`. So `C1` remains protected by a name, and if R is a branch name, commit `C1` is still on branch R. If the operation is not a fast-forward, `C1` is not reachable from `C2`, and `C1` may no longer be protected and may (depending on whether anything else protects it, and its relative age) be garbage collected at some point in the future.
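The ancestor test described above can be tried directly with `git merge-base --is-ancestor`. This is a small sketch in a scratch repository; the `commit` and `is_fast_forward` helpers and the commit names are hypothetical, chosen only for the demonstration.

```shell
#!/bin/sh
# Sketch only: a hypothetical scratch repository with a tiny commit graph.
set -eu
repo=$(mktemp -d)
git init -q "$repo"

# Hypothetical helper: make an empty commit and print its hash ID.
commit() {
    git -C "$repo" -c user.name=demo -c user.email=demo@example.com \
        commit -q --allow-empty -m "$1"
    git -C "$repo" rev-parse HEAD
}

# base--c1--c2 on one line of history, plus "side" forking off from base.
base=$(commit "base")
c1=$(commit "c1")
c2=$(commit "c2")
git -C "$repo" checkout -q "$base"   # detached HEAD at base
side=$(commit "side")

# Moving a label from $1 to $2 is a fast-forward iff $1 is an ancestor of $2.
is_fast_forward() {
    if git -C "$repo" merge-base --is-ancestor "$1" "$2"; then
        echo "fast-forward"
    else
        echo "non-fast-forward"
    fi
}

echo "c1 -> c2:   $(is_fast_forward "$c1" "$c2")"
echo "c1 -> side: $(is_fast_forward "$c1" "$side")"
```

Moving a label from `c1` to `c2` keeps `c1` reachable, while moving it from `c1` to `side` would leave `c1` reachable only through whatever other names still point at or past it.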
Because of the above, updating a branch-style reference (a branch name in `refs/heads/` or a remote-tracking name in `refs/remotes/`) often requires using a force flag, if the update is not a fast-forward. Different parts of Git implement this differently: `git fetch` and `git push` both have `--force` and leading-plus-sign, while other Git commands (that don't have refspecs) just have `--force` or, as in the case of `git reset`, just assume that you, the user, know what you are doing.
(Very old versions of Git, 1.8.2 and older, accidentally applied these fast-forward rules to tag names as well as branch names.)
The `git merge` command knows about the index and work-tree
What makes a `git merge` fast-forward merge operation different (well, at least slightly different) from this kind of label fast-forwarding is that `git merge` knows about, and works with, your index / staging-area and your work-tree. When you run:
git merge <commit-specifier>
Git computes the merge base of the current HEAD commit and the given other commit. If this merge base is the current commit, the operation can be done as a fast-forward label move, as long as Git also brings the index and work-tree along with it.
If the merge base is an ancestor of the current commit, or if you use the `--no-ff` flag, `git merge` must perform a true merge, and make a new merge commit. (Of course there are also flags to suppress the commit and to make the new commit as an ordinary, non-merge commit, so this view of `git merge` skips a few important details as well.)
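The difference between the fast-forward case and a forced true merge can be seen in a scratch repository. The sketch below makes several assumptions: the branch names `mainline` and `feature`, the `commit` helper, and the commit messages are all invented for the demonstration.

```shell
#!/bin/sh
# Sketch only: a hypothetical scratch repository with two made-up branches.
set -eu
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" symbolic-ref HEAD refs/heads/mainline

# Hypothetical helper: make an empty commit with message $1.
commit() {
    git -C "$repo" -c user.name=demo -c user.email=demo@example.com \
        commit -q --allow-empty -m "$1"
}

commit "base"
base=$(git -C "$repo" rev-parse HEAD)
git -C "$repo" checkout -q -b feature
commit "feature work"
feature=$(git -C "$repo" rev-parse feature)
git -C "$repo" checkout -q mainline

# The merge base is the current commit, so a fast-forward is possible.
echo "merge base: $(git -C "$repo" merge-base mainline feature)"

# Case 1: fast-forward; the label moves and no new commit is made.
git -C "$repo" merge -q --ff-only feature
after_ff=$(git -C "$repo" rev-parse HEAD)

# Case 2: put mainline back, then force a true merge with --no-ff.
git -C "$repo" reset -q --hard "$base"
git -C "$repo" -c user.name=demo -c user.email=demo@example.com \
    merge -q --no-ff -m "true merge of feature" feature
after_no_ff=$(git -C "$repo" rev-parse HEAD)

echo "after fast-forward: $after_ff"
echo "after --no-ff:      $after_no_ff"
```

In the first case `mainline` ends up at the same commit as `feature`; in the second it points to a new merge commit whose parents are the old `mainline` tip and the `feature` tip.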