Git: What does EXACTLY “git pull” do?

情歌与酒 2020-12-09 23:44

I know that "git pull" is actually a combination of "git fetch" and "git merge", and that it basically brings the local repository to the state of the remote repository.

4 answers
  • 2020-12-10 00:22

    The "exactly" part is really quite tough. It is often said, and it is mostly true, that git pull runs git fetch followed by either git merge or git rebase. In fact git pull, which used to be a shell script and is now a C program, quite literally ran git fetch first; it now directly invokes the C code that implements git fetch.

    The next step, however, is quite tricky. Also, in a comment, you added this:

    [fetch] brings changes from the remote repo. Where does it put them?

    To understand this properly, you must understand Git's object system.

    The Git object model, and git fetch

    Each commit is a sort of standalone entity. Every commit has a unique hash ID: b06d364... or whatever. That hash ID is a cryptographic checksum of the contents of that commit. Consider, for instance:

    $ git cat-file -p HEAD | sed 's/@/ /g'
    tree a15b54eb544033f8c1ad04dd0a5278a59cc36cc9
    parent 951ea7656ebb3f30e6c5e941e625a1318ac58298
    author Junio C Hamano <gitster pobox.com> 1494339962 +0900
    committer Junio C Hamano <gitster pobox.com> 1494339962 +0900
    
    Git 2.13
    
    Signed-off-by: Junio C Hamano <gitster pobox.com>
    

    If you feed these contents (minus the 's/@/ /' part but with the header that Git adds to every object) to a SHA-1 checksum calculator, you will get the hash ID. This means that everyone who has this commit has the same hash ID for it.
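
As a rough illustration of that hashing step, the sketch below recomputes an object ID by hand. It assumes a SHA-1 repository and the coreutils sha1sum tool (on macOS, shasum plays a similar role); the helper name git_object_id and the sample file are made up for this example.

```shell
# Hypothetical helper: SHA-1 over the header "<type> <size>\0" followed by the
# raw object contents, which is how Git derives IDs in a SHA-1 repository.
git_object_id() {
    obj_type=$1   # blob, commit, tree or tag
    obj_file=$2   # file holding the raw object contents
    obj_size=$(wc -c < "$obj_file" | tr -d ' ')
    { printf '%s %s\0' "$obj_type" "$obj_size"; cat "$obj_file"; } |
        sha1sum | cut -d' ' -f1
}

# A blob containing the six bytes "hello\n".
sample=$(mktemp)
printf 'hello\n' > "$sample"
git_object_id blob "$sample"   # prints ce013625030ba8dba906f756967f9e9ca394464a
```

The same helper should work for a commit: save the output of git cat-file commit HEAD to a file, pass it with the type commit, and compare the result with git rev-parse HEAD.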

    You can get the Git repository for Git and run git cat-file -p v2.13.0^{commit} to see this same data. Note: the tag v2.13.0 translates to 074ffb61b4b507b3bde7dcf6006e5660a0430860, which is a tag object; the tag object itself refers to the commit b06d364...:

    $ git cat-file -p v2.13.0
    object b06d3643105c8758ed019125a4399cb7efdcce2c
    type commit
    tag v2.13.0
    [snip]
    

    To work with a commit, Git must store the commit object—the item with the hash ID b06d364...—itself somewhere, and also its tree object and any additional objects that tree needs. These are the objects that you see Git counting and compressing during a git fetch or git push.

    The parent line tells which commit (or, for a merge, commits, plural) are the predecessors of this particular commit. To have a complete set of commits, Git must also have the parent commit(s). A shallow clone (made with git clone --depth) can deliberately omit various parents, whose IDs are recorded in a special file of "shallow grafts", but a normal clone will always have everything.

    There are four types of object in total: commits, (annotated) tags, trees, and what Git calls blob objects. Blobs mostly store the actual files. All of these objects reside in Git's object database. Git can then retrieve them easily by hash ID: git cat-file -p <hash>, for instance, displays them in a vaguely human-readable format. (Most of the time there is little that must be done other than de-compressing, though tree objects have binary data that must be formatted first.)
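
To see the four object types side by side, here is a small sketch that builds a throwaway repository and asks git cat-file -t for the type of each kind of object. The directory, file name, identity, and tag name are all placeholders chosen for the example.

```shell
# Throwaway repository; every name below is a placeholder.
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" config user.name "Example User"
git -C "$repo" config user.email "user@example.invalid"
printf 'hello\n' > "$repo/greeting.txt"
git -C "$repo" add greeting.txt
git -C "$repo" commit -q -m "add greeting"
git -C "$repo" tag -a -m "first release" v0.1   # annotated tag => tag object

blob_type=$(git -C "$repo" cat-file -t HEAD:greeting.txt)
tree_type=$(git -C "$repo" cat-file -t 'HEAD^{tree}')
commit_type=$(git -C "$repo" cat-file -t HEAD)
tag_type=$(git -C "$repo" cat-file -t v0.1)
echo "$blob_type $tree_type $commit_type $tag_type"   # prints: blob tree commit tag
```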

    When you run git fetch (or have git pull run it for you), your Git obtains the hash IDs of some initial objects from another Git, then uses the Git transfer protocols to figure out what additional objects are required to complete your Git repository. If you already have some object, you do not need to fetch it again, and if that object is a commit object, you do not need any of its parents either.[1] So you get only the commits (and trees and blobs) that you do not already have. Your Git then stuffs these into your repository's object database.

    Once the objects are safely saved away, your Git records the hash IDs in the special FETCH_HEAD file. If your Git is at least 1.8.4, it will also update any corresponding remote-tracking branch names at this time: e.g., it may update your origin/master.

    (If you run git fetch manually, your Git obeys all the normal refspec update rules, as described in the git fetch documentation. It's the additional arguments passed to git fetch by git pull that inhibit some of these, depending on your Git version.)

    That, then, is the answer to what I think is your real first question: git fetch stores these objects in Git's object database, where they may be retrieved by their hash IDs. It adds the hash IDs to .git/FETCH_HEAD (always), and often also updates some of your references—tag names in refs/tags/, and remote-tracking branch names in refs/remotes/.
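
The effect can be observed with two throwaway repositories. This is only a sketch: the paths and identity are placeholders, and it assumes a reasonably recent Git (the -C option needs 1.8.5 or later).

```shell
# Two throwaway repositories: "upstream" and a clone of it.
work=$(mktemp -d)
git init -q "$work/upstream"
git -C "$work/upstream" config user.name "Example User"
git -C "$work/upstream" config user.email "user@example.invalid"
git -C "$work/upstream" commit -q --allow-empty -m "first"
git clone -q "$work/upstream" "$work/clone"

# A new commit appears upstream after the clone was made.
git -C "$work/upstream" commit -q --allow-empty -m "second"
new_commit=$(git -C "$work/upstream" rev-parse HEAD)

# Fetch only: the object database, FETCH_HEAD and the remote-tracking
# name change, but the local branch does not move.
git -C "$work/clone" fetch -q origin
git -C "$work/clone" cat-file -t "$new_commit"   # object now exists locally
git -C "$work/clone" rev-parse FETCH_HEAD        # same ID as $new_commit
git -C "$work/clone" rev-parse '@{u}'            # remote-tracking name updated
git -C "$work/clone" rev-parse HEAD              # still the "first" commit
```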


    [1] Except, that is, to "unshallow" a shallow clone.


    The rest of git pull

    Running git fetch gets you objects, but does nothing to incorporate those objects into any of your work. If you wish to use the fetched commits or other data, you need a second step.

    The two main actions you can do here are git merge or git rebase. The best way to understand them is to read about them elsewhere (other SO postings, other documentation, and so on). Both are, however, complicated commands—and there is one special case for git pull that is not covered by those two: in particular, you can git pull into a non-existent branch. You have a non-existent branch (which Git also calls an orphan branch or an unborn branch) in two cases:

    • in a new, empty repository (that has no commits), or
    • after running git checkout --orphan newbranch

    In both cases, there is no current commit so there is nothing to rebase or merge. However, the index and/or work-tree are not necessarily empty! They are initially empty in a new, empty repository, but by the time you run git pull you could have created files and copied them into the index.

    This kind of git pull has traditionally been buggy, so be careful: versions of Git before 1.8-ish will sometimes destroy uncommitted work. I think it's best to avoid git pull entirely here: just run git fetch yourself, and then figure out what you want to do. As far as I know, it's OK in modern Git—these versions will not destroy your index and work-tree—but I am in the habit of avoiding git pull myself.

    In any case, even if you are not on an orphan/unborn/non-existent branch, it's not a great idea to try to run git merge with a dirty index and/or work-tree ("uncommitted work"). The git rebase command now has an automatic-stash option (rebase.autoStash), so you can have Git automatically run git stash save to create some off-branch commits out of any such uncommitted work. Then the rebase itself can run, after which Git can automatically apply and drop the stash.
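
A minimal sketch of the auto-stash behavior, assuming a Git recent enough for git pull to honor rebase.autoStash. The repositories, file names, and identity are invented for the example, and the setting is passed with -c so that no configuration is changed permanently.

```shell
work=$(mktemp -d)
git init -q "$work/upstream"
git -C "$work/upstream" config user.name "Example User"
git -C "$work/upstream" config user.email "user@example.invalid"
echo one > "$work/upstream/a.txt"
git -C "$work/upstream" add a.txt
git -C "$work/upstream" commit -q -m "add a.txt"

git clone -q "$work/upstream" "$work/clone"
git -C "$work/clone" config user.name "Example User"
git -C "$work/clone" config user.email "user@example.invalid"

# Upstream gains a commit; the clone gains an uncommitted edit.
echo two > "$work/upstream/b.txt"
git -C "$work/upstream" add b.txt
git -C "$work/upstream" commit -q -m "add b.txt"
echo "local edit" >> "$work/clone/a.txt"

# Stash the dirty work tree, bring in the upstream commit, re-apply the stash.
git -C "$work/clone" -c rebase.autoStash=true pull -q --rebase

git -C "$work/clone" status --short   # a.txt is still shown as modified
```

After the pull, b.txt from upstream is present and the uncommitted edit to a.txt is still in the work-tree.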

    Older versions of git merge do not have this automatic option (Git 2.27 and later accept git merge --autostash), but of course you can do it manually.

    Note that none of this works if you are in the middle of a conflicted merge. In this state, the index has extra entries: you cannot commit these until you resolve the conflicts, and you cannot even stash them (which follows naturally from the fact that git stash really makes commits). You can run git fetch, at any time, since that just adds new objects to the object database; but you cannot merge or rebase when the index is in this state.

  • 2020-12-10 00:29

    It obtains the latest changes from your remote repo for the branch that you are currently using.

  • 2020-12-10 00:38
    1. No:
      if you have local commits which you haven't pushed yet, or some indexed changes (added with git add), you will still have those local changes on top of the last public commit (or merged with the last public commit);

    2. Yes:
      if nothing was pushed to the remote repo since your last git pull, you are already up to date, so nothing will change;

    3. No:
      if you see changes in the index after a git pull, the files were already indexed before you ran git pull;

    4. git already will, with the following caveat: if one of your indexed files should be updated by the merge, git will not perform the merge, and will print a message:

      error: Your local changes to the following files would be overwritten 
      by merge:
          bb
      Please commit your changes or stash them before you merge.
      Aborting
      

      In that case, you should probably create a commit from your index, and run git merge origin/current/branch (or git rebase origin/current/branch) to combine the remote modifications with your local modifications.


    The default behavior of git fetch [origin] is to read all branches stored in the remote repo, and update all local refs stored under refs/remotes/[origin]/*.
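
That default comes from the refspec that git remote add writes into the repository configuration. A small sketch to inspect it; the temporary repository and the URL are placeholders, and nothing is contacted over the network.

```shell
repo=$(mktemp -d)
git init -q "$repo"
# Placeholder URL: "git remote add" only records it, it does not connect.
git -C "$repo" remote add origin https://example.invalid/project.git
refspec=$(git -C "$repo" config --get remote.origin.fetch)
echo "$refspec"   # prints: +refs/heads/*:refs/remotes/origin/*
```

The left side of the refspec names the branches on the remote, the right side the remote-tracking names they are copied to; the leading + allows non-fast-forward updates.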

    You can then use origin/branch/name as a valid tree-ish name in all standard git commands:

    # difference with remote "master" branch :
    $ git diff HEAD origin/master
    
    # history of remote branch "feature" alongside your local branch "feature" :
    $ git log --oneline --graph feature origin/feature
    
    # merge changes from remote "master" :
    $ git merge origin/master
    
    # rebase your local commits on top of remote "develop" branch :
    $ git rebase origin/develop
    
    # etc ...
    

    You also have a shortcut to say "the remote branch linked to my active branch": @{u}

    $ git diff @{u}
    $ git log --oneline --graph HEAD @{u}
    $ git merge @{u}
    $ git rebase @{u}
    # etc ...
    

    What does EXACTLY "git pull" do ?

    Right after a git fetch, git updates a special ref named FETCH_HEAD, which generally matches the @{u} of the active branch;
    git pull does git fetch && git merge FETCH_HEAD.

    I tried to explain git fetch in my own words in the paragraph above.
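
The two-step equivalent can be tried out on throwaway repositories. This is a sketch for the merge case only (not git pull --rebase); paths, identity, and commit messages are placeholders, and empty commits are used to keep it short.

```shell
work=$(mktemp -d)
git init -q "$work/upstream"
git -C "$work/upstream" config user.name "Example User"
git -C "$work/upstream" config user.email "user@example.invalid"
git -C "$work/upstream" commit -q --allow-empty -m "base"
git clone -q "$work/upstream" "$work/clone"
git -C "$work/clone" config user.name "Example User"
git -C "$work/clone" config user.email "user@example.invalid"

# History diverges: one new commit on each side.
git -C "$work/upstream" commit -q --allow-empty -m "upstream work"
git -C "$work/clone" commit -q --allow-empty -m "local work"
local_commit=$(git -C "$work/clone" rev-parse HEAD)
upstream_commit=$(git -C "$work/upstream" rev-parse HEAD)

# The two steps that git pull (in merge mode) performs for you.
git -C "$work/clone" fetch -q origin
git -C "$work/clone" merge -q --no-edit FETCH_HEAD

# The result is a merge commit with the local and the fetched commit as parents.
git -C "$work/clone" log --oneline --graph
```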

  • 2020-12-10 00:39
    1. But still, does it mean that after "git pull" my working tree will be identical to the remote repo?

    Not necessarily. Any local commits you have on the branch you're pulling will be merged with the changes upstream. Use git pull --rebase to put your local changes on top of the upstream commits. You can get some pretty funky merge paths without --rebase.

    2. I found some cases where doing "git pull" doesn't change anything in my local repo or create any new commit?

    If there are no new commits upstream, nothing will change in your local copy either.

    3. Does it make sense that "git pull" makes changes at the index only?

    Not that I know of. Perhaps if it fails to merge with your local commits, but then you should at least get some errors along the way.

    4. If it does, how can I make the changes at index move forward to the work tree?

    git pull :) Or git rebase <upstream> <branchname>. This will rebase the local commits in your branch <branchname> on top of the upstream commits in that branch.
