What are the differences between git clone --shared and --reference?

谎友^ 2020-12-13 12:59

After reading the documentation, I still don't really understand what the differences are between --shared and --reference. They seem

3 answers
  • 2020-12-13 13:39

    Both options write an entry into the clone's .git/objects/info/alternates file that points at another repository's object store. That can be dangerous, which is why the documentation attaches a warning note to both options.

    The --shared option does not copy the objects into the clone. This is the main difference.

    The --reference option takes an additional repository argument. A --reference clone still copies objects into the destination, but objects that are already available in the reference repository are taken from that existing repository instead of from the source. Passing --reference the path of a repository on a faster or local device can therefore reduce network time and I/O against the source repository.

    See for yourself

    Create a --shared clone and a --reference clone. Count the objects in each using git count-objects -v. You'll notice the shared clone has no objects, and the reference clone has the same number of objects as the source. Further, notice the size difference of each in your file system. If you were to move the source, and test git log in both shared and reference repositories, the log is unavailable in the shared clone, but works fine in the reference clone.
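
    The experiment described above can be sketched as a short script. This is only an illustration: the scratch directory, the file name and the commit identity are made up, and the script just prints what each clone contains rather than asserting a particular outcome, since the numbers can depend on how the source is addressed (local path versus URL).

```shell
# Sketch: compare a --shared clone and a --reference clone of one local source.
work=$(mktemp -d)   # scratch directory (hypothetical layout)

# Build a small source repository with a single commit.
git init -q "$work/src"
echo "hello" > "$work/src/file.txt"
git -C "$work/src" add file.txt
git -C "$work/src" -c user.name=Example -c user.email=example@example.invalid \
    commit -q -m "initial commit"

# One clone per option, both taken from the same local source.
git clone -q --shared "$work/src" "$work/shared"
git clone -q --reference "$work/src" "$work/src" "$work/ref"

for repo in src shared ref; do
    echo "== $repo =="
    # Object and pack statistics for this repository.
    git -C "$work/$repo" count-objects -v
    # The alternates file lists the object stores this repository borrows from.
    cat "$work/$repo/.git/objects/info/alternates" 2>/dev/null \
        || echo "(no alternates file)"
done
```

    Comparing the count-objects output and the alternates files side by side shows directly which clone borrows objects and which one holds its own copies.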

  • 2020-12-13 13:42

    The link in the comments to your question is really a clearer answer: --reference implies --shared. The point of --reference is to optimise network I/O during the initial clone of a remote repository.

    Contrary to the answer above, I find that the --shared and --reference repositories -- from the same source -- have the same size and both have zero objects. Of course, if you use --reference for some other repository which is based off a common source, the size and objects will reflect the difference between the repositories. Note that in both cases we are not saving space in the work tree, only the .git/objects.

    There is some nuance to maintaining this setup going forward - read the thread for more details. Essentially it sounds like the two should be treated as public repositories, with care around history re-writing in the presence of repacking/pruning/garbage collection.

    The workflow around maintaining an optimal disk-space usage after the initial clone seems to be:

    1. pull source
    2. repack source
    3. pull secondary
    4. git gc in secondary
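
    The four steps above could be wrapped in a small helper along these lines. The function name and its two path arguments are hypothetical, and the repack flags for step 2 are an assumption, since the list does not say which ones to use.

```shell
# Sketch of the four maintenance steps for a source/secondary pair.
# Both arguments are hypothetical paths supplied by the caller.
refresh_shared_pair() {
    source_repo=$1
    secondary_repo=$2

    git -C "$source_repo" pull -q &&            # 1. pull source
    git -C "$source_repo" repack -a -d -q &&    # 2. repack source (flags assumed)
    git -C "$secondary_repo" pull -q &&         # 3. pull secondary
    git -C "$secondary_repo" gc -q              # 4. git gc in secondary
}
```

    It would be called as refresh_shared_pair /path/to/source /path/to/secondary; the chain stops at the first step that fails.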

    Probably best to read the discussion in that thread though.

    You can add an alternate to an existing repository by putting the absolute path to the source's objects directory into secondary/.git/objects/info/alternates and running git gc (many people use git repack -a -d -l, which is done by git gc).
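
    As a rough sketch of that procedure, assuming both repositories are ordinary non-bare checkouts (so the object store lives under .git/objects), the helper below appends the absolute path and then runs git gc. The function name and arguments are hypothetical.

```shell
# Sketch: make secondary_repo borrow objects from source_repo.
# Assumes both are non-bare repositories; names are illustrative only.
add_alternate() {
    secondary_repo=$1
    source_repo=$2

    # Resolve the absolute path of the source's objects directory.
    source_objects=$(cd "$source_repo/.git/objects" && pwd) || return 1

    # Append one line per alternate; the file may already list others.
    mkdir -p "$secondary_repo/.git/objects/info"
    printf '%s\n' "$source_objects" \
        >> "$secondary_repo/.git/objects/info/alternates"

    # Repack the secondary so it can rely on the alternate.
    git -C "$secondary_repo" gc -q
}
```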

    You can remove an alternate by running git repack -a -d (no -l) in the secondary and then removing the line from the alternates file. As described in the thread, it is possible to have more than one alternate.
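
    The removal could be scripted roughly as follows. The function name and arguments are hypothetical; the second argument is the exact line to drop from the alternates file, so any other alternates are left in place.

```shell
# Sketch: stop secondary_repo from borrowing objects from one alternate.
remove_alternate() {
    secondary_repo=$1
    alternate_objects=$2   # exact line to drop from the alternates file
    alternates_file="$secondary_repo/.git/objects/info/alternates"

    # Pack every reachable object locally (no -l, so borrowed ones are included).
    git -C "$secondary_repo" repack -a -d -q || return 1

    # Keep every line except the matching one; grep exits non-zero
    # when nothing is left, which is fine here.
    grep -vxF -- "$alternate_objects" "$alternates_file" \
        > "$alternates_file.tmp" || true
    mv "$alternates_file.tmp" "$alternates_file"
}
```

    The repack has to succeed before the line is removed; otherwise the secondary would lose access to objects it has not yet copied.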

    I've not used this much myself, so I don't know how error-prone it is to manage.

  • 2020-12-13 13:49

    The link in the comments to your question is now dead.

    https://www.oreilly.com/library/view/git-pocket-guide/9781449327507/ch06.html has some great information on the subject. Here is some of what is there:

    First, we make a bare clone of the remote repository, to be shared locally as a reference repository (hence named “refrep”):
    $ git clone --bare http://foo/bar.git refrep

    Then, we clone the remote again, but this time giving refrep as a reference:
    $ git clone --reference refrep http://foo/bar.git

    The key difference between this and the --shared option is that you are still tracking the remote repository, not the refrep clone. When you pull, you still contact http://foo/, but you don’t need to wait for it to send any objects that are already stored locally in refrep; when you push, you are updating the branches and other refs of the foo repository directly.
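
    One way to see this tracking difference is to compare the origin URL of each kind of clone. The sketch below uses a local repository as a stand-in for http://foo/bar.git; every path in it is made up for illustration.

```shell
# Sketch: which repository does each clone track as "origin"?
work=$(mktemp -d)   # scratch directory (hypothetical layout)

# A local stand-in for the remote repository.
git init -q "$work/remote"
echo "hello" > "$work/remote/file.txt"
git -C "$work/remote" add file.txt
git -C "$work/remote" -c user.name=Example -c user.email=example@example.invalid \
    commit -q -m "initial commit"

# The bare reference repository, as in the quoted example.
git clone -q --bare "$work/remote" "$work/refrep"

# Clone the remote with refrep as a reference, and clone refrep itself with --shared.
git clone -q --reference "$work/refrep" "$work/remote" "$work/via-reference"
git clone -q --shared "$work/refrep" "$work/via-shared"

echo "via-reference tracks: $(git -C "$work/via-reference" remote get-url origin)"
echo "via-shared tracks:    $(git -C "$work/via-shared" remote get-url origin)"
```

    The first line should name the stand-in remote and the second should name refrep, matching the description above.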

    Of course, as soon as you and others start pushing new commits, the reference repository will become out of date, and you’ll start to lose some of the benefit. Periodically, you can run git fetch --all in refrep to pull in any new objects. A single reference repository can be a cache for the objects of any number of others; just add them as remotes in the reference:

    $ git remote add zeus http://olympus/zeus.git
    $ git fetch --all
