After reading the documentation, I still don't really understand what the differences are between --shared and --reference. They seem
Both options update .git/objects/info/alternates to point to the source repository, which can be dangerous; hence the warning note on both options in the documentation.
The --shared option does not copy the objects into the clone. This is the main difference.
The --reference option takes an additional repository parameter. Using --reference still copies the objects into the destination during the clone; however, you are specifying that objects be copied from an existing source when they are already available in the reference repository. Passing the path to a repository on a faster/local device with --reference can reduce network time and I/O from the source repository.
See for yourself
Create a --shared clone and a --reference clone. Count the objects in each using git count-objects -v. You'll notice the shared clone has no objects, and the reference clone has the same number of objects as the source. Further, notice the size difference of each in your file system. If you were to move the source and test git log in both the shared and reference repositories, the log is unavailable in the shared clone, but works fine in the reference clone.
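A minimal sketch of that experiment, assuming git is installed and using a throwaway local repository. The directory names (source, shared-clone, reference-clone) and the commit identity are made up for illustration. Results may vary with the git version and with how the source is addressed, so no expected output is shown:

```shell
#!/bin/sh
# Sketch: compare a --shared clone with a --reference clone of a local source.
set -e
work=$(mktemp -d)   # hypothetical scratch directory
cd "$work"

# Build a tiny source repository with a single commit.
git init -q source
echo hello > source/file.txt
git -C source add file.txt
git -C source -c user.name=Example -c user.email=example@example.invalid \
    commit -q -m "initial commit"

# One clone of each kind, both taken from the same source.
git clone -q --shared source shared-clone
git clone -q --reference source source reference-clone

# Report object counts and the alternates file for each repository.
for repo in source shared-clone reference-clone; do
    echo "== $repo =="
    git -C "$repo" count-objects -v
    cat "$repo/.git/objects/info/alternates" 2>/dev/null || echo "(no alternates file)"
done
```

Comparing the count and size-pack lines across the three repositories shows how many objects each clone stores itself.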
The link in the comments to your question is really a clearer answer: --reference implies --shared. The point of --reference is to optimise network I/O during the initial clone of a remote repository.
Contrary to the answer above, I find that the --shared and --reference repositories, when cloned from the same source, have the same size and both have zero objects. Of course, if you use --reference for some other repository which is based off a common source, the size and objects will reflect the difference between the repositories. Note that in both cases we are not saving space in the work tree, only in .git/objects.
There is some nuance to maintaining this setup going forward - read the thread for more details. Essentially it sounds like the two should be treated as public repositories, with care around history re-writing in the presence of repacking/pruning/garbage collection.
The workflow for maintaining optimal disk-space usage after the initial clone seems to be:
git gc in secondary
Probably best to read the discussion in that thread though.
You can add an alternate to an existing repository by putting the absolute path to the source's objects directory into secondary/.git/objects/info/alternates and running git gc (many people use git repack -a -d -l, which is done by git gc).
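As a rough sketch of adding an alternate, assuming a throwaway source repository and an independent clone of it called secondary (the scratch directory and commit identity are placeholders for illustration):

```shell
#!/bin/sh
# Sketch: attach an alternate to an already existing, self-contained clone.
set -e
work=$(mktemp -d)   # hypothetical scratch directory
cd "$work"

git init -q source
echo hello > source/file.txt
git -C source add file.txt
git -C source -c user.name=Example -c user.email=example@example.invalid \
    commit -q -m "initial commit"

# An ordinary clone with its own full copy of the objects.
git clone -q --no-hardlinks source secondary

# Record the absolute path of the source's objects directory as an alternate.
mkdir -p secondary/.git/objects/info
echo "$work/source/.git/objects" > secondary/.git/objects/info/alternates

# Let git drop local copies of objects it can now borrow from the alternate.
git -C secondary gc -q

git -C secondary count-objects -v
```

The alternate line in count-objects -v should list the source's objects directory once the file is in place.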
You can remove an alternate by running git repack -a -d (no -l) in the secondary and then removing the line from the alternates file. As described in the thread, it is possible to have more than one alternate.
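The removal steps could look roughly like this, assuming a secondary that was created with --shared and has exactly one alternate (the scratch directory, repository names and commit identity are invented for the example):

```shell
#!/bin/sh
# Sketch: detach a clone from its alternate so that it becomes self-contained.
set -e
work=$(mktemp -d)   # hypothetical scratch directory
cd "$work"

git init -q source
echo hello > source/file.txt
git -C source add file.txt
git -C source -c user.name=Example -c user.email=example@example.invalid \
    commit -q -m "initial commit"

git clone -q --shared source secondary
alt_file=secondary/.git/objects/info/alternates

# Copy every borrowed object into a local pack (note: no -l here).
git -C secondary repack -a -d -q

# Remove the alternate's line; any other lines in the file would be kept.
alt_line=$(head -n 1 "$alt_file")
grep -v -F -x "$alt_line" "$alt_file" > "$alt_file.tmp" || true
mv "$alt_file.tmp" "$alt_file"

# The clone should keep working even if the source goes away.
mv source source-moved
git -C secondary fsck
git -C secondary log --oneline
```

Doing the repack before editing the alternates file matters: if the line is removed first, the secondary loses access to the objects it has not yet copied.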
I've not used this much myself, so I don't know how error-prone it is to manage.
The link in the comments to your question is now dead.
https://www.oreilly.com/library/view/git-pocket-guide/9781449327507/ch06.html has some great information on the subject. Here is some of what is there:
first, we make a bare clone of the remote repository, to be shared locally as a reference repository (hence named “refrep”):
$ git clone --bare http://foo/bar.git refrep
Then, we clone the remote again, but this time giving refrep as a reference:
$ git clone --reference refrep http://foo/bar.git
The key difference between this and the --shared option is that you are still tracking the remote repository, not the refrep clone. When you pull, you still contact http://foo/, but you don’t need to wait for it to send any objects that are already stored locally in refrep; when you push, you are updating the branches and other refs of the foo repository directly.
Of course, as soon as you and others start pushing new commits, the reference repository will become out of date, and you’ll start to lose some of the benefit. Periodically, you can run git fetch --all in refrep to pull in any new objects. A single reference repository can be a cache for the objects of any number of others; just add them as remotes in the reference:
$ git remote add zeus http://olympus/zeus.git
$ git fetch --all
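To try the reference-repository idea without a network, here is a small sketch that uses a local repository named upstream as a stand-in for http://foo/bar.git. The names upstream and work-clone, the scratch directory and the commit identity are assumptions made for the example:

```shell
#!/bin/sh
# Sketch: a bare reference repository used as a local object cache.
set -e
work=$(mktemp -d)   # hypothetical scratch directory
cd "$work"

# Local stand-in for the remote repository.
git init -q upstream
echo hello > upstream/file.txt
git -C upstream add file.txt
git -C upstream -c user.name=Example -c user.email=example@example.invalid \
    commit -q -m "initial commit"

# Bare clone that will serve as the reference repository.
git clone -q --bare upstream refrep

# Clone the "remote" again, borrowing objects from refrep.
git clone -q --reference refrep upstream work-clone

# origin still points at upstream; only the objects are borrowed from refrep.
git -C work-clone config --get remote.origin.url
cat work-clone/.git/objects/info/alternates
```

The first printed line is the tracked remote and the second is the borrowed object store, which is the distinction the quoted passage draws between --reference and --shared.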