I'd like to merge a remote git repository into my working git repository as a subdirectory of it. I'd like the resulting repository to contain the merged history of the two.
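If you really want to stitch the histories together, look up grafting. You should also be using git rebase --preserve-merges --onto (Git 2.34 removed --preserve-merges; on newer versions use --rebase-merges instead). There is also an option, --committer-date-is-author-date, to keep the author date for the committer information.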
Say you want to merge repository a
into b
(I'm assuming they're located alongside one another):
cd a
git filter-repo --to-subdirectory-filter a
cd ..
cd b
git remote add a ../a
git fetch a
git merge --allow-unrelated-histories a/master
git remote remove a
For this you need git-filter-repo installed (filter-branch
is discouraged).
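The commands above rewrite repository a in place. A minimal sketch of the same steps wrapped in a helper that rewrites a throwaway clone instead, so the original a stays untouched. The function name merge_as_subdir, its arguments, and the temporary directory are hypothetical, introduced only for illustration; it assumes git-filter-repo is on the PATH and that a committer identity is configured. The clone uses --no-local because git filter-repo refuses by default to run on anything that does not look like a fresh clone.

```shell
#!/bin/sh
set -eu

# merge_as_subdir SRC DST NAME BRANCH   (hypothetical helper, not part of git)
#   SRC    path to the repository to import
#   DST    path to the repository that receives it
#   NAME   subdirectory (and temporary remote name) to use inside DST
#   BRANCH branch of SRC to import
merge_as_subdir() {
    src=$1 dst=$2 name=$3 branch=$4
    work=$(mktemp -d)

    # Rewrite a fresh clone rather than the original repository.
    git clone -q --no-local --branch "$branch" "$src" "$work/$name"
    git -C "$work/$name" filter-repo --to-subdirectory-filter "$name"

    # Same sequence as above: add remote, fetch, merge, remove remote.
    git -C "$dst" remote add "$name" "$work/$name"
    git -C "$dst" fetch -q "$name"
    git -C "$dst" merge --allow-unrelated-histories \
        -m "Merge $name into $name/" "$name/$branch"
    git -C "$dst" remote remove "$name"

    rm -rf "$work"
}

# Example call (assumes ./a and ./b exist side by side):
#   merge_as_subdir ./a ./b a master
```

The helper only defines the function; nothing is rewritten until it is called.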
An example of merging 2 big repositories, putting one of them into a subdirectory: https://gist.github.com/x-yuri/9890ab1079cf4357d6f269d073fd9731
More on it here.
I wanted git log -- file to work without --follow.
Step 1: Rewrite history in the source repository to make it look like all files always existed below the subdirectory.
Create a temporary branch for the rewritten history.
git checkout -b tmp_subdir
Then use git filter-branch
as described in How can I rewrite history so that all files, except the ones I already moved, are in a subdirectory?:
git filter-branch --prune-empty --tree-filter '
if [ ! -e foo/bar ]; then
mkdir -p foo/bar
git ls-tree --name-only $GIT_COMMIT | xargs -I files mv files foo/bar
fi'
Step 2: Switch to the target repository. Add the source repository as remote in the target repository and fetch its contents.
git remote add sourcerepo .../path/to/sourcerepo
git fetch sourcerepo
Step 3: Use rebase --onto
to add the commits of the rewritten source repository on top of the target repository.
# Git 2.34 removed --preserve-merges; on newer versions use --rebase-merges instead
git rebase --preserve-merges --onto master --root sourcerepo/tmp_subdir
You can check the log to see that this really got you what you wanted.
git log --stat
Step 4: After the rebase you’re in “detached HEAD” state. You can fast-forward master to the new head.
git checkout -b tmp_merged
git checkout master
git merge tmp_merged
git branch -d tmp_merged
Step 5: Finally some cleanup: Remove the temporary remote.
git remote rm sourcerepo
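To check that the five steps give the intended result, here is a sketch that runs them end to end on two throwaway repositories. The repository names, the file f.txt, the commit messages, and the demo identity are all made up for the example; foo/bar is the subdirectory used in Step 1. It assumes a reasonably recent Git and therefore uses --rebase-merges in place of --preserve-merges. The history here is linear, so the flag changes nothing in this run.

```shell
#!/bin/sh
set -eu
# Throwaway identity so commits work without global configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip filter-branch's warning pause

tmp=$(mktemp -d)
cd "$tmp"
for repo in sourcerepo targetrepo; do
    git init -q "$repo"
    git -C "$repo" symbolic-ref HEAD refs/heads/master
done

# Two commits in the source repository, one in the target.
cd sourcerepo
echo one > f.txt;  git add f.txt; git commit -qm "source: add f.txt"
echo two >> f.txt; git commit -qam "source: edit f.txt"
cd ../targetrepo
echo t > target.txt; git add target.txt; git commit -qm "target: initial"

# Step 1: rewrite the source history below foo/bar on a temporary branch.
cd ../sourcerepo
git checkout -q -b tmp_subdir
git filter-branch --prune-empty --tree-filter '
    if [ ! -e foo/bar ]; then
        mkdir -p foo/bar
        git ls-tree --name-only $GIT_COMMIT | xargs -I files mv files foo/bar
    fi' >/dev/null

# Step 2: add the source as a remote of the target and fetch it.
cd ../targetrepo
git remote add sourcerepo ../sourcerepo
git fetch -q sourcerepo

# Step 3: replay the rewritten commits on top of master.
git rebase -q --rebase-merges --onto master --root sourcerepo/tmp_subdir

# Step 4: fast-forward master to the rebased head.
git checkout -q -b tmp_merged
git checkout -q master
git merge -q tmp_merged
git branch -q -d tmp_merged

# Step 5: remove the temporary remote.
git remote rm sourcerepo

# The file's history is visible under its new path without --follow.
total=$(git rev-list --count HEAD)
plain=$(git log --oneline -- foo/bar/f.txt | wc -l | tr -d ' ')
echo "commits on master: $total, commits touching foo/bar/f.txt: $plain"
```

Because the commits are replayed rather than merged, the target history stays linear, and every source commit already has its files under foo/bar.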
Having gotten a fuller explanation of what is going on, I think I understand it, and in any case there is a workaround at the bottom. Specifically, I believe rename detection is being fooled by the subtree merge with --prefix. Here is my test case:
mkdir -p z/a z/b
cd z/a
git init
echo A>A
git add A
git commit -m A
echo AA>>A
git commit -a -m AA
cd ../b
git init
echo B>B
git add B
git commit -m B
echo BB>>B
git commit -a -m BB
cd ../a
git remote add -f B ../b
git merge -s ours --no-commit B/master
git read-tree --prefix=bdir -u B/master
git commit -m "subtree merge B into bdir"
cd bdir
echo BBB>>B
git commit -a -m BBB
We make git directories a and b with several commits each. We do a subtree merge, and then we do a final commit in the new subtree.
Running gitk
(in z/a) shows that the history does appear. Running git log
shows the same. However, looking at a specific file has a problem: git log bdir/B does not show the file's pre-merge history.
Well, there is a trick we can play. We can look at the pre-rename history of a specific file using --follow: git log --follow -- B
. This is good but not great, since it fails to link the pre-merge history with the post-merge history.
I tried playing with -M and -C, but I wasn't able to get it to follow one specific file.
So, the solution, I feel, is to tell git about the rename that will be taking place as part of the subtree merge. Unfortunately git-read-tree is pretty fussy about subtree merges so we have to work through a temporary directory, but that can go away before we commit. Afterwards, we can see the full history.
First, create an "A" repository and make some commits:
mkdir -p z/a z/b
cd z/a
git init
echo A>A
git add A
git commit -m A
echo AA>>A
git commit -a -m AA
Second, create a "B" repository and make some commits:
cd ../b
git init
echo B>B
git add B
git commit -m B
echo BB>>B
git commit -a -m BB
And the trick to making this work: force Git to recognize the rename by creating a subdirectory and moving the contents into it.
mkdir bdir
git mv B bdir
git commit -a -m bdir-rename
Return to repository "A" and fetch and merge the contents of "B":
cd ../a
git remote add -f B ../b
git merge -s ours --no-commit B/master
# According to Alex Brown and pjvandehaar, newer versions of git need --allow-unrelated-histories
# git merge -s ours --allow-unrelated-histories --no-commit B/master
git read-tree --prefix= -u B/master
git commit -m "subtree merge B into bdir"
To show that they're now merged:
cd bdir
echo BBB>>B
git commit -a -m BBB
To prove the full history is preserved in a connected chain:
git log --follow B
We get the history after doing this, but the problem is that if you are actually keeping the old "b" repo around and occasionally merging from it (say it is actually a third party separately maintained repo) you are in trouble since that third party will not have done the rename. You must try to merge new changes into your version of b with the rename and I fear that will not go smoothly. But if b is going away, you win.
git-subtree is a script designed for exactly this use case of merging multiple repositories into one while preserving history (and/or splitting history of subtrees, though that seems to be irrelevant to this question). It has been distributed as part of the git tree since release 1.7.11.
To merge a repository <repo>
at revision <rev>
as subdirectory <prefix>
, use git subtree add
as follows:
git subtree add -P <prefix> <repo> <rev>
git-subtree implements the subtree merge strategy in a more user-friendly manner.
The downside is that in the merged history the files are unprefixed (not in a subdirectory). Say you merge repository a
into b
. As a result git log a/f1
will show you all the changes (if any) except those in the merged history. You can do:
git log --follow -- f1
but that won't show any changes other than those in the merged history.
In other words, if you don't change a
's files in repository b
, then you need to specify --follow
and an unprefixed path. If you change them in both repositories, then you have 2 commands, neither of which shows all the changes.
More on it here.
I found the following solution workable for me. First I go into project B and create a new branch in which all files are already moved to the new subdirectory. I then push this new branch to origin. Next I go to project A, add and fetch the remote of B, check out the moved branch, go back to master, and merge:
# in local copy of project B
git checkout -b prepare_move
mkdir subdir
git mv <files_to_move> subdir/
git commit -m 'move files to subdir'
git push origin prepare_move
# in local copy of project A
git remote add -f B_origin <remote-url>
git checkout -b from_B B_origin/prepare_move
git checkout master
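# Git 2.9 and later refuse to merge unrelated histories by default; there, use:
# git merge --allow-unrelated-histories from_B
git merge from_B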
If I go to sub directory subdir
, I can use git log --follow
and still have the history.
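The claim about git log --follow can be checked on two throwaway repositories. The sketch below follows the steps above with a few assumptions of its own: the repositories are called A and B, the files a.txt and b.txt and the demo identity are made up, B is added as a remote by local path instead of being pushed to an origin, and the merge passes --allow-unrelated-histories as newer Git versions require.

```shell
#!/bin/sh
set -eu
# Throwaway identity so commits work without global configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

tmp=$(mktemp -d)
cd "$tmp"
for repo in A B; do
    git init -q "$repo"
    git -C "$repo" symbolic-ref HEAD refs/heads/master
done

# Project B: two commits, then a branch that moves everything into subdir/.
cd B
echo B > b.txt;   git add b.txt; git commit -qm "B"
echo BB >> b.txt; git commit -qam "BB"
git checkout -q -b prepare_move
mkdir subdir
git mv b.txt subdir/
git commit -qm "move files to subdir"

# Project A: one commit of its own, then fetch and merge the moved branch.
cd ../A
echo A > a.txt; git add a.txt; git commit -qm "A"
git remote add B_origin ../B
git fetch -q B_origin
git checkout -q -b from_B B_origin/prepare_move
git checkout -q master
git merge -q --allow-unrelated-histories -m "merge B into subdir" from_B

# Count the commits reported for the moved file, with and without --follow.
plain=$(git log --oneline -- subdir/b.txt | wc -l | tr -d ' ')
follow=$(git log --oneline --follow -- subdir/b.txt | wc -l | tr -d ' ')
echo "without --follow: $plain, with --follow: $follow"
```

Without --follow the log stops at the commit that moved the file. With --follow it continues through the rename into the file's earlier history in B.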
I'm not a git expert, so I cannot comment whether this is a particularly good solution or if it has caveats, but so far it seems all fine.