I have a big repository which currently contains multiple projects in top level subfolders, say /a, /b, /c, and /d.
Use
git filter-branch -f --prune-empty --tree-filter 'bash preserve-only.sh a b' -- --all
where preserve-only.sh is:
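# Join the arguments with ':' so that "$*" expands to e.g. "a:b".
IFS=':'
# Bash leaves names matching GLOBIGNORE out of glob results (and then also matches dotfiles).
GLOBIGNORE="$*"
rm -rf *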
This should remove everything but a and b from all commits of all branches, which should be the same as extracting exactly the given directories.
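To see what the script does on its own, outside of git, here is a minimal sketch that runs it against a throwaway directory. The `work` and `tree` names and the `.hidden` file are made up for illustration, and it assumes `bash` is installed, since `GLOBIGNORE` is a bash feature.

```shell
# Throwaway directory; all names here are illustrative only.
work=$(mktemp -d)
cd "$work" || exit 1

# The script from above, written out verbatim.
cat > preserve-only.sh <<'EOF'
IFS=':'
GLOBIGNORE="$*"
rm -rf *
EOF

# A fake checkout with four project folders and one dotfile.
mkdir tree && cd tree || exit 1
mkdir a b c d
touch .hidden

# Keep only a and b; must run under bash for GLOBIGNORE to take effect.
bash ../preserve-only.sh a b

# Whatever is left, including dotfiles.
left=$(ls -A | xargs)
echo "$left"
```

Note that the dotfile is removed as well, because setting `GLOBIGNORE` makes `*` match hidden names too.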
To complete the actual split, you can run the command again on a separate clone of the original repository with a filter like rm -rf a b to get all the changes not extracted in the first run.
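The second run could look roughly like the sketch below. It builds a scratch repository first so the command can be tried safely; the `scratch` variable and the file names are assumptions for illustration, and `FILTER_BRANCH_SQUELCH_WARNING` only silences the deprecation notice that newer Git versions print.

```shell
# Scratch repository for illustration only; paths and file names are hypothetical.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
git init -q .
git config user.name "Example"
git config user.email "example@example.com"
mkdir a b c d
for dir in a b c d; do echo "$dir" > "$dir/file.txt"; done
git add .
git commit -q -m "add all projects"

# Second half of the split: drop a and b, keep everything else.
FILTER_BRANCH_SQUELCH_WARNING=1 \
  git filter-branch -f --prune-empty --tree-filter 'rm -rf a b' -- --all

# Top-level entries left in the rewritten commit.
remaining=$(git ls-tree --name-only HEAD | xargs)
echo "$remaining"
```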
Update: While trying to speed things up using --index-filter, I came to an even easier solution:
git filter-branch -f --prune-empty --index-filter \
'git rm --cached -r -q -- . ; git reset -q $GIT_COMMIT -- a b' -- --all
This just removes everything from the index and then restores the given directories from the original commit.
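One way to check the result is to list every top-level name that still appears in any rewritten commit. The sketch below does that in a scratch repository (all names are made up for illustration). It walks `--branches` rather than `--all`, because `filter-branch` keeps backups of the old history under `refs/original/`, which `--all` would include.

```shell
# Throwaway repository; names below are illustrative only.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.name "Example"
git config user.email "example@example.com"
mkdir a b c
echo 1 > a/f; echo 1 > b/f; echo 1 > c/f
git add .
git commit -q -m "first"
echo 2 > c/f
git commit -q -am "touches only c"   # becomes empty after filtering

FILTER_BRANCH_SQUELCH_WARNING=1 \
  git filter-branch -f --prune-empty --index-filter \
  'git rm --cached -r -q -- . ; git reset -q $GIT_COMMIT -- a b' -- --all

# Every top-level name that still appears in any rewritten commit.
kept=$(git rev-list --branches | while read -r commit; do
  git ls-tree --name-only "$commit"
done | sort -u | xargs)

# Number of commits that survived --prune-empty.
commits=$(git rev-list --branches | wc -l | xargs)
echo "kept: $kept, commits: $commits"
```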
After searching around and trying the proposed solutions, it seems the recommended way of doing this is now git-filter-repo (see here):
git filter-repo --path a --path b
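For the other half of the split, `git filter-repo` has an `--invert-paths` option that keeps everything except the listed paths. The sketch below tries it in a scratch repository (names are made up for illustration) and skips the step if `git-filter-repo` is not found on the `PATH`, since it is a separate install.

```shell
# Scratch repository for illustration only; paths and file names are hypothetical.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
git init -q .
git config user.name "Example"
git config user.email "example@example.com"
mkdir a b c d
for dir in a b c d; do echo "$dir" > "$dir/file.txt"; done
git add .
git commit -q -m "add all projects"

if command -v git-filter-repo >/dev/null 2>&1; then
  # --force is needed here only because the scratch repository is not a fresh clone.
  git filter-repo --force --path a --path b --invert-paths
  outcome=$(git ls-tree --name-only HEAD | xargs)
else
  outcome="skipped"
fi
echo "$outcome"
```

On a real repository, running it in a fresh clone avoids the need for `--force`.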
I'm not aware of any better way than tree-filter for this. So you already have all the information you need. Now just do it!
Start by creating your two branches:
git branch br1
git branch br2
Now for each branch, check it out, then filter it using the tree-filter.
You could then split them out to separate directories either by pushing them out, or by cloning or pulling them in.
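Put together, the steps above might look like the following sketch in a scratch repository. The filters (`rm -rf c d` and `rm -rf a b`), the target path `ab.git`, and the branch name `main` in the target are assumptions chosen for illustration.

```shell
# Scratch repository for illustration only; all names are hypothetical.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.name "Example"
git config user.email "example@example.com"
mkdir a b c d
for dir in a b c d; do echo "$dir" > "$dir/file.txt"; done
git add .
git commit -q -m "add all projects"

git branch br1
git branch br2

export FILTER_BRANCH_SQUELCH_WARNING=1

# br1 keeps a and b.
git checkout -q br1
git filter-branch -f --prune-empty --tree-filter 'rm -rf c d' -- br1

# br2 keeps everything else.
git checkout -q br2
git filter-branch -f --prune-empty --tree-filter 'rm -rf a b' -- br2

# Push one of the branches out into its own bare repository.
target=$(mktemp -d)
git init -q --bare "$target/ab.git"
git push -q "$target/ab.git" br1:refs/heads/main

br1_tree=$(git ls-tree --name-only br1 | xargs)
br2_tree=$(git ls-tree --name-only br2 | xargs)
pushed_tree=$(git --git-dir="$target/ab.git" ls-tree --name-only main | xargs)
echo "br1: $br1_tree | br2: $br2_tree | pushed: $pushed_tree"
```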
I prefer this, which relies on the -I (ignore) option of GNU ls:
git filter-branch -f --prune-empty --tree-filter "ls -I a -I b | xargs rm -rf" -- --all