I'm looking for a way to set up git repositories that include subsets of files from a larger repository, and inherit the history from that main repository. My primary motivat
As I understand your question
git subtree
or git submodules
One way to extract the history of just a subset of the files into a dedicated branch (which you can then push into a dedicated repository) is to use git filter-branch:
# regex to match the files included in this subproject, used below
file_list_regex='^subproject1/|^shared_file1$|^lib/shared_lib2$'
git checkout -b subproject1 # create new branch from current HEAD
git filter-branch --prune-empty \
--index-filter "git ls-files --cached | grep -v -E '$file_list_regex' | xargs -r git rm --cached" \
HEAD
This will:
- create a new branch subproject1 based on the current HEAD (git checkout -b subproject1)
- rewrite the history of that branch (git filter-branch [...] HEAD)
- remove all files (xargs -r git rm --cached) that are not part of the subproject (git ls-files --cached | grep -v -E '$file_list_regex')
- drop commits that end up empty after the removal (--prune-empty)
- operate on the index only, without checking out the files of each commit (--index-filter/--cached).
This is a one-time operation, though, and as I understand your question you want to continuously update the extracted subproject repositories/branches with new commits.
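Before rewriting any history, it can help to check what the regex actually selects. The sketch below runs the same two grep calls on a made-up file list; the file names are hypothetical and only stand in for the output of git ls-files --cached.

```shell
# regex from the example above
file_list_regex='^subproject1/|^shared_file1$|^lib/shared_lib2$'

# hypothetical listing, standing in for `git ls-files --cached`
files='subproject1/main.c
subproject2/main.c
shared_file1
shared_file10
lib/shared_lib2
lib/other_lib'

# files that stay in the subproject branch
kept=$(printf '%s\n' "$files" | grep -E "$file_list_regex")
# files that the index filter would pass to `git rm --cached`
removed=$(printf '%s\n' "$files" | grep -v -E "$file_list_regex")

printf 'kept:\n%s\n\nremoved:\n%s\n' "$kept" "$removed"
```

Note that the trailing $ anchors matter: without them, ^shared_file1 would also keep shared_file10.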
The good news is that you can simply repeat this command, since git filter-branch will always produce the same commits/history for your subproject branches, provided that you don't manually alter them or rewrite your master branch.
The drawback is that this runs filter-branch over the complete history each time, and for each subproject, again and again.
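The repeatability claim can be checked in a throwaway repository. The following is only a sketch under assumptions: the repository layout, the branch names run1 and run2, and the extract helper are made up for the demonstration. It sets FILTER_BRANCH_SQUELCH_WARNING to skip the start-up warning of newer git versions and passes -f so that the second run may overwrite the refs/original backup left by the first.

```shell
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip filter-branch's warning and delay

# throwaway repository with two hypothetical subprojects
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git config user.name demo
git config user.email demo@example.com
mkdir subproject1 subproject2
echo a > subproject1/a
echo b > subproject2/b
git add . && git commit -q -m 'add both subprojects'
echo b2 >> subproject2/b && git commit -q -am 'touch subproject2 only'
echo a2 >> subproject1/a && git commit -q -am 'touch subproject1 only'
base=$(git rev-parse HEAD)

file_list_regex='^subproject1/'

# hypothetical helper: branch off the unfiltered history and apply the filter from above
extract() {
    git checkout -q -b "$1" "$base" &&
    git filter-branch -f --prune-empty \
        --index-filter "git ls-files --cached | grep -v -E '$file_list_regex' | xargs -r git rm --cached" \
        HEAD >/dev/null 2>&1
}

extract run1
extract run2

first=$(git rev-parse run1)
second=$(git rev-parse run2)
commit_count=$(git rev-list --count run1)

# both runs should end on the same commit id
[ "$first" = "$second" ] && echo "identical history, $commit_count commits"
```

The ids match because filter-branch keeps the author and committer information of the original commits, so the same input and the same filter yield the same rewritten commits.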
Assuming you only want to add the last 5 commits of the master branch to the tip of your existing subproject1 branch, you could adapt the commands like this:
# get the full commit ids for the commits we consider
# to be equivalent in master and subproject1 branch
common_base_commit="$(git rev-parse master~6)"
subproject_tip="$(git rev-parse subproject1)"
# checkout a detached HEAD so we don't change the master branch
git checkout --detach master
git filter-branch --prune-empty \
--index-filter "git ls-files --cached | grep -v -E '$file_list_regex' | xargs -r git rm --cached" \
--parent-filter "sed s/${common_base_commit}/${subproject_tip}/g" \
${common_base_commit}..HEAD
# force reset subproject1 branch to current HEAD
git branch -f subproject1
Explanation:
- Rewrite only the new commits (git filter-branch [...] ${common_base_commit}..HEAD) up to master~6, which we consider to be the equivalent commit to subproject1's current tip.
- Change the parent of the first rewritten commit from master~6 to subproject1 (--parent-filter "sed s/${common_base_commit}/${subproject_tip}/g"), effectively rebasing the rewritten commits on top of subproject1.
- Force-reset subproject1 to include the new commits on top of it.
Further optimization/automation:
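- Refine how you determine the files to include ($file_list_regex) or actually to exclude (git ls-files --cached | grep -v -E '$file_list_regex') from a given subproject.
- Make the file list depend on the commit being rewritten ($GIT_COMMIT), or check in the list to the repository itself in case the files to include per subproject may change over time.
- Wrap the steps in a script or alias so that an update becomes a single command, e.g. git update-project subproject1.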