Rewrite history git filter-branch create / split into submodules / subprojects

花落未央 2020-12-06 03:54

I am currently importing a CVS project into Git.
After importing, I want to rewrite the history to move an existing directory into a separate submodule.

Suppose

4 answers
  • 2020-12-06 04:00

    I have a project with a utils library that has started to be useful in other projects, and I wanted to split its history off into a submodule. I didn't think to look on SO first, so I wrote my own. It builds the history locally, so it's a good bit faster. Afterwards, if you want, you can set up the git submodule helper command's .gitmodules file and such, and push the submodule histories themselves anywhere you want.

    The stripped command is here; the documentation is in the comments of the unstripped version that follows. Run it as its own command with subdir set, e.g. subdir=utils git split-submodule if you're splitting the utils directory. It's hacky because it's a one-off, but I tested it on the Documentation subdirectory in the Git history.

    #!/bin/bash
    # put this or the commented version below in e.g. ~/bin/git-split-submodule
    ${GIT_COMMIT-exec git filter-branch --index-filter "subdir=$subdir; ${debug+debug=$debug;} $(sed 1,/SNIP/d "$0")" "$@"}
    ${debug+set -x}
    fam=(`git rev-list --no-walk --parents $GIT_COMMIT`)
    pathcheck=(`printf "%s:$subdir\\n" ${fam[@]} \
        | git cat-file --batch-check='%(objectname)' | uniq`)
    [[ $pathcheck = *:* ]] || {
        subfam=($( set -- ${fam[@]}; shift;
            for par; do tpar=`map $par`; [[ $tpar != $par ]] &&
                git rev-parse -q --verify $tpar:"$subdir"
            done
        ))
        git rm -rq --cached --ignore-unmatch  "$subdir"
        if (( ${#pathcheck[@]} == 1 && ${#fam[@]} > 1 && ${#subfam[@]} > 0)); then
            git update-index --add --cacheinfo 160000,$subfam,"$subdir"
        else
            subnew=`git cat-file -p $GIT_COMMIT | sed 1,/^$/d \
                | git commit-tree $GIT_COMMIT:"$subdir" $(
                    ${subfam:+printf ' -p %s' ${subfam[@]}}) 2>&-
                ` &&
            git update-index --add --cacheinfo 160000,$subnew,"$subdir"
        fi
    }
    ${debug+set +x}
    
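    For readers puzzling over the sed 1,/SNIP/d "$0" part: as far as I can tell, the script hands its own text to filter-branch, and the sed range drops everything from line 1 through the first line containing the marker word, which is the launcher line itself. The sketch below shows that range behaviour in isolation. The temporary file and the marker MARK are made up for the demo and are not part of the original script.

```shell
#!/bin/bash
# Hypothetical demo: "MARK" plays the role of the marker word in the real script.
set -eu
script=$(mktemp)
cat > "$script" <<'EOF'
#!/bin/bash
# header comment
echo "launcher line mentioning MARK"
echo "body line 1"
echo "body line 2"
EOF

# Delete from line 1 through the first line that contains MARK;
# what remains is the part that would be reused as the filter text.
body=$(sed '1,/MARK/d' "$script")
printf '%s\n' "$body"
rm -f "$script"
```

    One consequence, if this reading is right: a comment containing the marker word placed above the launcher line would end the range early and would likely break the script.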

    #!/bin/bash
    # Git filter-branch to split a subdirectory into a submodule history.
    
    # In each commit, the subdirectory tree is replaced in the index with an
    # appropriate submodule commit.
    # * If the subdirectory tree has changed from any parent, or there are
    #   no parents, a new submodule commit is made for the subdirectory (with
    #   the current commit's message, which should presumably say something
    #   about the change). The new submodule commit's parents are the
    #   submodule commits in any rewrites of the current commit's parents.
    # * Otherwise, the submodule commit is copied from a parent.
    
    # Since the new history includes references to the new submodule
    # history, the new submodule history isn't dangling, it's incorporated.
    # Branches for any part of it can be made casually and pushed into any
    # other repo as desired, so hooking up the `git submodule` helper
    # command's conveniences is easy, e.g.
    #     subdir=utils git split-submodule master
    #     git branch utils $(git rev-parse master:utils)
    #     git clone -sb utils . ../utilsrepo
    # and you can then submodule add from there in other repos, but really,
    # for small utility libraries and such, just fetching the submodule
    # histories into your own repo is easiest. Setup on cloning a
    # project using "incorporated" submodules like this is:
    #   setup:  utils/.git
    #
    #   utils/.git:
    #       @if _=`git rev-parse -q --verify utils`; then \
    #           git config submodule.utils.active true \
    #           && git config submodule.utils.url "`pwd -P`" \
    #           && git clone -s . utils -nb utils \
    #           && git submodule absorbgitdirs utils \
    #           && git -C utils checkout $$(git rev-parse :utils); \
    #       fi
    # with `git config -f .gitmodules submodule.utils.path utils` and
    # `git config -f .gitmodules submodule.utils.url ./`; cloners don't
    # have to do anything but `make setup`, and `setup` should be a prereq
    # on most things anyway.
    
    # You can test that a commit and its rewrite put the same tree in the
    # same place with this function:
    # testit ()
    # {
    #     tree=($(git rev-parse `git rev-parse $1`: refs/original/refs/heads/$1));
    #     echo $tree `test $tree != ${tree[1]} && echo ${tree[1]}`
    # }
    # so e.g. `testit make~95^2:t` will print the `t` tree there and if
    # the `t` tree at ~95^2 from the original differs it'll print that too.
    
    # To run it, say `subdir=path/to/it git split-submodule` with whatever
    # filter-branch args you want.
    
    # $GIT_COMMIT is set if we're already in filter-branch, if not, get there:
    ${GIT_COMMIT-exec git filter-branch --index-filter "subdir=$subdir; ${debug+debug=$debug;} $(sed 1,/SNIP/d "$0")" "$@"}
    
    ${debug+set -x}
    fam=(`git rev-list --no-walk --parents $GIT_COMMIT`)
    pathcheck=(`printf "%s:$subdir\\n" ${fam[@]} \
        | git cat-file --batch-check='%(objectname)' | uniq`)
    
    [[ $pathcheck = *:* ]] || {
        subfam=($( set -- ${fam[@]}; shift;
            for par; do tpar=`map $par`; [[ $tpar != $par ]] &&
                git rev-parse -q --verify $tpar:"$subdir"
            done
        ))
    
        git rm -rq --cached --ignore-unmatch  "$subdir"
        if (( ${#pathcheck[@]} == 1 && ${#fam[@]} > 1 && ${#subfam[@]} > 0)); then
            # one id same for all entries, copy mapped mom's submod commit
            git update-index --add --cacheinfo 160000,$subfam,"$subdir"
        else
            # no mapped parents or something changed somewhere, make new
            # submod commit for current subdir content.  The new submod
            # commit has all mapped parents' submodule commits as parents:
            subnew=`git cat-file -p $GIT_COMMIT | sed 1,/^$/d \
                | git commit-tree $GIT_COMMIT:"$subdir" $(
                    ${subfam:+printf ' -p %s' ${subfam[@]}}) 2>&-
                ` &&
            git update-index --add --cacheinfo 160000,$subnew,"$subdir"
        fi
    }
    ${debug+set +x}
    
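    The core move described in the comments, replacing a directory in the index with a submodule commit, can be tried in a throwaway repository. This is only a sketch under assumed names (the scratch directory, utils/lib.txt and the demo identity are all invented here); it uses the same git update-index --cacheinfo 160000 form as the script.

```shell
#!/bin/bash
# Hypothetical scratch-repo demo: swap a directory for a gitlink (mode 160000).
set -eu
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email demo@example.com   # placeholder identity
git config user.name "Demo User"
mkdir utils
echo 'hello' > utils/lib.txt
git add utils
git commit -q -m 'add utils'

# Make a standalone commit whose root tree is the utils/ tree.
subtree=$(git rev-parse HEAD:utils)
subcommit=$(echo 'utils: initial import' | git commit-tree "$subtree")

# Replace the directory in the index with a gitlink to that commit.
git rm -rq --cached --ignore-unmatch utils
git update-index --add --cacheinfo 160000,"$subcommit",utils

# First field of the index entry is its mode.
entry_mode=$(git ls-files -s utils | cut -d' ' -f1)
echo "$entry_mode"
```

    Because the gitlink points at a commit stored in the same object database, the submodule history stays reachable from the parent history, which seems to be what the comments mean by "incorporated".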
  • 2020-12-06 04:00

    Note: the submodule entry is only created when you run the following from the parent repo:

    git submodule init
    git submodule update
    

    You don't need those commands in your rewrite-submodule-tree-filter script, since its only job is to set the .gitmodules file content correctly.

    You would execute those "git submodule" commands only when you are using the parent repo for the first time: see "Cloning a Project with Submodules".
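
    As a rough illustration of what "setting the .gitmodules content" amounts to, the sketch below writes the two keys of a submodule entry with git config -f. The submodule name utils and the URL ./ are borrowed from the first answer's example and are assumptions here, as is the scratch directory.

```shell
#!/bin/bash
# Hypothetical demo: build a .gitmodules file in an empty scratch directory.
set -eu
work=$(mktemp -d)
cd "$work"

# A submodule entry needs a path and a url under submodule.<name>.
git config -f .gitmodules submodule.utils.path utils
git config -f .gitmodules submodule.utils.url ./

cat .gitmodules
path=$(git config -f .gitmodules --get submodule.utils.path)
url=$(git config -f .gitmodules --get submodule.utils.url)
```

    git submodule init then copies the url from .gitmodules into the clone's own configuration, which is why it only has to be run once per clone.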

  • 2020-12-06 04:05

    I resolved my own question; here is the solution:

    git-submodule-split library another_library

    Script git-submodule-split:

        #!/bin/bash
    
        set -eu
    
        if [ $# -eq 0 ]
        then
            echo "Usage: $0 submodules-to-split" >&2
            exit 1
        fi
    
        export _tmp=$(mktemp -d)
        export _libs="$@"
        for i in $_libs
        do
            mkdir -p $_tmp/$i
        done
    
        git filter-branch --commit-filter '
        function gitCommit()
        {
            git add -A
            if [ -n "$(git diff --cached --name-only)" ]
            then
                git commit -F $_msg
            fi
        } >/dev/null
    
        # from git-filter-branch
        git checkout-index -f -u -a || die "Could not checkout the index"
        # files that $commit removed are now still in the working tree;
        # remove them, else they would be added again
        git clean -d -q -f -x
    
        _git_dir=$GIT_DIR
        _git_work_tree=$GIT_WORK_TREE
        _git_index_file=$GIT_INDEX_FILE
        unset GIT_DIR
        unset GIT_WORK_TREE
        unset GIT_INDEX_FILE
    
        _msg=$(mktemp)  # tempfile(1) is Debian-specific and deprecated
        cat /dev/stdin > $_msg
        for i in $_libs
        do
            if [ -d "$i" ]
            then
                unset GIT_DIR
                unset GIT_WORK_TREE
                unset GIT_INDEX_FILE
                cd $i
                if [ -d ".git" ]
                then
                    gitCommit
                else
                    git init >/dev/null
                    gitCommit
                fi
                cd ..
                rsync -a -rtu $i/.git/ $_tmp/$i/.git/
                export GIT_DIR=$_git_dir
                export GIT_WORK_TREE=$_git_work_tree
                export GIT_INDEX_FILE=$_git_index_file
                git rm -q -r --cached $i
                git submodule add ./$i >/dev/null
                git add $i
            fi
        done
        export GIT_DIR=$_git_dir
        export GIT_WORK_TREE=$_git_work_tree
        export GIT_INDEX_FILE=$_git_index_file
    
        if [ -f ".gitmodules" ]
        then
            git add .gitmodules
        fi
    
        _new_rev=$(git write-tree)
        shift
        # stdin was already drained into $_msg, so pass the message with -F
        git commit-tree -F "$_msg" "$_new_rev" "$@"
        rm -f "$_msg"
        ' --tag-name-filter cat -- --all
    
        for i in $_libs
        do
            if [ -d "$_tmp/$i/.git" ]
            then
                rsync -a -i -rtu $_tmp/$i/.git/ $i/.git/
                cd $i
                git reset --hard
                cd ..
            fi
        done
        rm -r $_tmp
    
        git for-each-ref refs/original --format="%(refname)" | while read i; do git update-ref -d $i; done
    
        git reflog expire --expire=now --all
        git gc --aggressive --prune=now
    
        
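    A commit filter receives the commit message on standard input, and the filter above saves it to a temporary file first. The sketch below, run in a made-up scratch repository with an invented identity and message, shows the two ways git commit-tree can pick the message up: from standard input, or from a saved file via -F once standard input has been consumed.

```shell
#!/bin/bash
# Hypothetical demo: commit message via stdin versus via -F <file>.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email demo@example.com   # placeholder identity
git config user.name "Demo User"
echo data > file.txt
git add file.txt
tree=$(git write-tree)

msg=$(mktemp)
printf 'original message\n' > "$msg"

# Message on standard input, as the commit filter receives it.
from_stdin=$(git commit-tree "$tree" < "$msg")
# Same message read from the saved file instead.
from_file=$(git commit-tree -F "$msg" "$tree")

subject_stdin=$(git log -1 --format=%s "$from_stdin")
subject_file=$(git log -1 --format=%s "$from_file")
echo "$subject_stdin / $subject_file"
rm -f "$msg"
```

    Whatever the filter prints on standard output is taken as the new commit id, so any other output inside the filter has to go to standard error or /dev/null.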
  • 2020-12-06 04:05

    Here is an updated answer that works for me on Mac OS X. The major change is the use of pushd/popd to change directories, so that a submodule can be a nested path such as module/glop and not just glop.

    #!/bin/bash
    
    set -eu
    
    if [ $# -eq 0 ]
    then
        echo "Usage: $0 submodules-to-split" >&2
        exit 1
    fi
    
    export _tmp=$(mktemp -d /tmp/git-submodule-split.XXXXXX)
    export _libs="$@"
    for i in $_libs
    do
        mkdir -p $_tmp/$i
    done
    
    git filter-branch --commit-filter '
    function gitCommit()
    {
        git add -A
        if [ -n "$(git diff --cached --name-only)" ]
        then
            git commit -F $_msg
        fi
    } >/dev/null
    
    # from git-filter-branch
    git checkout-index -f -u -a || die "Could not checkout the index"
    # files that $commit removed are now still in the working tree;
    # remove them, else they would be added again
    git clean -d -q -f -x >&2
    
    _git_dir=$GIT_DIR
    _git_work_tree=$GIT_WORK_TREE
    _git_index_file=$GIT_INDEX_FILE
    unset GIT_DIR
    unset GIT_WORK_TREE
    unset GIT_INDEX_FILE
    
    _msg=$(mktemp /tmp/git-submodule-split-msg.XXXXXX)
    cat /dev/stdin > $_msg
    for i in $_libs
    do
        if [ -d "$i" ]
        then
            unset GIT_DIR
            unset GIT_WORK_TREE
            unset GIT_INDEX_FILE
            pushd $i > /dev/null
            if [ -d ".git" ]
            then
                gitCommit
            else
                git init >/dev/null
                gitCommit
            fi
            popd > /dev/null
            mkdir -p $_tmp/$i
            rsync -a -rtu $i/.git/ $_tmp/$i/.git/
            export GIT_DIR=$_git_dir
            export GIT_WORK_TREE=$_git_work_tree
            export GIT_INDEX_FILE=$_git_index_file
            git rm -q -r --cached $i >&2
            git submodule add ./$i $i >&2
            git add $i >&2
        fi
    done
    export GIT_DIR=$_git_dir
    export GIT_WORK_TREE=$_git_work_tree
    export GIT_INDEX_FILE=$_git_index_file
    
    if [ -f ".gitmodules" ]
    then
        git add .gitmodules >&2
    fi
    
    _new_rev=$(git write-tree)
    shift
    git commit-tree -F "$_msg" "$_new_rev" "$@"
    rm -f "$_msg"
    ' --tag-name-filter cat -- --all
    
    for i in $_libs
    do
        if [ -d "$_tmp/$i/.git" ]
        then
            rsync -a -i -rtu $_tmp/$i/.git/ $i/.git/
            pushd $i
            git reset --hard
            popd
        fi
    done
    rm -rf $_tmp
    
    git for-each-ref refs/original --format="%(refname)" | while read i; do git update-ref -d $i; done
    
    git reflog expire --expire=now --all
    git gc --aggressive --prune=now
    
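    The pushd/popd change this answer describes can be seen in isolation. This is a minimal sketch in a made-up scratch directory; the nested path module/glop is the one the answer mentions. It compares cd followed by cd .. with pushd followed by popd.

```shell
#!/bin/bash
# Hypothetical demo: returning from a nested directory (needs bash for pushd/popd).
set -eu
top=$(mktemp -d)
cd "$top"
mkdir -p module/glop

# cd into the nested path, then "cd ..": we land in module/, not at the top.
cd module/glop
cd ..
after_cd=$(basename "$PWD")
cd "$top"

# pushd/popd restores the starting directory regardless of depth.
pushd module/glop > /dev/null
popd > /dev/null
after_popd=$PWD
echo "$after_cd"
```

    With a single-level name such as glop the two approaches behave the same, which is presumably why the earlier script worked for top-level directories.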