Remove history for everything except a list of files using git filter-branch

慢半拍i 2021-01-06 05:45

I'm trying to move some files between two git repositories, repo1 and repo2. I have a short list of files I'd like to move (preserving history).

3 answers
  • 2021-01-06 06:11

    You said it leaves behind folders; I assume you mean it leaves behind files in those folders (because git doesn't preserve empty folders)...

    It seems like you might want to take the approach of clearing the index and then re-adding the entries you want.

    git filter-branch ...
        --index-filter 'git rm -r -q --cached --ignore-unmatch -- . && git reset -q $GIT_COMMIT -- libraryname/file1 libraryname/file2 tests/libraryname/file3'
        ...
    

    Since you're thinning out the content so much, you will probably also want to pass --prune-empty, so that commits left with no changes after filtering are dropped.
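
    As a hedged illustration of the clear-and-re-add approach, the sketch below builds a throwaway repository and filters it down to a single file. The repository layout (libraryname/file1, other/junk), the commit messages, and the demo identity are made up for the example. It runs the filter quietly (-q), tolerates an already empty index (--ignore-unmatch), and sets FILTER_BRANCH_SQUELCH_WARNING to silence the deprecation notice that newer Git versions print.

```shell
#!/bin/sh
# Throwaway demo: every path and name below is hypothetical.
set -e
demo="$(mktemp -d)"
cd "$demo"
git init -q .
git config user.email demo@example.com
git config user.name demo

mkdir -p libraryname other
echo a > libraryname/file1
echo b > other/junk
git add .
git commit -q -m 'add file1 and junk'
echo c >> other/junk
git commit -q -am 'touch only junk'   # has no remaining changes after filtering

# Silence the deprecation warning shown by Git 2.24 and later.
export FILTER_BRANCH_SQUELCH_WARNING=1

# For each commit: empty the index, then restore only the wanted path
# from that same commit. --prune-empty drops commits left without changes.
git filter-branch --prune-empty \
    --index-filter 'git rm -r -q --cached --ignore-unmatch -- . && git reset -q $GIT_COMMIT -- libraryname/file1' \
    -- --all

kept_files="$(git ls-files)"
commit_count="$(git rev-list --count HEAD)"
printf '%s\n' "$kept_files" "$commit_count"
```

    After the run, git ls-files lists only libraryname/file1, and the commit that touched only other/junk is gone because --prune-empty dropped it.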

  • 2021-01-06 06:25

    With Git 2.24 (Q4 2019), git filter-branch is deprecated.

    The equivalent uses newren/git-filter-repo. Its examples section says:

    If you have a long list of files, directories, globs, or regular expressions to filter on, you can stick them in a file and use --paths-from-file; for example, with a file named stuff-i-want.txt with contents of

    README.md
    guides/
    tools/releases
    glob:*.py
    regex:^.*/.*/[0-9]{4}-[0-9]{2}-[0-9]{2}.txt$
    tools/==>scripts/
    regex:(.*)/([^/]*)/([^/]*)\.text$==>\2/\1/\3.txt
    

    then you could run

    git filter-repo --paths-from-file stuff-i-want.txt
    

    In your case, stuff-i-want.txt would be:

    libraryname/file1
    libraryname/file2
    tests/libraryname/file3
    

    As kubanczyk points out in the comments:

    Works well on Ubuntu 20.04, you can just pip3 install git-filter-repo since it's stdlib-only and doesn't install any dependencies.

    On Ubuntu 18 it's incompatible with the distro's git version, but it's easy enough to run it inside docker run -ti ubuntu:20.04

  • 2021-01-06 06:28

    Here is a whitelist-based approach which might be faster (because it only needs to compare whole lines of pre-sorted lists) and easier if a large number of files is involved.

    1. Create a sorted list of all files in all commits of your branch:

      $ export LC_COLLATE=C whitelist="$(mktemp)" && git log --name-status | sed 's/^[A-Z][[:space:]]\{1,\}//; t; d' | sort -u > "$whitelist"

    2. Edit that list with your favorite text editor and remove every file you do not want to keep, i.e. turn it into a whitelist of files to keep.

      $ "$EDITOR" -- "$whitelist" # remove from list what you don't want to keep

    3. Perform the actual filter operation:

      $ git filter-branch -f --index-filter 'git ls-files -c | sort | comm -23 -- - "$whitelist" | while IFS= read -r f; do git rm --cached -- "$f"; done' --prune-empty

    4. Remove the whitelist once the filter operation has completed without problems.

      $ rm -- "$whitelist" && unset LC_COLLATE whitelist
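
    The core of step 3 is a set difference between two sorted lists. The sketch below reproduces just that part without touching a repository, so the effect of comm -23 is easy to inspect. The file names and the index_listing function (a stand-in for git ls-files -c) are hypothetical, and LC_ALL=C is used here instead of LC_COLLATE=C only because LC_ALL, when set in the environment, takes precedence over LC_COLLATE.

```shell
#!/bin/sh
# Illustration only: the file names below are hypothetical.
set -e
export LC_ALL=C   # sort and comm must agree on one collation order

whitelist="$(mktemp)"
# The whitelist must already be sorted, as `sort -u` guarantees in step 1.
printf '%s\n' libraryname/file1 tests/libraryname/file3 > "$whitelist"

# Stand-in for `git ls-files -c`: the index contents of one commit.
index_listing() {
    printf '%s\n' tests/libraryname/file3 other/junk libraryname/file1 README.md
}

# comm -23 keeps lines that appear only in the first input, i.e. tracked
# files missing from the whitelist. These are the removal candidates.
to_remove="$(index_listing | sort | comm -23 -- - "$whitelist")"
printf '%s\n' "$to_remove"

rm -- "$whitelist"
```

    The printed names are the ones the while loop in step 3 would hand to git rm --cached; whitelisted files never reach it.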
