A file was split into two other files; is there a way in Git to see what went where?

清酒与你 2021-02-13 19:19

My problem:

I am a code reviewer, and I have the following situation in Git:

  • before: a.txt

Then a developer decided to split the content of

1 answer
  •  时光说笑
    2021-02-13 19:26

    Is there an easy way to see:

    • what came to b from a?
    • what came to c from a?
    • all extra changes apart from just moving stuff?

    I don't think there's really any way to extract this information other than visually inspecting the diff. However, it looks like we may be able to detect a split file using git diff along with the -C option. For example, I start with a file that contains 38 lines, move 24 of them into one file and 14 into another, and delete the original. git diff --name-status just tells me that I have renamed one file and added another:

    R060    lorem.txt       fileA
    A       fileB
    

    But if we modify our command line to detect copies:

    git diff --name-status -C30 HEAD^
    

    We get:

    C060    lorem.txt       fileA
    R039    lorem.txt       fileB
    

    The -C30 argument says "consider a file a copy if it is at least 30% similar to another file included in the commit". Note that there is a corresponding -M option that controls rename detection; it defaults to 50%.
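
    To try this on a throwaway repository, something like the following sketch should reproduce the situation. The line contents, the demo identity passed to git commit, and the temporary directory are all made up for illustration, and the exact similarity scores git prints will depend on the file contents.

```shell
#!/bin/sh
set -e

# Scratch repository in a temporary directory, so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Helper: commit with a throwaway identity (illustrative values only).
git_commit() {
    git -c user.name=demo -c user.email=demo@example.com commit -q -m "$1"
}

# A 38-line file with distinct lines, committed as lorem.txt.
for i in $(seq 1 38); do
    printf 'line %02d lorem ipsum dolor sit amet\n' "$i"
done > lorem.txt
git add lorem.txt
git_commit "add lorem.txt"

# Split it: 24 lines go to fileA, 14 lines to fileB, the original is deleted.
head -n 24 lorem.txt > fileA
tail -n 14 lorem.txt > fileB
git rm -q lorem.txt
git add fileA fileB
git_commit "split lorem.txt"

# Compare the default listing with the copy-detecting one.
default_view=$(git diff --name-status HEAD^)
copy_view=$(git diff --name-status -C30 HEAD^)
printf 'default:\n%s\n\nwith -C30:\n%s\n' "$default_view" "$copy_view"
```

    With default settings the first listing would normally show one rename and one addition, while the -C30 listing should name lorem.txt as the source of both new files.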

    A certain policy/workflow that prevents problems like this would also help.

    What exactly are you trying to prevent? There's not really any way to distinguish "I split a file into two new files" from "I deleted a file and created two new files".

    You could in theory prevent commits that both introduce new files and modify existing files. That would be relatively easy with a pre-receive hook, for example. But that's such a common situation, I'm not sure you'd want to do this in practice.

    For the above, a pre-receive hook like the following might work:

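    #!/bin/bash

    # pre-receive: stdin carries one "<old> <new> <ref>" line per pushed ref.
    while read -r old new ref; do
            # Note: only the tip commit ($new) of each ref is inspected.
            while read -r type name; do
                    if [ "$type" = "A" ]; then
                            has_new=1
                    else
                            has_mod=1
                    fi
            done < <(git show --name-status --format='' "$new")
    done

    # Reject the push if files were both added and changed.
    if [ "$has_new" = 1 ] && [ "$has_mod" = 1 ]; then
            echo "ERROR: commits may not both create and modify files" >&2
            exit 1
    fi

    exit 0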
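
    The hook above looks only at the tip commit of each pushed ref, so a push containing several commits could slip a mixed commit through. A possible variant that walks every pushed commit is sketched below. The function names are invented here, and the handling of brand-new refs via git rev-list "$new" --not --all is an assumption about what you would want, not something taken from the hook above.

```shell
#!/bin/sh
# Succeeds (exit 0) when commit $1 both adds files and touches existing ones.
commit_mixes_add_and_change() {
    git show --name-status --format='' "$1" |
        awk '
            $1 == "A" {added = 1}
            $1 != "" && $1 != "A" {changed = 1}
            END {exit !(added && changed)}
        '
}

# True when the argument consists only of zeros (git's "no object" id).
is_null_id() {
    case "$1" in
        *[!0]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Reads "<old> <new> <ref>" lines on stdin, as a pre-receive hook would.
check_push() {
    while read -r old new ref; do
        is_null_id "$new" && continue      # ref deletion: nothing to check
        if is_null_id "$old"; then
            range="$new --not --all"       # new ref: commits not reachable yet
        else
            range="$old..$new"
        fi
        # $range is left unquoted on purpose so it splits into arguments.
        for sha in $(git rev-list $range); do
            if commit_mixes_add_and_change "$sha"; then
                echo "ERROR: commit $sha both creates and modifies files" >&2
                return 1
            fi
        done
    done
    return 0
}

# In the hook itself the last line would be:  check_push || exit 1
```

    Merge commits are not treated specially in this sketch, so it would need testing before real use.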
    

    We could alternatively use our "split detection", discussed earlier, and implement something like:

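    #!/bin/bash

    while read -r old new ref; do
        # Count how often each path appears in the second column (the
        # "old name" for renames and copies); more than once suggests a split.
        git diff --name-status -C30 "$old" "$new" |
            awk '
                {total[$2]++}
                END {for (i in total) if (total[i] > 1) exit 1}
            '

        if [ $? -ne 0 ]; then
            echo "ERROR: detected a split file" >&2
            exit 1
        fi
    done

    exit 0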
    

    This will exit with an error if any file shows up as the "old name" more than once. Trying to push to a repository that uses this pre-receive hook, with the example given in the first part of this answer, gets me:

    $ git push
    Counting objects: 5, done.
    Delta compression using up to 4 threads.
    Compressing objects: 100% (4/4), done.
    Writing objects: 100% (5/5), 1.46 KiB | 1.46 MiB/s, done.
    Total 5 (delta 0), reused 0 (delta 0)
    remote: ERROR: detected a split file
    To upstream
     ! [remote rejected] master -> master (pre-receive hook declined)
    

    Maybe that helps? Without extensive testing I would worry about false positives with this solution.
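
    On the false-positive worry: the awk filter can be exercised without pushing anything by feeding it hand-written --name-status lines. The function name detect_split and the sample inputs below are invented for illustration; the awk program is the one from the hook above.

```shell
#!/bin/sh
# Same awk program as the hook: exit 1 if any second-column path repeats.
detect_split() {
    awk '
        {total[$2]++}
        END {for (i in total) if (total[i] > 1) exit 1}
    '
}

# The split from the first part of the answer: lorem.txt is listed twice.
printf 'C060\tlorem.txt\tfileA\nR039\tlorem.txt\tfileB\n' | detect_split
split_status=$?

# An ordinary change: one modified file, one new file.
printf 'M\tREADME\nA\tnotes.txt\n' | detect_split
ordinary_status=$?

# A file that is edited and also copied (hypothetical input): not a split,
# but the path still appears twice in the second column.
printf 'M\tconfig.txt\nC080\tconfig.txt\tconfig-backup.txt\n' | detect_split
copy_edit_status=$?

echo "split: $split_status, ordinary: $ordinary_status, copy+edit: $copy_edit_status"
# prints: split: 1, ordinary: 0, copy+edit: 1
```

    The third case is the kind of false positive to expect: a commit that edits a file and also copies it would be rejected as a "split".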
