Git: How can I find a commit that most closely matches a directory?

[愿得一人] 2020-12-02 12:58

Someone took a version (unknown to me) of Moodle, applied many changes within a directory, and released it (tree here).

How can I determine which commit of t

5 Answers
  • 2020-12-02 13:32

    This was my solution:

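    #!/bin/sh
    # Rank commits in a date window by how many files differ from a reference tree.

    start_date="2012-03-01"
    end_date="2012-06-01"
    needle_ref="aaa"   # ref (branch, tag, or SHA) holding the tree to match

    out=/tmp/script.out
    : > "$out"   # truncate the results file

    # List every commit reachable from any ref inside the date window.
    for sha in $(git rev-list --all --after="$start_date" --until="$end_date")
    do
        # Count the files that differ between the reference tree and this commit.
        count=$(git diff --name-only "$needle_ref" "$sha" | wc -l)
        printf '%04d %s\n' "$count" "$sha" >> "$out"
    done

    # The smallest counts are the closest matches.
    sort -n "$out" | head -5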
    
  • 2020-12-02 13:35

    You could write a script that diffs the given tree against a revision range in your repository.

    Assume we first fetch the changed tree (without history) into our own repository:

    git remote add foreign git://…
    git fetch foreign
    

    We then output the diffstat (in short form) for each revision we want to match against:

    for REV in $(git rev-list 1.8^..1.9); do
       # Print the revision next to its stat so the two can be matched up.
       echo "$REV $(git diff --shortstat foreign/master "$REV")"
    done
    

    Look for the commit with the smallest number of changes (or use some sorting mechanism).
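
    The sorting step could look like the minimal sketch below. It is an assumption-laden variation, not part of the original answer: `rank_revs` is a hypothetical helper name, and it uses `git diff --numstat` instead of `--shortstat` because per-file added/deleted counts are easier to sum. The target ref and the revision range are passed in as arguments.

```shell
#!/bin/sh
# Sketch: rank commits by how many lines differ from a target tree.
# Usage: rank_revs TARGET RANGE   (hypothetical helper, placeholder arguments)
rank_revs() {
    target=$1
    range=$2
    for rev in $(git rev-list "$range"); do
        # --numstat prints "added<TAB>deleted<TAB>path" for each changed file.
        # Binary files show "-" in both columns, which awk treats as 0.
        total=$(git diff --numstat "$target" "$rev" |
            awk '{ n += $1 + $2 } END { print n + 0 }')
        echo "$total $rev"
    done | sort -n
}

# Example call, using the names from the loop above:
# rank_revs foreign/master 1.8^..1.9 | head -5
```

    The first line of output is the revision with the fewest changed lines. Binary files contribute nothing to the score here, so a tree that differs mostly in binary assets would need a different measure.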

  • 2020-12-02 13:39

    How about using git to create a patch from each version of 1.8 and 1.9 to this new release? Then you could see which patch makes the most 'sense'.

    For example, if the patch 'removes' many methods, then it is probably not this release, but one before. If the patch has many sections that don't make sense as a single edit, then it probably isn't this release either.

    And so on... Unfortunately, no algorithm does this perfectly; it will have to be some heuristic.
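
    One way to produce those patches for inspection is sketched below. Everything named here is an assumption for illustration: `write_patches` is a hypothetical helper, the tag patterns depend on how the upstream repository names its releases, and listing patches smallest-first is only a rough first filter, since the answer's real test is whether a patch reads as a sensible edit.

```shell
#!/bin/sh
# Sketch: write one patch per release tag, from that tag to the target tree.
# Usage: write_patches TARGET OUTDIR PATTERN...   (hypothetical helper)
write_patches() {
    target=$1
    outdir=$2
    shift 2
    mkdir -p "$outdir"
    for tag in $(git tag --list "$@"); do
        # Replace "/" so the tag name is usable as a file name.
        file=$outdir/$(echo "$tag" | tr '/' '_').patch
        git diff "$tag" "$target" > "$file"
    done
    # List the patch files, smallest first, as candidates to read first.
    ls -S -r "$outdir"
}

# Example call; the ref and tag patterns are placeholders:
# write_patches foreign/master /tmp/patches 'v1.8*' 'v1.9*'
```

    Each resulting `.patch` file can then be read by hand to judge whether it looks like a plausible set of edits on top of that release.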

  • 2020-12-02 13:48

    How about using 'git blame'? It will show you, for each line, who changed it, and in which revision.

  • 2020-12-02 13:49

    Some really great solutions here!

    I used something similar to try to find the closest source file revision (given a target file):

    1. Iterate backwards through all commits in the branch merge.
    2. Look for the closest match with the file target.txt.
    3. Print the git revision and the number of differing lines of text.

    N.B. Run this inside a new, throw-away branch: git reset --hard moves the current branch and discards uncommitted changes.

    for REV in $(git rev-list merge); do
        git reset --hard "$REV"
        # comm -2 -3 prints lines found only in source.txt (comm expects sorted input).
        echo "$REV" $(comm -2 -3 source.txt ../target.txt | wc -l)
    done
    

    You'll get output like the following, which tells you which revision was the closest match (i.e. least differing lines):

    1c58bd5925a1fc8233730626**************** 771
    HEAD is now at ...
    9b2c29b00f1b4541a4135906**************** 775
    HEAD is now at ...
    b8e0bf5ec4372ebbcbd4edd0**************** 342
    HEAD is now at ...
    ba0d474bf2aac40dae48923e**************** 342
    HEAD is now at ...
    6d96921d3e9ad760ce55e76c**************** 335 <-- Closest match
    HEAD is now at ...
    795cd4caae5a5b08563443c9**************** 396
    HEAD is now at ...
    8743f42b24dd77e3bcc897dd**************** 399
    HEAD is now at ...
    d1b74dd33074c17da3fff638**************** 929
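
    A non-destructive variation on the loop above is sketched here. It is not the original author's method: it reads each historical version with `git show REV:PATH` instead of resetting the working tree, and it counts differing lines with `diff` rather than `comm`, so the files do not need to be sorted. `closest_file_rev` is a hypothetical helper name, and the path must be given relative to the repository root.

```shell
#!/bin/sh
# Sketch: find the revision of one file that is closest to a target file,
# without touching the working tree.
# Usage: closest_file_rev BRANCH PATH TARGET   (hypothetical helper)
closest_file_rev() {
    branch=$1
    path=$2      # path relative to the repository root
    target=$3    # file to compare against, may live outside the repository
    # Only visit commits that changed the file.
    for rev in $(git rev-list "$branch" -- "$path"); do
        # git show REV:PATH prints the file as of that commit;
        # diff marks each differing line with "<" or ">".
        n=$(git show "$rev:$path" 2>/dev/null | diff - "$target" | grep -c '^[<>]')
        echo "$n $rev"
    done | sort -n | head -1
}

# Example call, using the names from the loop above:
# closest_file_rev merge source.txt ../target.txt
```

    The single line printed is the differing-line count followed by the revision with the smallest count. Because nothing is checked out, no throw-away branch is needed and there are no "HEAD is now at" lines in the output.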
    

    Further reading:

    • comm - for outputting differing lines (expects sorted input)
    • wc - for counting lines of text

    Credit:

    • https://stackoverflow.com/a/4546712/782034