How to make 'git diff' ignore comments

闹比i 2020-12-05 02:38

I am trying to produce a list of the files that were changed in a specific commit. The problem is that every file has the version number in a comment at the top of the file, so every file shows up as changed.

6 Answers
  • 2020-12-05 02:57

    I found it easiest to use git difftool to launch an external diff tool:

    git difftool -y -x "diff -I '<regex>'"
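
To see what -I does before wiring it into git difftool, here is a small sketch that runs plain diff on two throwaway files. The file names and the 'Version' pattern are made up for illustration, and it assumes a diff that supports -I (GNU diffutils does).

```shell
# Two hypothetical files that differ only in the version comment.
tmp=$(mktemp -d)
printf '/* Version 1.0 */\nint x = 1;\n' > "$tmp/old.c"
printf '/* Version 1.1 */\nint x = 1;\n' > "$tmp/new.c"

# -I ignores a change when every added and removed line in it matches the regex.
if diff -I 'Version' "$tmp/old.c" "$tmp/new.c" > /dev/null; then
    version_only="yes"   # nothing is left once the version line is ignored
else
    version_only="no"
fi
echo "only the version comment changed: $version_only"
```

The same regex can then be passed as '<regex>' in the git difftool command above.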
    
  • 2020-12-05 02:57

    I found a solution. I can use this command:

    git diff --numstat --minimal <commit> <commit> | sed '/^[1-]\s\+[1-]\s\+.*/d'
    

    This lists the files with more than a single changed line between the commits: the sed filter drops every row whose added and deleted counts are both 1 (or -), which eliminates files whose only change was the version number in the comments.
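
The filter can be tried without a repository. The sketch below is an awk variant of the same rule (not the sed command itself), fed with made-up rows in the added<TAB>deleted<TAB>path layout that --numstat prints; the function name filter_numstat and the file names are hypothetical.

```shell
# Drop numstat rows whose added and deleted counts are each "1" or "-".
# Reads `git diff --numstat` style lines (added<TAB>deleted<TAB>path) on stdin.
filter_numstat() {
    awk -F '\t' '!($1 ~ /^[1-]$/ && $2 ~ /^[1-]$/)'
}

# Made-up sample input standing in for real numstat output.
sample=$(printf '1\t1\tversion_only.c\n5\t2\treal_change.c\n-\t-\tlogo.png\n10\t1\tbig.c\n')
printf '%s\n' "$sample" | filter_numstat
```

With a real repository the function would sit at the end of the pipeline: git diff --numstat <commit> <commit> | filter_numstat.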

  • 2020-12-05 02:58

    Running 'grep' on the 'git diff' output counts the changed comment lines alone (A):

    git diff -w | grep -c -E "(^[+-]\s*(\/)?\*)|(^[+-]\s*\/\/)"
    

    The 'git diff --stat' output gives the count of all changed lines (B):

    git diff -w --stat
    

    To get the count of non-comment source line (NCSL) changes, subtract (A) from (B).

    Explanation:

    In the 'git diff' output (with whitespace changes ignored by '-w'):

    • Look for a line that starts with either '+' or '-', which marks a modified line.
    • Optional whitespace characters may follow. '\s*'
    • Then look for a comment pattern: '/*', just '*', or '//'.
    • Because grep is given the '-c' option, it prints only the count. Remove '-c' to see the matching comment lines themselves.
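
The counting can be sketched on a canned diff, so no repository is needed. The diff text below is made up, the pattern uses [[:space:]] in place of \s for portability, and (B) is counted from the diff lines directly rather than read from --stat, which is a substitution of mine.

```shell
# Same pattern as above, written with [[:space:]] instead of \s for portability.
comment_re='(^[+-][[:space:]]*/?\*)|(^[+-][[:space:]]*//)'

# Made-up diff text standing in for `git diff -w` output.
sample_diff=$(printf '%s\n' \
    '--- a/foo.c' \
    '+++ b/foo.c' \
    '@@ -1,4 +1,5 @@' \
    '-/* Version 1.0 */' \
    '+/* Version 1.1 */' \
    '+// new helper' \
    ' int main(void) {' \
    '-    return 0;' \
    '+    return 1;' \
    ' }')

# (A) changed comment lines; (B) all changed lines, skipping the ---/+++ headers.
comment_lines=$(printf '%s\n' "$sample_diff" | grep -c -E "$comment_re")
all_lines=$(printf '%s\n' "$sample_diff" | grep -E '^[+-]' | grep -c -v -E '^(---|\+\+\+) ')
echo "comment=$comment_lines all=$all_lines ncsl=$((all_lines - comment_lines))"
```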

    NOTE: The comment line count can be slightly off because of the following assumptions, so treat the result as a ballpark figure.

    • 1.) Source files are assumed to be C-like. Makefiles and shell scripts mark comments with '#' instead, so if they are part of the diff set, their comment lines won't be counted.

    • 2.) How Git records a changed line: when a line is modified, Git reports the old line as deleted and a new line as inserted, so one modified line is counted as two changed lines.

       In the example below, the new definition of 'FOO' looks like a two-line change.
      
       $  git diff --stat -w abc.h
       ...
       -#define FOO 7
       +#define FOO 105
       ...
       1 files changed, 1 insertions(+), 1 deletions(-)
       $
      
    • 3.) Valid comment lines that do not match the pattern, and valid source code lines that do match it, both skew the count.

    In the example below, the "+ blah blah" line, which doesn't start with '*', won't be detected as a comment line.

               + /*
               +  blah blah
               + *
               + */
    

    In the example below, the "+ *ptr" line will be counted as a comment line because it starts with *, though it is a valid source code line.

                + printf("\n %p",
                +         *ptr);
    
  • 2020-12-05 03:11
    git diff -G <regex>
    

    Specify a regular expression that does not match your version number line but does match the other changed lines you care about.

  • 2020-12-05 03:12

    Here is a solution that is working well for me. I've written up the solution and some additional missing documentation on the git (log|diff) -G<regex> option.

    It is basically using the same solution as in previous answers, but specifically for comments that start with a * or a #, and sometimes a space before the *... But it still needs to allow #ifdef, #include, etc. changes.

    Lookahead and lookbehind do not seem to be supported by the -G option, nor does ? seem to work in general, and I have had problems using *, too. + seems to work well, though.

    (Note, tested on Git v2.7.0)

    Multi-Line Comment Version

    git diff -w -G'(^[^\*# /])|(^#\w)|(^\s+[^\*#/])'
    
    • -w ignore whitespace
    • -G only show files whose diff has an added or removed line that matches the following regex
    • (^[^\*# /]) any line that does not start with a star, a hash, a space, or a slash
    • (^#\w) any line that starts with # followed by a word character (so #ifdef, #include, etc. still count)
    • (^\s+[^\*#/]) any line that starts with some whitespace followed by a character that is not a comment character
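
To check which lines the pattern accepts, it can be run against sample lines with grep -E. This is only a sketch: the sample lines are made up, and it assumes grep -E handles \w and \s the same way as the regex engine behind -G.

```shell
# The -G pattern from above, tried against sample lines with grep -E.
# Assumption: grep -E treats \w and \s the same way git's regex engine does.
pattern='(^[^\*# /])|(^#\w)|(^\s+[^\*#/])'

# Made-up lines as they would appear in a diff, without the leading +/-.
kept=$(printf '%s\n' \
    ' * Version 1.2' \
    '/* file header */' \
    '# shell comment' \
    '#include <stdio.h>' \
    'int x = 1;' \
    '    return x;' | grep -E "$pattern")
printf '%s\n' "$kept"
```

Only the #include line and the two code lines are printed; the three comment lines are rejected.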

    Basically, an SVN hook currently touches every file on the way in and out and modifies the multi-line comment block in each one. Now I can diff my changes against SVN without the FYI information that SVN drops in the comments.

    Technically this will allow for Python and Bash comments like #TODO to be shown in the diff, and if a division operator started on a new line in C++ it could be ignored:

    a = b
        / c;
    

    Also the documentation on -G in Git seemed pretty lacking, so the information here should help:

    git diff -G<regex>

    -G<regex>

    Look for differences whose patch text contains added/removed lines that match <regex>.

    To illustrate the difference between -S<regex> --pickaxe-regex and -G<regex>, consider a commit with the following diff in the same file:

    +    return !regexec(regexp, two->ptr, 1, &regmatch, 0);
    ...
    -    hit = !regexec(regexp, mf2.ptr, 1, &regmatch, 0);
    

    While git log -G"regexec\(regexp" will show this commit, git log -S"regexec\(regexp" --pickaxe-regex will not (because the number of occurrences of that string did not change).

    See the pickaxe entry in gitdiffcore(7) for more information.

    (Note, tested on Git v2.7.0)

    • -G uses a POSIX extended regular expression (the pattern is compiled with REG_EXTENDED; see the update below).
    • In my tests ?, *, {, and } did not behave as I expected, even though POSIX ERE defines them; ! is not a regex operator.
    • Grouping with () works, as does OR-ing groups with |.
    • Character-class shorthands such as \s, \W, etc. are supported.
    • Look-ahead and look-behind are not supported.
    • The line anchors ^ and $ work.
    • The feature has been available since Git 1.7.4.

    Excluded Files vs. Excluded Diffs

    Note that the -G option filters which files are diffed, not which lines are shown.

    Once a file is selected, its whole diff is displayed, including the changed lines that the regex was meant to exclude.
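
The file-level behaviour can be reproduced in a throwaway repository. The sketch below assumes git is installed and that committing works in the current environment; the file names, contents, and identity settings are all made up.

```shell
# Throwaway repository; all file names and contents are made up.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.email "editor@example.com"
git config user.name "Example"

printf '# version 1\nfoo=1\n' > a.sh
printf '# version 1\nbar=1\n' > b.sh
git add a.sh b.sh
git commit -q -m "first"

# Both files get a new version comment; only a.sh gets a code change.
printf '# version 2\nfoo=2\n' > a.sh
printf '# version 2\nbar=1\n' > b.sh
git commit -q -a -m "second"

# -G selects files: b.sh is dropped because its only change starts with '#'.
selected=$(git diff --name-only -G'^[^#]' HEAD~1 HEAD)
echo "selected files: $selected"

# The diff for a.sh still contains the changed comment line.
git diff -G'^[^#]' HEAD~1 HEAD | grep '^[+-]#'
```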

    Examples

    Only show file differences with at least one line that mentions foo.

    git diff -G'foo'
    

    Show differences for files that have at least one changed line that does not start with a #

    git diff -G'^[^#]'
    

    Show files that have differences mentioning FIXME or TODO

    git diff -G'(FIXME)|(TODO)'
    

    See also git log -G, git grep, git log -S, --pickaxe-regex, and --pickaxe-all

    UPDATE: Which regular expression tool is in use by the -G option?

    https://github.com/git/git/search?utf8=%E2%9C%93&q=regcomp&type=

    https://github.com/git/git/blob/master/diffcore-pickaxe.c

    if (opts & (DIFF_PICKAXE_REGEX | DIFF_PICKAXE_KIND_G)) {
        int cflags = REG_EXTENDED | REG_NEWLINE;
        if (DIFF_OPT_TST(o, PICKAXE_IGNORE_CASE))
            cflags |= REG_ICASE;
        regcomp_or_die(&regex, needle, cflags);
        regexp = &regex;
    
    // and in the regcomp_or_die function
    regcomp(regex, needle, cflags);
    

    http://man7.org/linux/man-pages/man3/regexec.3.html

       REG_EXTENDED
              Use POSIX Extended Regular Expression syntax when interpreting
              regex.  If not set, POSIX Basic Regular Expression syntax is
              used.
    

    // ...

       REG_NEWLINE
              Match-any-character operators don't match a newline.
    
              A nonmatching list ([^...])  not containing a newline does not
              match a newline.
    
              Match-beginning-of-line operator (^) matches the empty string
              immediately after a newline, regardless of whether eflags, the
              execution flags of regexec(), contains REG_NOTBOL.
    
              Match-end-of-line operator ($) matches the empty string
              immediately before a newline, regardless of whether eflags
              contains REG_NOTEOL.
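
Since the pattern is compiled with REG_EXTENDED, grouping and alternation need no backslashes. A quick way to try a pattern outside Git is grep -E, assuming it implements a comparable POSIX ERE dialect; the sample lines below are made up.

```shell
# Sample changed lines (made up for illustration).
lines=$(printf '%s\n' 'FIXME: handle NULL' 'TODO: add tests' 'plain line' 'colour' 'color')

# Alternation and grouping, as in -G'(FIXME)|(TODO)'.
marked=$(printf '%s\n' "$lines" | grep -c -E '(FIXME)|(TODO)')

# The ? quantifier is part of POSIX ERE.
spelled=$(printf '%s\n' "$lines" | grep -c -E '^colou?r$')
echo "marked=$marked spelled=$spelled"
```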
    
  • 2020-12-05 03:17

    Perhaps a Bash script like this:

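    #!/bin/bash
    # Print each changed file with the number of diff lines left after filtering.
    git diff --name-only "$@" | while IFS= read -r FPATH ; do
        # Revisions go first; "--" separates them from the path.
        # sed deletes the lines matching its pattern before they are counted.
        LINES_COUNT=$(git diff --textconv "$@" -- "$FPATH" | sed '/^[1-]\s\+[1-]\s\+.*/d' | wc -l)
        if [ "$LINES_COUNT" -gt 0 ] ; then
            printf '%s\t%s\n' "$LINES_COUNT" "$FPATH"
        fi
    done | sort -n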