Show number of changed lines per author in git

灰色年华 2020-12-13 02:48

I want to see the number of removed/added lines, grouped by author, for a given branch in git history. There is git shortlog -s, which shows me the number of commi

6 answers
  • 2020-12-13 03:01

    This script here will do it. Put it into authorship.sh, chmod +x it, and you're all set.

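    #!/bin/bash
    # Needs bash 4+: associative arrays and here-strings are not POSIX sh.
    declare -A map
    while IFS= read -r line; do
        if grep "^[a-zA-Z]" <<< "$line" > /dev/null; then
            # An author line (from --pretty="%aN") starts a new commit.
            current="$line"
            if [ -z "${map[$current]}" ]; then
                map[$current]=0
            fi
        elif grep "^[0-9]" <<<"$line" >/dev/null; then
            # A numstat line: add both the added and the removed counts.
            for i in $(cut -f 1,2 <<< "$line"); do
                map[$current]=$(( ${map[$current]} + i ))
            done
        fi
    done <<< "$(git log --numstat --pretty="%aN")"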
    
    for i in "${!map[@]}"; do
        echo -e "$i:${map[$i]}"
    done | sort -nr -t ":" -k 2 | column -t -s ":"
    
  • 2020-12-13 03:07

    From "How to count total lines changed by a specific author in a Git repository?":

    The output of the following command should be reasonably easy to pipe into a script that adds up the totals:

    git log --author="<authorname>" --oneline --shortstat
    

    This gives stats for all commits on the current HEAD. If you want to add up stats in other branches you will have to supply them as arguments to git log.
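As a rough illustration of the "send to a script" step, here is a minimal Python sketch that adds up the summary lines printed by --shortstat. The function name sum_shortstat and the sample log text are made up for this example; the regular expressions assume git's usual English summary format ("N files changed, N insertions(+), N deletions(-)").

```python
import re

# Summary lines printed by --shortstat look like
# " 3 files changed, 10 insertions(+), 2 deletions(-)".
# Either the insertions part or the deletions part may be missing.
_STAT_LINE = re.compile(r"^\s*\d+ files? changed")
_INSERTIONS = re.compile(r"(\d+) insertions?\(\+\)")
_DELETIONS = re.compile(r"(\d+) deletions?\(-\)")


def sum_shortstat(log_text):
    """Return (insertions, deletions) summed over all shortstat lines."""
    added = removed = 0
    for line in log_text.splitlines():
        if not _STAT_LINE.match(line):
            continue  # commit header line (hash + subject) or blank line
        ins = _INSERTIONS.search(line)
        dele = _DELETIONS.search(line)
        if ins:
            added += int(ins.group(1))
        if dele:
            removed += int(dele.group(1))
    return added, removed


# Hypothetical sample of `git log --oneline --shortstat` output.
sample = """\
a1b2c3d Fix parser
 2 files changed, 10 insertions(+), 3 deletions(-)
e4f5a6b Add README
 1 file changed, 5 insertions(+)
0a1b2c3 Remove dead code
 1 file changed, 7 deletions(-)
"""

print(sum_shortstat(sample))  # (15, 10)
```

In practice the text would come from the git log command above, for example by piping it into the script and reading sys.stdin.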

  • 2020-12-13 03:10

    It's an old post but if someone is still looking for it:

    Install git-extras:

    brew install git-extras
    

    Then run:

    git summary --line
    

    https://github.com/tj/git-extras

  • 2020-12-13 03:12

    A one-liner (supports time range selection):

    git log --since=4.weeks --numstat --pretty="%ae %H" | sed 's/@.*//g' | awk '{ if (NF == 1){ name = $1}; if(NF == 3) {plus[name] += $1; minus[name] += $2}} END { for (name in plus) {print name": +"plus[name]" -"minus[name]}}' | sort -k2 -gr
    

    Explanation:

    git log --since=4.weeks --numstat --pretty="%ae %H" \
        | sed 's/@.*//g'  \
        | awk '{ if (NF == 1){ name = $1}; if(NF == 3) {plus[name] += $1; minus[name] += $2}} END { for (name in plus) {print name": +"plus[name]" -"minus[name]}}' \
        | sort -k2 -gr
    
    # line 1: query the log for a time range
    # line 2: keep only the author email prefix
    # line 3: count added / removed lines per author
    # line 4: sort the result
    

    Output:

    user-a: +5455 -3471
    user-b: +5118 -1934
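
The same grouping can be sketched in Python for a single branch. This is only an illustration under stated assumptions: parse_numstat, branch_stats and the "@@" author marker are names invented here, not part of git. The marker is put in front of %aN so that author lines cannot be confused with numstat lines, and binary files (reported by git as "-" counts) are skipped.

```python
import subprocess
from collections import defaultdict

# Hypothetical prefix used to tell author lines apart from numstat lines.
AUTHOR_MARK = "@@"


def parse_numstat(log_text):
    """Map author -> [added, removed] from `git log --numstat` output."""
    totals = defaultdict(lambda: [0, 0])
    author = None
    for line in log_text.splitlines():
        if line.startswith(AUTHOR_MARK):
            author = line[len(AUTHOR_MARK):]
            continue
        parts = line.split("\t")
        if author is None or len(parts) < 3:
            continue  # blank separator line or unexpected content
        added, removed = parts[0], parts[1]
        # Binary files are reported as "-<TAB>-<TAB>path"; skip them.
        if added.isdigit() and removed.isdigit():
            totals[author][0] += int(added)
            totals[author][1] += int(removed)
    return dict(totals)


def branch_stats(branch):
    """Run git for one branch; needs git and a repository in the cwd."""
    out = subprocess.check_output(
        ["git", "log", branch, "--numstat",
         "--pretty=format:" + AUTHOR_MARK + "%aN"],
        encoding="utf-8", errors="replace",
    )
    return parse_numstat(out)


# Hypothetical log text in the format produced by the command above.
sample = (
    "@@Alice\n"
    "10\t2\tsrc/main.py\n"
    "-\t-\tlogo.png\n"
    "\n"
    "@@Bob\n"
    "3\t1\tREADME.md\n"
    "\n"
    "@@Alice\n"
    "0\t4\tsrc/old.py\n"
)

for name, (added, removed) in sorted(parse_numstat(sample).items()):
    print("%s: +%d -%d" % (name, added, removed))
```

Calling branch_stats("some-branch") inside a repository would give the per-author totals for that branch. A revision range such as "main..some-branch" should also work as the argument, since it is passed to git log unchanged.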
    
  • 2020-12-13 03:12

    On my repos I've gotten a lot of trash output from the one-liners floating around, so here is a Python script to do it right:

    import subprocess
    import collections
    import sys
    
    
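    def get_lines_from_call(command):
        # Decode the output so the str operations below work on Python 3
        # (check_output returns bytes by default).
        return subprocess.check_output(
            command, encoding='utf-8', errors='replace').splitlines()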
    
    def get_files(paths=()):
        command = ['git', 'ls-files']
        command.extend(paths)
        return get_lines_from_call(command)
    
    def get_blame(path):
        return get_lines_from_call(['git', 'blame', path])
    
    
    def extract_name(line):
        """
        Extract the author from a line of a standard git blame
        """
        return line.split('(', 1)[1].split(')', 1)[0].rsplit(None, 4)[0]
    
    
    def get_file_authors(path):
        return [extract_name(line) for line in get_blame(path)]
    
    
    def blame_stats(paths=()):
        counter = collections.Counter()
        for filename in get_files(paths):
            counter.update(get_file_authors(filename))
        return counter
    
    
    def main():
        counter = blame_stats(sys.argv[1:])
        max_width = len(str(counter.most_common(1)[0][1]))
        for name, count in reversed(counter.most_common()):
            print('%s %s' % (str(count).rjust(max_width), name))
    
    if __name__ == '__main__':
        main()
    

    Note that the arguments to the script will be passed to git ls-files, so if you only want to show Python files: blame_stats.py '**/*.py'

    If you only want to show files in one subdirectory: blame_stats.py some_dir

    And so on.

  • 2020-12-13 03:24

    Since the SO question "How to count total lines changed by a specific author in a Git repository?" is not completely satisfactory, commandlinefu has alternatives (albeit not per branch):

    git ls-files | while read -r i; do git blame "$i" | sed -e 's/^[^(]*(//' -e 's/^\([^[:digit:]]*\)[[:space:]]\+[[:digit:]].*/\1/'; done | sort | uniq -ic | sort -nr
    

    This also counts binary files, which is not good, so you could filter out the obvious binary file types first:

    git ls-files | grep -v "\.\(pdf\|psd\|tif\)$"
    

    (Note: as commented by trcarden, a -x or --exclude option wouldn't work.
    According to the git ls-files man page, git ls-files -x "*pdf" ... would only exclude untracked content, and only if --others or --ignored were added to the git ls-files command.)

    Or:

    git ls-files "*.py" "*.html" "*.css" 
    

    to only include specific file types.
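
For scripts that already hold the file list in Python (for example the output of git ls-files), the same exclusion can be sketched as below. The function name drop_binary_paths and the extension set are assumptions for illustration. Unlike the grep pattern above, this sketch compares extensions case-insensitively.

```python
import os

# Hypothetical set of extensions to treat as binary; adjust per repository.
BINARY_EXTENSIONS = {".pdf", ".psd", ".tif"}


def drop_binary_paths(paths, extensions=BINARY_EXTENSIONS):
    """Return the paths whose extension is not in `extensions`."""
    return [
        p for p in paths
        if os.path.splitext(p)[1].lower() not in extensions
    ]


# Made-up file names standing in for `git ls-files` output.
tracked = ["a.py", "doc/manual.pdf", "img/LOGO.PSD", "notes.tif.txt"]
print(drop_binary_paths(tracked))  # ['a.py', 'notes.tif.txt']
```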


    Still, a "git log"-based solution should be better, like:

    git log --numstat --pretty="%H" --author="Your Name" commit1..commit2 | awk 'NF==3 {plus+=$1; minus+=$2} END {printf("+%d, -%d\n", plus, minus)}'
    

    but again, this covers a single range of history (here, between two commits), not each branch separately.
