Git, find out which files have had the most commits

耶瑟儿~ 2021-01-30 14:29

How can I search my git logs to see which files have had the most activity?

4 answers
  • 2021-01-30 15:02

    Assuming the range of revisions you want to select is <range>, the command:

    git log --format=%n --name-only <range>|sort|uniq -c|tail -n +2
    

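    will output, for each file in your repository, the number of times it occurs in commit diffs, i.e. the number of changes, counting file creation as a change. The result is ordered by path, not by count; append |sort -rn to list the busiest files first. Leave <range> empty to get statistics from the initial commit up to your branch HEAD.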
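
The same counting can be sketched in Python if you would rather post-process the log text than chain shell tools. This is a minimal sketch, not part of the original answer: the function names `count_changed_files` and `most_changed` are made up for illustration, and the input is assumed to be the raw output of `git log --format=%n --name-only`.

```python
from collections import Counter


def count_changed_files(log_text):
    """Count how often each path appears in `git log --format=%n --name-only` output."""
    # Blank lines separate commits; every non-blank line is one changed path.
    paths = (line.strip() for line in log_text.splitlines())
    return Counter(path for path in paths if path)


def most_changed(log_text, limit=10):
    """Return (path, count) pairs, busiest files first."""
    return count_changed_files(log_text).most_common(limit)
```

To try it on a real repository, one option is to capture the log with `subprocess.run(["git", "log", "--format=%n", "--name-only"], capture_output=True, text=True, check=True).stdout` and pass that string to `most_changed`.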

  • 2021-01-30 15:03

    Here's a Python script that you can pipe the output of git log --numstat through to get the results:

    import re
    import sys
    
    # Matches numstat lines of the form "<added> <removed> <path>".
    # Binary files are reported as "- - <path>" and are skipped.
    NUMSTAT = re.compile(r"([0-9]+)[ \t]+([0-9]+)[ \t]+(.*)")
    
    res = {}
    
    for line in sys.stdin:
        m = NUMSTAT.match(line)
        if m is not None:
            f = m.group(3)
            if f not in res:
                res[f] = {'add': 0, 'rem': 0, 'commits': 0}
            res[f]['commits'] += 1
            res[f]['add'] += int(m.group(1))
            res[f]['rem'] += int(m.group(2))
    
    # One line per file: commits, lines added, lines removed, path.
    for f, r in res.items():
        print("%s %s %s %s" % (r['commits'], r['add'], r['rem'], f))
    

    You can modify it as needed to sort/filter how you want.
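
As one possible way to do that sorting and filtering, here is a hedged sketch that aggregates the same numstat lines and then ranks files by commit count. The names `summarize_numstat`, `top_files`, `min_commits`, and `limit` are illustrative assumptions, not something from the original script.

```python
import re

NUMSTAT_LINE = re.compile(r"(\d+)\s+(\d+)\s+(.*)")


def summarize_numstat(lines):
    """Aggregate `git log --numstat` lines into {path: [commits, added, removed]}."""
    totals = {}
    for line in lines:
        m = NUMSTAT_LINE.match(line)
        if m is None:
            continue  # commit headers, blank lines, and binary-file entries
        added, removed, path = int(m.group(1)), int(m.group(2)), m.group(3)
        entry = totals.setdefault(path, [0, 0, 0])
        entry[0] += 1
        entry[1] += added
        entry[2] += removed
    return totals


def top_files(lines, min_commits=1, limit=10):
    """Keep files with at least `min_commits`, sorted by commits (descending), then path."""
    rows = [
        (commits, added, removed, path)
        for path, (commits, added, removed) in summarize_numstat(lines).items()
        if commits >= min_commits
    ]
    rows.sort(key=lambda row: (-row[0], row[3]))
    return rows[:limit]
```

Calling `top_files(sys.stdin)` after `import sys` would read the piped log the same way the script above does.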

  • 2021-01-30 15:08

    Use git effort [--above <value>] (from the git-extras package) to list all files along with the number of commits that touched each one.

    You can restrict it to a path.

  • 2021-01-30 15:13

    This is one of those things that turns out to be very easy, almost by accident:

    git rev-list --objects --all | awk '$2' | sort -k2 | uniq -cf1 | sort -rn | head
    
    1. list all objects reachable from all refs (branches, tags, etc.)
    2. ignore any results without a path (commits and root trees)
    3. sort them by path
    4. collapse lines with the same path (ignoring the hash field) and prefix each with its duplication count
    5. sort descending on duplication count
    6. show the topmost lines

    Output similar to

       1058 fffcba193374a85fd6a3490f800c6901218a950b src
        715 ffffe0f08798e95b66cc4ad4ff22cf10734d045e src/lib
        450 ffcfe596031a5985664e35937fff4ac9ff38dcca src/zfs-fuse
        367 ffc5d5340f95360fc9f7b739c5593dd3f92fced0 src/lib/libzpool
        202 ff92db000792044d45eec21c57a3cd21618631e7 src/lib/libsolkerncompat
        183 ff1a44edae3fd121ffffd86864b589e5ab2f9ff99b src/lib/libzfscommon
        178 fec6b3a789e578983c2242b3aa5adf217cb8b887 src/lib/libzfs
        168 ffeefc9e81222d7c471bdb0911d8b98f23cff050 src/cmd
        167 fbd60bd3430765863648c52db7ceb3ffa15d5e50 src/lib/libzfscommon/include
        155 ff225f6b41f9557d683079c5f9276f497bcb06bd src/lib/libzfscommon/include/sys
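
The six pipeline steps above can also be mirrored in Python, which may be easier to extend than the shell one-liner. This is only a sketch under stated assumptions: `count_versions` is a hypothetical helper, and its input is assumed to be the raw text printed by `git rev-list --objects --all`. Since rev-list prints each object once, the count is the number of distinct versions recorded at a path, which approximates rather than equals the number of commits touching it.

```python
from collections import defaultdict


def count_versions(rev_list_output, limit=10):
    """Count distinct object hashes per path in `git rev-list --objects --all` output."""
    versions = defaultdict(set)
    for line in rev_list_output.splitlines():
        # Each line is "<hash>" (commits), "<hash> " (root trees) or "<hash> <path>".
        sha, _, path = line.partition(" ")
        if path:  # step 2: ignore entries without a path
            versions[path].add(sha)
    # Steps 3-5: group by path, count, sort by count descending (ties by path).
    ranked = sorted(
        ((len(hashes), path) for path, hashes in versions.items()),
        key=lambda row: (-row[0], row[1]),
    )
    return ranked[:limit]  # step 6: topmost entries
```

Unlike the shell output, this returns `(count, path)` pairs without a sample hash.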
    

    You can take it from here.

    E.g. if you wanted to see only file blobs:

    git rev-list --objects --all | awk '$2' | sort -k2 | uniq -cf1 | sort -rn |
        while read -r frequency sample file
        do
           # keep only entries whose sample object is a blob (a file, not a tree)
           [ "blob" = "$(git cat-file -t "$sample")" ] && printf '%s\t%s\n' "$frequency" "$file"
        done
    

    output:

    135 src/zfs-fuse/zfs_operations.c
    84  src/zfs-fuse/zfs_ioctl.c
    79  src/zfs-fuse/zfs_vnops.c
    73  src/lib/libzfs/libzfs_dataset.c
    67  src/lib/libzpool/spa.c
    66  src/zfs-fuse/zfs_vfsops.c
    62  src/cmd/zdb/zdb.c
    62  CHANGES
    60  src/cmd/ztest/ztest.c
    60  src/lib/libzpool/arc.c
    

    If you want to see only a specific range of revisions:

    You can have a ball with the rev-list part:

    git rev-list --after=2011-01-01 --until='two weeks ago' \
         tag1...remote/hotfix ^master
    

    This uses only revisions in the specified date range that are in the symmetric difference of tag1 and remote/hotfix (reachable from one of them but not both) and are not reachable from master.
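
To make the set logic of `tag1...remote/hotfix ^master` concrete, here is a toy model in Python. It is a sketch, not how git is implemented: the commit graph is a hypothetical dict mapping each commit to its parents, the helper names are made up, and the date filters (`--after`, `--until`) are deliberately left out.

```python
def reachable(parents, tip):
    """Return every commit reachable from `tip` by following parent links."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents.get(commit, []))
    return seen


def symmetric_range(parents, left, right, exclude):
    """Model `left...right ^exclude`: in exactly one of left/right, and not in exclude."""
    either_not_both = reachable(parents, left) ^ reachable(parents, right)
    return either_not_both - reachable(parents, exclude)


# Hypothetical history: A <- B <- C (tag1), B <- D <- E (hotfix), C <- M (master)
history = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"], "E": ["D"], "M": ["C"]}
selected = symmetric_range(history, "C", "E", "M")
```

In this toy history, C is dropped because master already contains it, so only the two hotfix-only commits D and E remain selected.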
