How to see the file size history of a single file in a git repository?

遇见更好的自我 2020-12-03 06:52

Is there any way to see how a file's size has changed over time in a git repository? I want to see how my main.js file (which is the combination of several files and mini

8 answers
  • 2020-12-03 07:16

    In case this is of use for someone, this script will show the size of a given file in different commits:

    # List every commit that touched the file, then print the file's size in bytes at each one.
    git log --format=%H -- <file_name> | while read -r hash; do
       printf '%s -- ' "$hash"
       git show "$hash:<file_path_off_of_git_root_without_leading_slash>" | wc -c
    done
    
  • 2020-12-03 07:25

    While commands like git log <filename>, git whatchanged, etc. can show the history pertaining to that file, I don't see anywhere in either the built-in or custom pretty formats an option that shows size (sadly, the --log-size option is only for the log messages!).

    However, you can get a rough idea of the size by seeing the total number of lines added and removed in each commit. You can sort of visualize it with the command git log --stat <filename>, which uses plus and minus signs. Or use git log --numstat <filename> to collect the number of lines added or removed in each commit and use the numbers in some other visualization.
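
    The --numstat idea above can be sketched as a small shell function that keeps a running net line count. The function name git_numstat_running_total is made up for this illustration, and the sketch assumes a text file that was never renamed (for binary files --numstat prints "-" instead of numbers, and those lines are skipped).

```shell
# Hypothetical helper: print added/removed lines and a running net total
# for one file, oldest commit first.
git_numstat_running_total() {
    path=$1
    git log --reverse --numstat --format='%h' -- "$path" |
    awk '
        NF == 1 { commit = $1; next }       # line holding only the abbreviated hash
        NF >= 3 && $1 != "-" {              # numstat line: added, removed, path
            total += $1 - $2
            printf "%s\t+%d\t-%d\t%d\n", commit, $1, $2, total
        }'
}
```

    Calling git_numstat_running_total main.js prints one tab-separated row per commit; the last column is the net number of lines, which only approximates the size in bytes.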

  • 2020-12-03 07:26

    Create a file called .gitattributes and add the following line:

    main.js -diff
    

    This turns off line-based diffs for main.js. Now run the following command:

    git log --stat main.js
    

    The log will include lines like

    main.js | Bin 4316 -> 4360 bytes
    

    After you're done, you should probably delete .gitattributes. I don't know what other changes in git's behavior may be caused by the -diff attribute.

    Tested with git versions 1.7.12.4 and 1.7.9.5.

    Source: ewall's answer and https://www.kernel.org/pub/software/scm/git/docs/gitattributes.html#_marking_files_as_binary

  • 2020-12-03 07:27

    You could create a script that uses the output from git show --pretty=raw <commit> to obtain the tree, then uses git ls-tree -r -l to obtain the blob you are looking for, including the file size.
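
    For readers without Ruby, the same idea can be sketched in plain shell. This is only an illustration: the function name git_file_size_history is invented here, git rev-list stands in for parsing git show --pretty=raw, and the path is assumed not to have been renamed.

```shell
# Hypothetical helper: print "<commit> <size in bytes>" for every commit
# that touched the given path, newest first.
git_file_size_history() {
    path=$1
    git rev-list HEAD -- "$path" | while read -r commit; do
        # ls-tree -l prints: <mode> <type> <object> <size>TAB<path>
        size=$(git ls-tree -r -l "$commit" "$path" | awk '{ print $4 }')
        # No output from ls-tree means the commit removed the file.
        printf '%s %s\n' "$commit" "${size:-deleted}"
    done
}
```

    Run it from the same directory the path is relative to, for example git_file_size_history main.js.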

    In case you have ruby and the grit gem installed, here's a little script I threw together:

    require 'grit'
    
    if ARGV.size < 1
      puts 'usage: file-size FILE'
      puts 'run from within the git repo root'
      exit
    end
    
    filename = ARGV[0].to_s
    
    repo = Grit::Repo.new('.')
    commits = repo.log('master', filename)
    commits.each do |commit|
      blob = commit.tree/filename
      puts "#{commit} #{blob.size} bytes"
    end
    

    Example usage (with the script saved as file-size.rb), showing the history for somedir/somefile:

    myproject$ ruby file-size.rb somedir/somefile
    
  • 2020-12-03 07:27

    You can use git ls-tree -r -l <revision> <path> to get the blob size at a given revision, e.g.

    $ git ls-tree -r -l v1.6.0 gitweb/README
    100644 blob 825162a0b6dce8c354de67a30abfbad94d29fdde   16067    gitweb/README
    

    The blob size in this example is 16067 bytes. The disadvantage of this solution is that git ls-tree can process only one revision at a time.

    Alternatively, you can use git cat-file --batch-check < <list-of-objects>, feeding it blob identifiers. If the location of the file did not change through history (the file was never moved), you can use git rev-list <starting-point> -- <path> to get the list of revisions touching the given path, translate them into blob names using the <revision>:<path> extended SHA-1 syntax (see the git-rev-parse manpage), and feed them to git cat-file. Example:

    $ git rev-list -5 v1.6.0 -- gitweb/README | 
      sed -e 's/$/:gitweb\/README/g' |
      git cat-file --batch-check
    825162a0b6dce8c354de67a30abfbad94d29fdde blob 16067
    6908036402ffe56c8b0cdcebdfb3dfacf84fb6f1 blob 16011
    356ab7b327eb0df99c0773d68375e155dbcea0be blob 14248
    8f7ea367bae72ea3ce25b10b968554f9b842fffe blob 13853
    8dfe335f73c223fa0da8cd21db6227283adb95ba blob 13801
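
    The output above lists blob IDs, not the commits they came from. A possible variation, assuming a Git version whose cat-file accepts a custom --batch-check format with the %(objectsize) and %(rest) placeholders, prints the commit next to each size. The function name git_blob_sizes is invented for this sketch, and the path must not contain whitespace, because cat-file splits each input line at the first whitespace.

```shell
# Hypothetical helper: print "<commit> <size in bytes>" using a single
# cat-file process for the whole history of the path.
git_blob_sizes() {
    path=$1
    git rev-list HEAD -- "$path" |
    # Emit "<commit>:<path> <commit>"; the text after the first space
    # is echoed back by cat-file as %(rest).
    awk -v p="$path" '{ print $1 ":" p " " $1 }' |
    git cat-file --batch-check='%(rest) %(objectsize)'
}
```

    For a commit that deleted the file, cat-file reports the object name as missing instead of printing a size.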
    
  • 2020-12-03 07:28

    Here is a Bash function that will report the size over time in the following format.

     LoC  Date                       Commit ID   Subject
     942  2019-08-31 18:09:34 +0200  35fc67c122  Declare some XML namespaces in replacement of OGCPrefixMapper, which has been removed from Apache SIS. https://issues.apache.org/jira/browse/SIS-126
     943  2019-08-09 16:52:29 +0200  e8438ab869  fix(GML): fix relative path resolving inside a jar
     934  2019-08-05 15:37:46 +0200  1e0c0b03c4  fix(GML): fix all test cases
     932  2019-07-30 15:54:53 +0200  fddea5db24  feat(GML): work on fallback for non-xsd Feature store
     932  2019-07-23 16:40:23 +0200  8d9a6a7dd0  feat(GML): improve support for custom XML mappings
     932  2019-06-26 15:18:43 +0200  43ea6e0bd7  feat(GML): add concurrency support for read/write operations
     932  2019-06-21 09:27:41 +0200  07a9993b4b  feat(GML): support group reference min/max occurs attributes
     932  2019-06-21 09:27:41 +0200  352a9104ae  feat(GML): fix resolving local files xsd paths
     919  2018-06-08 15:41:26 +0200  01ac7538e7  Merge branch 'master' into sis-migration
     919  2018-05-16 16:40:04 +0200  16fe7590c5  fix(JAXP): various fix for  WFS 2.0.0
     912  2018-04-11 10:09:22 +0200  bf3a38bdc4  chore(*): update JTS version 1.15.0
     912  2017-11-09 20:15:23 +0100  bc14dc4be1  fix(Client): fix minor problems on WFS querying
     901  2017-10-20 11:41:43 +0200  f686d7ff15  feat(Storage): add support for GML 2.1.2
     882  2017-05-16 23:07:31 +0200  f20c34c1e2  refactor(Feature): renamed the Geotk flavor of org.apache.sis.feature package as org.geotoolkit.feature.
    

    Here is the function:

    git-log-size() {
        # For each commit that touched "$1": line count, then date, short hash and subject.
        git rev-list HEAD -- "$1" | while read -r cid; do
            git cat-file blob "$cid:$1" | wc -l | tr -d '\n'
            printf '\t'
            git log -1 "--pretty=%ci%x09%h%x09%s" "$cid"
        done | column -t -s$'\t'
    }
    

    It is not particularly efficient, but does the job. It uses some utilities which are pretty common (wc, tr, column).

    The size is reported as lines of code (LoC), a common metric in software development. To report bytes instead, change the -l option of wc to -c.

    Here is how to call it:

    git-log-size <path>
    