Can you get the number of lines of code from a GitHub repository?

庸人自扰 2020-12-04 04:26

In a GitHub repository you can see “language statistics”, which displays the percentage of the project that’s written in a language. It doesn’t, however, display ho

15 answers
  • 2020-12-04 04:58

    A shell script, cloc-git

    You can use this shell script to count the number of lines in a remote Git repository with one command:

    #!/usr/bin/env bash
    git clone --depth 1 "$1" temp-linecount-repo &&
      printf "('temp-linecount-repo' will be deleted automatically)\n\n\n" &&
      cloc temp-linecount-repo &&
      rm -rf temp-linecount-repo
    

    Installation

    This script requires CLOC (“Count Lines of Code”) to be installed. cloc can probably be installed with your package manager – for example, brew install cloc with Homebrew. There is also a Docker image published under mribeiro/cloc.

    You can install the script by saving its code to a file cloc-git, running chmod +x cloc-git, and then moving the file to a folder in your $PATH such as /usr/local/bin.

    Usage

    The script takes one argument, which is any URL that git clone will accept. Examples are https://github.com/evalEmpire/perl5i.git (HTTPS) or git@github.com:evalEmpire/perl5i.git (SSH). You can get this URL from any GitHub project page by clicking “Clone or download”.

    Example output:

    $ cloc-git https://github.com/evalEmpire/perl5i.git
    Cloning into 'temp-linecount-repo'...
    remote: Counting objects: 200, done.
    remote: Compressing objects: 100% (182/182), done.
    remote: Total 200 (delta 13), reused 158 (delta 9), pack-reused 0
    Receiving objects: 100% (200/200), 296.52 KiB | 110.00 KiB/s, done.
    Resolving deltas: 100% (13/13), done.
    Checking connectivity... done.
    ('temp-linecount-repo' will be deleted automatically)
    
    
         171 text files.
         166 unique files.                                          
          17 files ignored.
    
    http://cloc.sourceforge.net v 1.62  T=1.13 s (134.1 files/s, 9764.6 lines/s)
    -------------------------------------------------------------------------------
    Language                     files          blank        comment           code
    -------------------------------------------------------------------------------
    Perl                           149           2795           1425           6382
    JSON                             1              0              0            270
    YAML                             2              0              0            198
    -------------------------------------------------------------------------------
    SUM:                           152           2795           1425           6850
    -------------------------------------------------------------------------------
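
    If you would rather have percentages than raw counts, cloc can also emit machine-readable output with its --json flag. Below is a minimal sketch, assuming the report is an object keyed by language name plus "header" and "SUM" entries, where each language entry carries nFiles, blank, comment and code fields (check this against your cloc version). languageShares is a hypothetical helper name, and the sample figures are copied from the output above. The shares are by lines of code, so they will not necessarily match GitHub's language bar.

```javascript
'use strict';

// Hypothetical helper: turns a `cloc --json` report into per-language
// shares of the total code lines. The report layout is an assumption.
function languageShares(report) {
    const languages = Object.keys(report)
        .filter(name => name !== 'header' && name !== 'SUM');
    const total = languages.reduce((sum, name) => sum + report[name].code, 0);
    return languages.map(name => ({
        language: name,
        code: report[name].code,
        percent: total === 0 ? 0 : (100 * report[name].code) / total,
    }));
}

// Sample data copied from the example output above (header left empty).
const sampleReport = {
    header: {},
    Perl: { nFiles: 149, blank: 2795, comment: 1425, code: 6382 },
    JSON: { nFiles: 1, blank: 0, comment: 0, code: 270 },
    YAML: { nFiles: 2, blank: 0, comment: 0, code: 198 },
    SUM: { nFiles: 152, blank: 2795, comment: 1425, code: 6850 },
};

for (const { language, code, percent } of languageShares(sampleReport)) {
    console.log(`${language}: ${code} lines (${percent.toFixed(1)}%)`);
}
```

    To feed it real data you could run cloc --json temp-linecount-repo > report.json and pass the parsed file to the function.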
    

    Alternatives

    Run the commands manually

    If you don’t want to bother saving and installing the shell script, you can run the commands manually. An example:

    $ git clone --depth 1 https://github.com/evalEmpire/perl5i.git
    $ cloc perl5i
    $ rm -rf perl5i
    

    Linguist

    If you want the results to match GitHub’s language percentages exactly, you can try installing Linguist instead of CLOC. According to its README, you need to gem install linguist and then run linguist. I couldn’t get it to work (issue #2223).

  • 2020-12-04 05:00

    Open terminal and run the following:

    curl 'https://api.codetabs.com/v1/loc?github=username/reponame'
    
  • 2020-12-04 05:02

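    Inside a local clone of the repository, you can run something like

    git ls-files -z | xargs -0 wc -l
    

    which prints the line count of every tracked file, followed by the total. The -z and -0 options keep file names containing spaces intact.

    Or use this tool: http://line-count.herokuapp.com/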
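
    With a large number of files, xargs may split the list across several wc invocations, so the output can contain more than one "total" line. Below is a minimal sketch of adding up the per-file counts yourself, assuming the usual "<count> <path>" line format of wc -l. sumWcOutput is a hypothetical helper name and the sample text is made up.

```javascript
'use strict';

// Hypothetical helper: adds up the per-file counts in `wc -l` output and
// skips the "total" summary lines. Assumes no tracked file is itself
// named "total".
function sumWcOutput(text) {
    return text
        .split('\n')
        .map(line => line.trim().match(/^(\d+)\s+(.+)$/))
        .filter(match => match !== null && match[2] !== 'total')
        .reduce((sum, match) => sum + Number(match[1]), 0);
}

// Made-up sample resembling the output of two separate wc batches.
const sampleWcOutput = [
    '  120 src/index.js',
    '   30 README.md',
    '  150 total',
    '   45 test/a.test.js',
    '    5 test/b.test.js',
    '   50 total',
].join('\n');

console.log(sumWcOutput(sampleWcOutput)); // 200
```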

  • 2020-12-04 05:07

    If you go to the graphs/contributors page, you can see a list of all the contributors to the repo and how many lines they've added and removed.

    Unless I'm missing something, subtracting the aggregate number of lines deleted from the aggregate number of lines added among all contributors should yield the total number of lines of code in the repo. (EDIT: it turns out I was missing something after all. Take a look at orbitbot's comment for details.)

    UPDATE:

    This data is also available in GitHub's API. So I wrote a quick script to fetch the data and do the calculation:

    'use strict';
    
    // Sums (additions - deletions) over every week of every contributor.
    function countGithub(repo) {
        return fetch('https://api.github.com/repos/' + repo + '/stats/contributors')
            .then(response => response.json())
            .then(contributors => contributors
                .map(contributor => contributor.weeks
                    .reduce((lineCount, week) => lineCount + week.a - week.d, 0)))
            // start from 0 so an empty contributor list does not throw
            .then(lineCounts => lineCounts.reduce((lineTotal, lineCount) => lineTotal + lineCount, 0))
            .then(lines => window.alert(lines));
    }
    
    countGithub('jquery/jquery'); // or count anything you like

    Just paste it in a Chrome DevTools snippet, change the repo and click run.

    Disclaimer (thanks to lovasoa):

    Take the results of this method with a grain of salt, because for some repos (sorich87/bootstrap-tour) it results in negative values, which might indicate there's something wrong with the data returned from GitHub's API.

    UPDATE:

    Looks like this method to calculate total line numbers isn't entirely reliable. Take a look at orbitbot's comment for details.

  • 2020-12-04 05:07

    You can use the GitHub API to get the SLOC, as in the following function:

    function getSloc(repo, tries = 3) {
    
        // repo is the repo's path with a leading slash, e.g. "/owner/name",
        // because it is appended directly to the base URL below
        if (!repo) {
            return Promise.reject(new Error("No repo provided"));
        }
    
        // GitHub's API may return an empty object the first time it is accessed,
        // so we try several times and then stop (the default of 3 is arbitrary)
        if (tries <= 0) {
            return Promise.reject(new Error("Too many tries"));
        }
    
        const url = "https://api.github.com/repos" + repo + "/stats/code_frequency";
    
        return fetch(url)
            .then(x => x.json())
            // each entry is [timestamp, additions, deletions]; deletions are negative
            .then(x => x.reduce((total, changes) => total + changes[1] + changes[2], 0))
            .catch(err => getSloc(repo, tries - 1));
    }
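
    The function above retries immediately and only because summing an empty object throws. Below is a minimal sketch of a variant that waits between attempts, assuming the endpoint answers with HTTP 202 while the statistics are still being computed. fetchStats, sumCodeFrequency, fetchImpl and delayMs are hypothetical names, and the fetch function is passed in so the logic can be tried without network access.

```javascript
'use strict';

// Hypothetical helper: polls a GitHub statistics URL until it stops
// answering 202 (statistics still being computed) or attempts run out.
async function fetchStats(url, { tries = 3, delayMs = 1000, fetchImpl = fetch } = {}) {
    for (let attempt = 1; attempt <= tries; attempt++) {
        const response = await fetchImpl(url);
        if (response.status === 200) {
            return response.json();
        }
        if (response.status !== 202) {
            throw new Error('Unexpected status ' + response.status);
        }
        // Wait before asking again, except after the last attempt.
        if (attempt < tries) {
            await new Promise(resolve => setTimeout(resolve, delayMs));
        }
    }
    throw new Error('Too many tries');
}

// Each row is [timestamp, additions, deletions]; deletions are negative,
// so adding both columns gives the net change in lines.
function sumCodeFrequency(rows) {
    return rows.reduce((total, [, added, deleted]) => total + added + deleted, 0);
}
```

    It could be used as fetchStats('https://api.github.com/repos/jquery/jquery/stats/code_frequency').then(sumCodeFrequency).then(console.log).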
    

    Personally, I made a Chrome extension that shows the number of SLOC on both the GitHub project list and the project detail page. You can also set your personal access token to access private repositories and get a higher API rate limit.

    You can download it here: https://chrome.google.com/webstore/detail/github-sloc/fkjjjamhihnjmihibcmdnianbcbccpnn

    The source code is available here: https://github.com/martianyi/github-sloc

  • 2020-12-04 05:07

    Hey all, this is ridiculously easy...

    1. Create a new branch from your first commit
    2. When you want to find out your stats, open a new PR from main into that branch
    3. The PR will show you the number of changed lines - as you're doing a PR against the first commit, all your code will be counted as new lines

    And the added benefit is that if you don't approve the PR and just leave it in place, the stats (number of commits, files changed and total lines of code) will simply stay up to date as you merge changes into main. :) Enjoy.
