I thought it would be neat if it were possible to take a Git repository, run some script, and have it generate the number of lines in the code base, and the proportion of each a
You could try to parse the output of git-blame. This command gives the last person that edited each line of a file.
This example is not exactly what you want but I think it gives you the idea:
git blame -e the/file | awk -F '<|>' '{print $2}' | sort | uniq -c
This prints each author's e-mail address together with the number of lines in the file that they were the last to modify, for example:
47 foo@bar.com
34712 blah@baz.com
To make it run on the whole repository, you can do something like this:
git ls-files | while IFS= read -r f; do git blame -e -- "$f"; done | awk -F '<|>' '{print $2}' | sort | uniq -c
The idea here is to first generate the list of tracked files with git ls-files, then run git blame on each of them (using the snippet mentioned here), and finally feed the combined output through the same awk, sort and uniq steps as above. If you're running this on a large codebase, you may want to store intermediate results in temporary files rather than use pipes.
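Since the question also asks for each author's proportion, here is a rough sketch of how the counts could be turned into percentages. The function names blame_emails and summarize_authors are made up for this example and are not part of Git. It assumes a POSIX shell and awk, and that e-mail addresses contain no whitespace.

```shell
#!/bin/sh

# Hypothetical helper: blame every tracked file and print one author
# e-mail per line of code. Must be run from inside a Git repository.
blame_emails() {
    git ls-files | while IFS= read -r f; do
        git blame -e -- "$f"
    done | awk -F '<|>' '{print $2}'
}

# Hypothetical helper: read one e-mail per line on stdin and print
# "<lines> <percent>% <email>", largest share first.
summarize_authors() {
    sort | uniq -c | sort -rn | awk '
        { count[NR] = $1; email[NR] = $2; total += $1 }
        END {
            for (i = 1; i <= NR; i++)
                printf "%d %.1f%% %s\n", count[i], 100 * count[i] / total, email[i]
        }'
}

# Usage, from the top of a repository:
#   blame_emails | summarize_authors
```

The sum of the counts is also the total number of blamed lines, so the same output answers the "how many lines" part of the question. On a big repository you could redirect blame_emails to a temporary file first and run summarize_authors on that file, so the slow blame pass only happens once.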