I'm currently thinking of changing my VCS from Subversion to Git. Is it possible to limit the file size within a commit in a Git repository? For e. g. subversion there is
If you are using gitolite, you can also try a VREF. One VREF is already provided by default, called MAX_NEWBIN_SIZE (the code is in gitolite/src/VREF/MAX_NEWBIN_SIZE). It works like this:
repo name
    RW+ = username
    - VREF/MAX_NEWBIN_SIZE/1000 = usernames
Here 1000 is an example threshold in bytes.
This VREF works like an update hook: it rejects your push if any file you are pushing is larger than the threshold.
From what I have seen, it is going to be very rare for someone to check in a file of, say, 200 MB or more.
While you can prevent this with server-side hooks (I am not sure about client-side hooks, since you have to rely on each person having the hooks installed), much like you would in SVN, you should also take into account that in Git it is much, much easier to remove such a file or commit from the repository. You did not have that luxury in SVN, at least not in an easy way.
I struggled with this for a while, even with the description, and I think it is relevant for others too, so I thought I'd post an implementation of what J16 SDiZ described.
So, here is my take on a server-side update hook that prevents files that are too big from being pushed:
#!/bin/bash
# Script to limit the size of a push to git repository.
# Git repo has issues with big pushes, and we shouldn't have a real need for those
#
# eis/02.02.2012
# --- Safety check, should not be run from command line
if [ -z "$GIT_DIR" ]; then
    echo "Don't run this script from the command line." >&2
    echo " (if you want, you could supply GIT_DIR then run" >&2
    echo " $0 <ref> <oldrev> <newrev>)" >&2
    exit 1
fi
# Test that tab replacement works, issue in some Solaris envs at least
testvariable=`echo -e "\t" | sed 's/\s//'`
if [ "$testvariable" != "" ]; then
    echo "Environment check failed - please contact git hosting." >&2
    exit 1
fi
# File size limit is meant to be configured through 'hooks.filesizelimit' setting
filesizelimit=$(git config hooks.filesizelimit)
# If we haven't configured a file size limit, use default value of about 100M
if [ -z "$filesizelimit" ]; then
    filesizelimit=100000000
fi
# Reference to incoming checkin can be found at $3
refname=$3
# With this command, we can find information about the file coming in that has biggest size
# We also normalize the line for excess whitespace
biggest_checkin_normalized=$(git ls-tree --full-tree -r -l "$refname" | sort -k 4 -n -r | head -1 | sed 's/^ *//;s/ *$//;s/\s\{1,\}/ /g' )
# Based on that, we can find what we are interested about
filesize=$(echo "$biggest_checkin_normalized" | cut -d ' ' -f4)
# Actual comparison
# To cancel a push, we exit with status code 1
# It is also a good idea to print out some info about the cause of rejection
if [ "$filesize" -gt "$filesizelimit" ]; then
    # To be more user-friendly, we also look up the name of the offending file
    # (-f5- keeps a file name that contains spaces in one piece)
    filename=$(echo "$biggest_checkin_normalized" | cut -d ' ' -f5-)
    echo "Error: Too large push attempted." >&2
    echo >&2
    echo "File size limit is $filesizelimit, and you tried to push file named $filename of size $filesize." >&2
    echo "Contact configuration team if you really need to do this." >&2
    exit 1
fi
exit 0
Note that, as has been pointed out in comments, this code only checks the tree of the latest commit, so it would need to be tweaked to iterate over the commits between $2 and $3 and check each of them.
Yes, Git has hooks as well (git hooks). But it kind of depends on the actual workflow you will be using.
If you have inexperienced users, it is much safer to pull from them than to let them push. That way, you can make sure they won't screw up the main repository.
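One possible way to cover every commit in a push, as a rough sketch rather than a tested replacement: list each blob the push introduces with git rev-list --objects and read the sizes with git cat-file --batch-check. The function names list_new_blobs, filter_oversized and check_push are made up for this illustration; the limit is assumed to come from the same hooks.filesizelimit setting as in the script above.

```shell
#!/bin/bash
# Sketch only: check every blob introduced by a push, not just the tip tree.
# Assumes it is used from an update hook called as: update <refname> <oldrev> <newrev>

# Print "<size> <path>" for each blob that the push introduces.
list_new_blobs() {
    local oldrev=$1 newrev=$2
    local -a range
    if [ -z "${oldrev//0/}" ]; then
        # New branch (oldrev is all zeros): walk what no existing ref reaches yet
        range=("$newrev" --not --all)
    else
        range=("$oldrev..$newrev")
    fi
    git rev-list --objects "${range[@]}" |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        sed -n 's/^blob //p'
}

# Read "<size> <path>" lines on stdin and print those above the limit in $1.
# Exit status is 0 when at least one line was over the limit, 1 otherwise.
filter_oversized() {
    awk -v limit="$1" '$1 + 0 > limit + 0 { print; found = 1 } END { exit !found }'
}

# Return 1 (reject) when the push from $1 to $2 adds a blob above limit $3.
check_push() {
    local oldrev=$1 newrev=$2 limit=$3 offenders
    # Branch deletion (newrev is all zeros): nothing to check
    [ -z "${newrev//0/}" ] && return 0
    if offenders=$(list_new_blobs "$oldrev" "$newrev" | filter_oversized "$limit"); then
        echo "Error: these files are over the limit of $limit bytes:" >&2
        echo "$offenders" >&2
        return 1
    fi
    return 0
}
```

In the update hook above, the size comparison could then be replaced by something like: check_push "$2" "$3" "$filesizelimit" || exit 1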
You can use a hook, either a pre-commit hook (on the client) or an update hook (on the server). Run git ls-files --cached (for pre-commit) or git ls-tree --full-tree -r -l $3 (for update) and act accordingly.
git ls-tree -l would give something like this:
100644 blob 97293e358a9870ac4ddf1daf44b10e10e8273d57 3301 file1
100644 blob 02937b0e158ff8d3895c6e93ebf0cbc37d81cac1 507 file2
Grab the fourth column, which is the size. Use git ls-tree --full-tree -r -l HEAD | sort -k 4 -n -r | head -1 to get the largest file, cut to extract the fields, if [ a -lt b ] to check the size, etc.
Sorry, I think if you are a programmer, you should be able to do this yourself.
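For what it's worth, the "grab the fourth column" step could be sketched like this. It assumes the usual git ls-tree -l layout, where a tab separates the metadata from the path, so that paths containing spaces stay in one piece; the function name largest_blob is made up for this example.

```shell
#!/bin/bash
# Sketch only: read `git ls-tree -r -l` lines on stdin and print
# "<size> <path>" for the largest blob.
largest_blob() {
    awk -F '\t' '
        {
            # $1 is "<mode> <type> <object> <size>", $2 is the path
            n = split($1, meta, " ")
            # Skip non-blob entries such as submodules, whose size is "-"
            if (meta[2] != "blob") next
            if (!found || meta[n] + 0 > max) {
                max = meta[n] + 0   # numeric value, used for comparing
                size = meta[n]      # original text, used for printing
                path = $2
                found = 1
            }
        }
        END { if (found) print size, path }
    '
}
```

In an update hook it would be fed like this: git ls-tree --full-tree -r -l "$3" | largest_blob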
I am using gitolite and the update hook was already in use, so instead of the update hook I used the pre-receive hook. The script posted by Chriki worked fabulously, except that a pre-receive hook gets its data via stdin, so I made a one-line change:
- refname=$3
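+ read oldrev refname pushedref
(Each stdin line has the form "<oldrev> <newrev> <refname>", so the second field is the new revision that the script expects in refname. This only reads the first line. There may be a more elegant way to do that, but it works.)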
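A slightly more complete variant, as a sketch: a pre-receive hook gets one "<oldrev> <newrev> <refname>" line on stdin for every ref in the push, and the pushed refs have not been updated yet when it runs, so it is the new revision (second field) that needs checking. The helper name new_revisions and the check_size function mentioned in the comments are hypothetical.

```shell
#!/bin/bash
# Sketch only: collect the new revision of every ref in a push.
# pre-receive input on stdin, one line per ref: <oldrev> <newrev> <refname>
new_revisions() {
    local oldrev newrev refname
    while read -r oldrev newrev refname; do
        # An all-zero newrev means the ref is being deleted: nothing to check
        [ -z "${newrev//0/}" ] && continue
        echo "$newrev"
    done
}

# Possible use inside the hook, with check_size standing in for the size test
# from the script above (hypothetical function):
#
#   for rev in $(new_revisions); do
#       check_size "$rev" || exit 1
#   done
```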