For a Node.js project I need to determine the hash of my folder to check the version. Currently, I made a script to test my code (without filesystem, directly against the git API for my
Your code contains a problem; however, fixing it doesn't remove the discrepancy in the A2 case.
The official git algorithm for computing hashes on trees drops leading zeros from the mode field. In your examples, that field contains the values 100644 and 040000, and the latter is recorded by git as 40000.
Proof:
$ git cat-file tree 4ef57de8e81c8415d6da2b267872e602b1f28cfe|hexdump -C
00000000 31 30 30 36 34 34 20 2e 63 6f 76 65 72 61 67 65 |100644 .coverage|
00000010 72 63 00 44 91 70 d0 fa eb 75 18 23 10 34 55 64 |rc.D.p...u.#.4Ud|
00000020 fd 18 11 c0 b9 fd 73 31 30 30 36 34 34 20 2e 65 |......s100644 .e|
00000030 64 69 74 6f 72 63 6f 6e 66 69 67 00 75 88 49 36 |ditorconfig.u.I6|
00000040 ea 2d 35 b5 31 af 88 6a ca d7 47 d4 fd 9b 2a 9e |.-5.1..j..G...*.|
00000050 31 30 30 36 34 34 20 2e 66 6c 61 6b 65 38 00 69 |100644 .flake8.i|
00000060 e8 72 e3 0d 30 f5 c7 de 32 76 d2 89 d6 ae e8 1c |.r..0...2v......|
00000070 cf 4a f7 34 30 30 30 30 20 2e 67 69 74 68 75 62 |.J.40000 .github|
...                                                          ^^^^^
...                                                          !!!!!
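The entry layout visible in the dump (mode, space, name, NUL byte, 20 raw hash bytes) can be sketched in Python as follows. This is only an illustration, not git's own implementation: the names normalize_mode and tree_hash are made up for this example, the entries are assumed to be passed in the order git already stores them, and SHA-1 object names are assumed.

```python
import hashlib


def normalize_mode(mode: str) -> str:
    """Return the octal mode without leading zeros, as seen in the dump above."""
    # Parsing as octal and re-formatting turns "040000" into "40000"
    # and leaves "100644" unchanged.
    return format(int(mode, 8), "o")


def tree_hash(entries):
    """Compute the SHA-1 of a tree object.

    entries: iterable of (mode, name, sha_hex) tuples.
    Assumption: the entries are already in the order git stores them.
    """
    body = b""
    for mode, name, sha_hex in entries:
        # Each entry is "<mode> <name>\0" followed by the 20 raw hash bytes.
        body += normalize_mode(mode).encode() + b" " + name.encode() + b"\0"
        body += bytes.fromhex(sha_hex)
    # The object header is "tree <body length in bytes>\0".
    header = b"tree " + str(len(body)).encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()
```

With this sketch, an entry given as 040000 and the same entry given as 40000 produce the same hash, because the mode is normalized before the entry is serialized.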
However, adding removal of leading zeros [1] to your Perl script still doesn't fix the A2 case (the computed hash changes, but it still differs from the expected one):
$ cat main.sh
XX="$(perl -sane '$F[2] =~ s/(..)/\\x$1/g ; $F[0] =~ s/^0+//g ; print $F[0]." ".$F[1]."\\"."x00".$F[2]' output_a1)"
SIZE=$(echo -en "$XX" | wc -c)
echo "original: 8d66139b3acf78fa50e16383693a161c33b5e048 output:"
echo -en "tree $SIZE\x00$XX" | sha1sum
XX="$(perl -sane '$F[2] =~ s/(..)/\\x$1/g ; $F[0] =~ s/^0+//g ; print $F[0]." ".$F[1]."\\"."x00".$F[2]' output_a2)"
SIZE=$(echo -en "$XX" | wc -c)
echo "original: 4ef57de8e81c8415d6da2b267872e602b1f28cfe output:"
echo -en "tree $SIZE\x00$XX" | sha1sum
$ ./main.sh
original: 8d66139b3acf78fa50e16383693a161c33b5e048 output:
8d66139b3acf78fa50e16383693a161c33b5e048 -
original: 4ef57de8e81c8415d6da2b267872e602b1f28cfe output:
c5c701b8114582e3bb2e353aac157a7febfcd33b -
The explanation is that the A2 hash 4ef57de8e81c8415d6da2b267872e602b1f28cfe points to a commit object rather than a tree. That commit object in turn refers to the tree with hash c5c701b8114582e3bb2e353aac157a7febfcd33b, which is exactly the value computed by the fixed code:
$ git cat-file -t 4ef57de8e81c8415d6da2b267872e602b1f28cfe
commit
$ git cat-file -p 4ef57de8e81c8415d6da2b267872e602b1f28cfe
tree c5c701b8114582e3bb2e353aac157a7febfcd33b
parent 502a88b41161ec7dbff0862e3d805db397caf366
...
Had you used the tree hash rather than the commit hash for the A2 query, you wouldn't have had any problem in the first place.
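If all you have is a commit hash, the tree hash can be read from the first line of the commit object, as the git cat-file -p output above shows. A minimal Python sketch of that step, assuming the text passed in is exactly that output and that object names are 40-character SHA-1 hashes (tree_of_commit is a hypothetical helper name):

```python
def tree_of_commit(commit_text: str) -> str:
    """Extract the tree hash from the output of `git cat-file -p <commit>`."""
    lines = commit_text.splitlines()
    if not lines:
        raise ValueError("empty object text")
    # A commit object starts with a line of the form "tree <hash>".
    keyword, _, tree_sha = lines[0].partition(" ")
    if keyword != "tree" or len(tree_sha) != 40:
        raise ValueError("not a commit object: first line is %r" % lines[0])
    return tree_sha
```

Applied to the output shown above, it returns c5c701b8114582e3bb2e353aac157a7febfcd33b. From the command line, git rev-parse with the commit hash followed by ^{tree} should give the same result without any parsing.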
An arguable issue with the GitHub API is that it silently resolves a commit hash to the underlying tree instead of returning an error or including in the response an indication of what happened (for example, by setting the sha field to the hash of the tree rather than the query value).
[1] The quick-and-dirty fix fails in one case: when the mode field consists only of zeros, it is erased completely instead of being replaced by a single zero. However, that case cannot occur in practice, since an object with such a mode value would simply be inaccessible to git.
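For completeness, the all-zeros corner case can be avoided with a lookahead that forces at least one character to survive. The sketch below shows the idea in Python (strip_leading_zeros is a hypothetical name); the same pattern should carry over to the Perl one-liner as s/^0+(?=.)//.

```python
import re


def strip_leading_zeros(mode: str) -> str:
    """Drop leading zeros from a mode string, but never return an empty string."""
    # The lookahead (?=.) requires one more character after the stripped zeros,
    # so an all-zero mode collapses to a single "0" instead of "".
    return re.sub(r"^0+(?=.)", "", mode)
```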