I'm studying through man gitglossary, and this one term has eluded me, because it isn't defined in the glossary at all.
It's referred to only twice (aster
Regarding git itself, the first mention of an "alternate object database location" appeared in commit ace1534 (May 2005, git v0.99):
Introduce SHA1_FILE_DIRECTORIES to support multiple object databases.
SHA1_FILE_DIRECTORIES environment variable is a colon separated paths used when looking for SHA1 files not found in the usual place for reading. Creating a new SHA1 file does not use this alternate object database location mechanism. This is useful to archive older, rarely used objects into separate directories.
That was only a first implementation, and it was quickly removed from git (Sept. 2005, commit a9ab586).
The alternate object database struct was formally introduced in commit 9a217f2 (June 2005, v0.99) in cache.h#L236-L239.
Today (most recent cache.h), that struct is still there, but this time with a chaining mechanism, introduced in Aug. 2005, v0.99.5, commit d5a63b9:
    extern struct alternate_object_database {
        struct alternate_object_database *next;
        char *name;
        char base[FLEX_ARRAY]; /* more */
    } *alt_odb_list;
Prepare alternate object database registry.
The variable alt_odb_list points at the list of struct alternate_object_database. The elements on this list come from non-empty elements from colon separated ALTERNATE_DB_ENVIRONMENT environment variable, and GIT_OBJECT_DIRECTORY/info/alternates, whose contents is exactly in the same format as that environment variable. Its base points at a statically allocated buffer that contains "/the/directory/corresponding/to/.git/objects/...", while its name points just after the slash at the end of ".git/objects/" in the example above, and has enough space to hold 40-byte hex SHA1, an extra slash for the first level indirection, and the terminating NUL.
That is probably the closest definition of the "alternates mechanism" you can find in git sources.
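To make that comment more concrete, here is a minimal sketch of how such a list could be built from a colon-separated string, and how the name slot gets filled in for a given object. It is not git's actual code: the names alt_odb, alt_odb_parse, alt_odb_object_path and HEX_LEN are hypothetical, and each entry is heap-allocated with a C99 flexible array member instead of git's FLEX_ARRAY macro.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define HEX_LEN 40 /* length of a hex SHA-1 (hypothetical constant) */

/* Hypothetical stand-in for struct alternate_object_database. */
struct alt_odb {
    struct alt_odb *next;
    char *name;  /* points into base, just past the trailing '/' */
    char base[]; /* "<dir>/" plus room for "xx/" + 38 hex + NUL */
};

/* Build a list from a colon-separated string, skipping empty elements. */
struct alt_odb *alt_odb_parse(const char *paths)
{
    struct alt_odb *head = NULL, **tail = &head;
    const char *p = paths;

    while (*p) {
        const char *end = strchr(p, ':');
        size_t len = end ? (size_t)(end - p) : strlen(p);

        if (len > 0) {
            /* dir + '/' + 40 hex + one inner '/' + NUL */
            struct alt_odb *ent = malloc(sizeof(*ent) + len + 1 + HEX_LEN + 2);
            if (!ent)
                break;
            memcpy(ent->base, p, len);
            ent->base[len] = '/';
            ent->name = ent->base + len + 1;
            ent->name[0] = '\0';
            ent->next = NULL;
            *tail = ent;
            tail = &ent->next;
        }
        if (!end)
            break;
        p = end + 1;
    }
    return head;
}

/* Write "xx/yyyy..." into the name slot and return the full path. */
const char *alt_odb_object_path(struct alt_odb *ent, const char *hex)
{
    assert(strlen(hex) == HEX_LEN);
    memcpy(ent->name, hex, 2);                   /* first-level directory */
    ent->name[2] = '/';                          /* the "extra slash" */
    memcpy(ent->name + 3, hex + 2, HEX_LEN - 2); /* remaining 38 hex digits */
    ent->name[HEX_LEN + 1] = '\0';
    return ent->base;
}

/* Release every entry of the list. */
void alt_odb_free(struct alt_odb *list)
{
    while (list) {
        struct alt_odb *next = list->next;
        free(list);
        list = next;
    }
}
```

A reader looking for an object would try the main objects directory first, then walk this list and test each path returned by alt_odb_object_path. As the SHA1_FILE_DIRECTORIES commit message says of the original mechanism, creating new objects does not go through the alternates.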
You can see an example of an alternate database implementation in libgit2 (libgit2 is an implementation of Git written in pure C):
There are just two main structures in the heart of a Git repo, on which everything is based: There is the object database and there is the ref database.
The object database is where all the data is stored. The contents of all files, the structures of directories, the commits, everything, goes in the object database. However, what's remarkable about the object database is that it's essentially nothing but a key-value store.
Git stores data in the object database using a hash-based retrieval, meaning that the keys of the store are the (SHA1) hashes of the values.
That has some interesting further implications: The values in the object database are essentially immutable and you don't need an update operation.
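The key-value view described above can be illustrated with a toy content-addressed store. This is only a sketch under loud assumptions: toy_hash is FNV-1a, used as a small stand-in for SHA-1, and toy_odb, toy_odb_put and toy_odb_get are hypothetical names, not part of git or libgit2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_OBJECTS 64

/* FNV-1a, used here only as a small stand-in for SHA-1. */
static uint64_t toy_hash(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint64_t h = 14695981039346656037ULL;

    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 1099511628211ULL;
    }
    return h;
}

struct toy_object {
    uint64_t key; /* hash of the content */
    size_t len;
    void *data;
};

struct toy_odb {
    struct toy_object objs[MAX_OBJECTS];
    size_t count;
};

/* Look a value up by key; returns NULL when it is absent. */
const struct toy_object *toy_odb_get(const struct toy_odb *db, uint64_t key)
{
    for (size_t i = 0; i < db->count; i++)
        if (db->objs[i].key == key)
            return &db->objs[i];
    return NULL;
}

/*
 * Store a value; the key is derived from the content, never chosen by
 * the caller. Storing the same content twice is a no-op, which is why
 * no update operation is needed. Returns 0 on success, -1 on failure.
 */
int toy_odb_put(struct toy_odb *db, const void *data, size_t len, uint64_t *key)
{
    void *copy;

    assert(db && data && key);
    *key = toy_hash(data, len);
    if (toy_odb_get(db, *key))
        return 0; /* already stored under the same key */
    if (db->count == MAX_OBJECTS)
        return -1;
    copy = malloc(len ? len : 1);
    if (!copy)
        return -1;
    memcpy(copy, data, len);
    db->objs[db->count++] = (struct toy_object){ *key, len, copy };
    return 0;
}
```

Changing a value necessarily produces a different key, so "modifying" an object really means adding a new one next to the old one.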
Instead of storing the object database and the ref database in the way Git usually does it (in flat files), you can provide your own backend implementation and do whatever you want.
Git traditionally supports:
odb_loose implements the loose file format backend. It accesses each object in a separate file within the objects directory, with the name of each file corresponding to the SHA1 hash of its contents.
odb_pack implements the packfile backend. It accesses the objects in Git packfiles, which is a file format used for both space-efficient storage of objects, and for transferring the objects when pushing or pulling.
(see also "Is the git binary diff algorithm (delta storage) standardized?")