Can someone clarify the meaning of these terms? Are tracked files any files that have, at some point, been added to the stage? Is the "index" the same as the "stage"? Are al
There are three things to consider here: the current commit (known variously as HEAD or @), the index, and the work-tree.
The index is also called the staging area and the cache. These names reflect its various functions, because the index does more than just hold the contents of the proposed next commit. Its use as a cache is mostly invisible, though: you just use Git, and the cache tricks that make Git go fast are all done under the hood, with no manual intervention necessary. So you only need "cached" to remember that some commands use --cached, e.g., git diff --cached and git rm --cached. Some of these have additional names (git diff --staged), and some don't.
Git is not very consistent about where it uses each of these terms, so you must simply memorize them. One issue seems to be that for many users, "the index" is mysterious. This is probably because you can't see it directly, except using git ls-files (which is not a user-friendly command: it's meant for programming, not for daily use).
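As a rough illustration of what git ls-files can show, here is a minimal sketch. It assumes a throwaway repository created with mktemp; the file name file.txt and its contents are made up for the demo. The --stage option asks for one line per index entry (mode, blob hash, stage number, path).

```shell
set -e
repo=$(mktemp -d)          # throwaway repository (assumption for this demo)
cd "$repo"
git init -q

echo hello > file.txt      # hypothetical file
git add file.txt           # put a copy of it into the index

# List the index entries: mode, blob hash, stage number, then the path.
entry=$(git ls-files --stage)
echo "$entry"
```

The output is terse and script-oriented, which is why day-to-day work usually goes through git status instead.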
Note that the work-tree (also called the working tree and sometimes the work directory or working directory) is quite separate from the index. You can see, and modify, files in the work-tree quite easily.
I once thought "tracked" was more complicated, but it turns out that "tracked" quite literally means "is in the index". A file is tracked if and only if git ls-files shows that it will be in the next commit.
You cannot see files in the index so easily, but you can easily copy a file from the work-tree into the index, using git add:
git add path/to/file.txt
copies the file from the work-tree into the index. If it was not already in the index (was not tracked), it is now in the index (is tracked).
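A minimal sketch of that transition, assuming a scratch repository made with mktemp (the contents of the file are invented for illustration). It captures what git ls-files reports before and after the git add:

```shell
set -e
repo=$(mktemp -d)              # scratch repository (assumption for this demo)
cd "$repo"
git init -q

mkdir -p path/to
echo hello > path/to/file.txt

# Nothing is in the index yet, so ls-files prints nothing: the file is untracked.
before=$(git ls-files)

git add path/to/file.txt       # copy the work-tree version into the index

# Now ls-files lists the path: the file is tracked, even though no commit exists.
after=$(git ls-files)

echo "before: '$before'"
echo "after:  '$after'"
```

Note that no commit is involved at any point; being in the index is all it takes.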
Hence:
Are tracked files any files that have, at some point, been added to the stage?
No! Tracked files are files that are in the index right now. It does not matter what happened in the past, in any commit. If some path path/to/file.txt is present in the index right now, that file is tracked. If not, it is not tracked (and is potentially also ignored).
If path/to/file.txt is in the index now, and you take it out, the file is no longer tracked. It may or may not be in any existing commits, and it may or may not still be in the work-tree.
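One way to take a file out of the index while leaving everything else alone is git rm --cached. The sketch below assumes a scratch repository and a made-up identity for the commit; it checks the three places separately.

```shell
set -e
repo=$(mktemp -d)                       # scratch repository (assumption)
cd "$repo"
git init -q
git config user.name "Example"          # hypothetical identity so the commit succeeds
git config user.email "example@example.com"

mkdir -p path/to
echo hello > path/to/file.txt
git add path/to/file.txt
git commit -q -m "add file"

git rm -q --cached path/to/file.txt     # remove the entry from the index only

tracked=$(git ls-files)                        # index: now empty, file is untracked
in_commit=$(git ls-tree -r --name-only HEAD)   # existing commit: still lists the file
in_worktree=no
if [ -f path/to/file.txt ]; then in_worktree=yes; fi   # work-tree: file still there

echo "index: '$tracked'  commit: '$in_commit'  work-tree: $in_worktree"
```

Here the file ends up untracked while still present both in the last commit and in the work-tree, which is one of the combinations described above.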
Is the "index" the same as the "stage"?
Yes, more or less. Various documentation and people are not very consistent about this.
Are all staged files tracked, but the reverse is not necessarily true (namely, files that were once staged and committed, but aren't part of the current stage to be committed)?
This question doesn't quite make sense, since "the staging area" is the index. I think "staged" doesn't have a perfectly defined meaning, but I would define it this way. A file is staged if:
- it is not in @ / HEAD, but is in the index, or
- it is in both @ / HEAD and the index, and is different in the two.
Equivalently, you could say "when some path is being called staged, that means that if I make a new commit right now, the new commit's version of that file will be different from the current commit's version." Note that if you have not touched a file in any way, so that it's in the current commit and in the index and in the work-tree, but all three versions match, the file is still going to get committed. It's just neither "staged" nor "modified".
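Under that working definition, the staged files are the ones git diff --cached reports, while git ls-files lists everything the next commit will contain. A small sketch, assuming a scratch repository and three invented file names:

```shell
set -e
repo=$(mktemp -d)                       # scratch repository (assumption)
cd "$repo"
git init -q
git config user.name "Example"          # hypothetical identity for the commit
git config user.email "example@example.com"

echo one > unchanged.txt
echo one > edited.txt
git add unchanged.txt edited.txt
git commit -q -m "initial"

echo two > edited.txt                   # in HEAD and index, about to differ
echo new > brand-new.txt                # not in HEAD at all
git add edited.txt brand-new.txt

# Paths whose index version differs from (or is missing in) HEAD: "staged".
staged=$(git diff --cached --name-only)
# Every path in the index: all of these go into the next commit.
tracked=$(git ls-files)

printf 'staged:\n%s\n\ntracked:\n%s\n' "$staged" "$tracked"
```

unchanged.txt appears only in the second list: it will be committed again, but it is neither staged nor modified.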
How do I know which files are tracked?
While git ls-files can tell you, the usual way to find out is indirect: you run git status.
How do I know which files are staged?
Assuming the definition above, you must ask Git to diff the current commit (HEAD / @) and the index. Whatever is different between them is "staged". Running git status will do this diff for you, and report the names of such files (without showing detailed diffs).
To get the detailed diffs, you can run git diff --cached, which compares HEAD vs index. This also has the name git diff --staged (which is a better name, but, perhaps just to be annoying, --staged is not available as an option to git rm!).
Because there are three copies of every file, you need two diffs to see what is going on:
git diff --cached
git diff
Running git status runs both of these diffs for you, and summarizes them. You can get an even shorter summary with git status --short, where you will see things like:
     M a.txt
    M  b.txt
    MM c.txt
The first column is the result of comparing HEAD vs index: a blank means the two match, an M means HEAD and index differ. The second column is the result of comparing index vs work-tree: a blank means the two match, an M means they differ. The two Ms in a row mean all three versions of c.txt are different. You can't see the one in the index directly, but you can git diff it!
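To see the two columns line up with the two diffs, here is a sketch that builds those three states in a scratch repository. The identity, commit message, and file contents are assumptions made up for the demo; the file names follow the example above.

```shell
set -e
repo=$(mktemp -d)                       # scratch repository (assumption)
cd "$repo"
git init -q
git config user.name "Example"          # hypothetical identity for the commit
git config user.email "example@example.com"

for f in a.txt b.txt c.txt; do echo v1 > "$f"; done
git add a.txt b.txt c.txt
git commit -q -m "initial"

echo v2 > a.txt                         # changed in the work-tree only
echo v2 > b.txt; git add b.txt          # changed, then copied into the index
echo v2 > c.txt; git add c.txt          # copied into the index...
echo v3 > c.txt                         # ...then changed again in the work-tree

head_vs_index=$(git diff --cached --name-only)   # first column: HEAD vs index
index_vs_tree=$(git diff --name-only)            # second column: index vs work-tree
summary=$(git status --short)

echo "$summary"
```

The summary should show the M for a.txt in the second column, for b.txt in the first, and for c.txt in both, matching the b.txt and c.txt names from the first diff and the a.txt and c.txt names from the second.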