What does “check out code” mean in git documentation for line endings?

Posted by 谁都会走 on 2019-12-06 07:27:03

(Note: I'm trying to answer the underlying question, which seems to really be: If checkout means git checkout, why do I get these messages during git add?)

The documentation on this is all a little bit sloppy, possibly on purpose because the exactly-correct details are somewhat obscure. To understand it well on a conceptual level, you should view line-ending-modification as part of the more general smudge and clean filtering (since this is in fact how it's implemented).

In Git, every file you can work with at the moment exists simultaneously in three places:

the HEAD commit      the index       the work-tree
---------------      ---------       -------------
README.md            README.md       README.md
file.txt             file.txt        file.txt

Files can be copied in various directions, except that all commits are read-only at all times. So you can copy from the HEAD commit into the index, or from the index into the work-tree. You can also copy from the work-tree into the index.

(You can also make a new commit from the index. This leaves the old HEAD commit alone, and the new commit becomes the HEAD commit. So after making a new commit, the HEAD commit and the index match. This is not because we modified any commit; we can't do that. It's because we have added a new commit, made from the index, and then we stop calling the old commit the HEAD and call the new one the HEAD instead.)
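To make the "commits are read-only, HEAD just moves" point concrete, here is a toy Python model. It is only a sketch: the names `make_commit`, `index`, `work_tree` and `head` are invented for illustration, and real Git stores hashed objects rather than dictionaries of file contents.

```python
from types import MappingProxyType

def make_commit(index, parent=None):
    """Freeze a snapshot of the index into a read-only commit (toy model)."""
    return {"parent": parent, "files": MappingProxyType(dict(index))}

# Toy state: file name -> content. Real Git stores blob hashes, not contents.
index = {"README.md": b"hello\n", "file.txt": b"one\n"}
old_head = make_commit(index)
head = old_head

# Edit a file in the work-tree, then copy it into the index (like `git add`).
work_tree = dict(head["files"])
work_tree["file.txt"] = b"one\ntwo\n"
index["file.txt"] = work_tree["file.txt"]

# Committing builds a NEW snapshot from the index; the old commit is untouched.
head = make_commit(index, parent=old_head)
```

After the last line, `head` matches the index, while `old_head` still holds the original `file.txt`. Trying to assign into `head["files"]` raises `TypeError`, which stands in for the fact that no commit can ever be modified.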

Note that the index sits "in the way" between HEAD and work-tree. In order to copy any file from HEAD to work-tree, it must first pass through the index. In order to make a new commit from the work-tree, each new file must pass through the index. Hence, the index/work-tree transitions are where cleaning and smudging take place.

To "clean" a file means to make it ready for committing. This cleaning process can, for instance, translate CRLF line endings into LF-only line endings. With the ident filter it can also undo the substitutions made at checkout, and you can write your own filter driver to do virtually anything. To "smudge" a file means to make it ready for editing and/or use in the work-tree. This can, for instance, translate LF-only line endings into CRLF line endings. As with cleaning, you can use the ident filter or your own filter driver to do anything you want. Git LFS uses such filter drivers to swap between short pointer references and the full file contents.
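As a rough illustration of the line-ending case only, the two directions can be sketched as a pair of Python functions. The names `clean` and `smudge` are borrowed from Git's terminology, but the bodies are my own simplification, not Git's implementation (which also has to decide whether a file is text at all).

```python
def clean(data: bytes) -> bytes:
    """Work-tree -> index: normalize CRLF line endings to LF."""
    return data.replace(b"\r\n", b"\n")

def smudge(data: bytes) -> bytes:
    """Index -> work-tree: expand LF line endings to CRLF."""
    # Normalize first so an existing CRLF does not become CR CR LF.
    return clean(data).replace(b"\n", b"\r\n")
```

For LF-only content the two are inverses: `clean(smudge(x)) == x`. That round trip is what lets the repository keep LF while the work-tree shows CRLF.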

Hence, the exact answer is that line ending conversions are applied during those processes that copy files into or out of the index. The most common are these two:

  • git add copies from work-tree into index.
  • git checkout extracts files to the work-tree, either from a commit into the index and then on to the work-tree, or straight from the index to the work-tree.
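Those two operations can be sketched in Python with the index sitting in the middle. Everything here is a toy model under my own assumptions: `add` and `checkout` are hypothetical helpers that operate on plain dictionaries, and the conversion is unconditional, whereas real Git consults the configuration and .gitattributes first.

```python
def clean(data: bytes) -> bytes:
    """Work-tree -> index: CRLF becomes LF."""
    return data.replace(b"\r\n", b"\n")

def smudge(data: bytes) -> bytes:
    """Index -> work-tree: LF becomes CRLF."""
    return clean(data).replace(b"\n", b"\r\n")

def add(work_tree: dict, index: dict, path: str) -> None:
    """Model of `git add`: copy work-tree -> index, cleaning on the way."""
    index[path] = clean(work_tree[path])

def checkout(index: dict, work_tree: dict, path: str, commit=None) -> None:
    """Model of `git checkout`: optionally commit -> index, then index -> work-tree."""
    if commit is not None:
        index[path] = commit[path]          # no conversion on this hop
    work_tree[path] = smudge(index[path])   # conversion happens here

commit = {"file.txt": b"one\ntwo\n"}        # committed data uses LF
index, work_tree = {}, {}
checkout(index, work_tree, "file.txt", commit=commit)
work_tree["file.txt"] += b"three\r\n"       # edit with CRLF in the work-tree
add(work_tree, index, "file.txt")
```

The conversion only ever happens on the index/work-tree hop. The commit-to-index copy leaves the bytes alone.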

It's only at these times that any of these CRLF-to-LF or LF-to-CRLF conversions occur. But Git has extra code that tries to intuit whether doing these conversions later will result in a change to existing committed data, even if it has not done them yet. That code will give you the warning messages you are seeing:

warning: LF will be replaced by CRLF ...
warning: CRLF will be replaced by LF ...

These warnings come out if you enable the "safe crlf" option. Because they come from different code run at different times, everything can be very confusing.
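The idea behind that extra code can be approximated as a round-trip check: pretend to do the conversion, and see whether the bytes would come back different. The Python below is a sketch of that idea only. `predict_warning` and its `checkout_uses_crlf` parameter are hypothetical (the flag stands in for a setup that writes CRLF on checkout), and Git's real check handles more cases, such as mixed line endings.

```python
from typing import Optional

def predict_warning(data: bytes, checkout_uses_crlf: bool) -> Optional[str]:
    """Guess which warning (if any) adding `data` would produce (toy model)."""
    cleaned = data.replace(b"\r\n", b"\n")      # what would be stored
    if cleaned != data:
        return "CRLF will be replaced by LF"
    # What a later checkout would write back into the work-tree.
    smudged = cleaned.replace(b"\n", b"\r\n") if checkout_uses_crlf else cleaned
    if smudged != data:
        return "LF will be replaced by CRLF"
    return None
```

Note that the function never modifies anything. It only predicts, which is why the message can show up during git add even though the LF-to-CRLF conversion itself would only happen at a later checkout.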

Note that those end-of-line conversion warnings, seen when core.safecrlf is set to true or warn, do not always work correctly on checkout: they can be incorrectly triggered for content that does not use CRLF line endings.

This has been fixed in Git 2.16 (Q1 2018).

See commit 649f1f0 (08 Dec 2017), and commit 86ff70a (26 Nov 2017) by Torsten Bögershausen (tboegi).
(Merged by Junio C Hamano -- gitster -- in commit 720b176, 27 Dec 2017)

convert: tighten the safe autocrlf handling

When a text file had been committed with CRLF and the file is committed again, the CRLF are kept if .gitattributes has "text=auto".
This is done by analyzing the content of the blob stored in the index: If a '\r' is found, Git assumes that the blob was committed with CRLF.

The simple search for a '\r' does not always work as expected:
A file is encoded in UTF-16 with CRLF and committed. Git treats it as binary.
Now the content is converted into UTF-8. At the next commit Git treats the file as text, the CRLF should be converted into LF, but isn't.

Replace has_cr_in_index() with has_crlf_in_index(). When no '\r' is found, 0 is returned directly, this is the most common case.
If a '\r' is found, the content is analyzed more deeply.
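The shape of that change can be sketched in Python. This is a simplified model, not the C code from the commit: the function takes the blob bytes directly, and the "deeper analysis" is reduced to an assumed heuristic (a NUL byte means binary) followed by a search for an actual CR LF pair.

```python
def has_crlf_in_index(blob: bytes) -> bool:
    """Toy model: does the indexed blob look like text committed with CRLF?"""
    # Fast path: no carriage return at all, the most common case.
    if b"\r" not in blob:
        return False
    # Deeper analysis (simplified assumption): NUL bytes mean binary content.
    if b"\0" in blob:
        return False
    # Only a real CR LF pair counts, not a lone '\r'.
    return b"\r\n" in blob
```

With this model, a UTF-16 blob such as `"a\r\n".encode("utf-16-le")` contains a '\r' but is reported as not having CRLF, which is the case the plain '\r' search got wrong.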

Consider a library. When you check a book out of the library, no other patron can have access to the book until you check it back in.

Older centralized version control systems worked similarly; when you "checked out" a file, the system placed a lock on that file and no other user could check out the same file until you checked it back in.

Newer version control systems tend to allow multiple users to work on a file at the same time, requiring incompatible changes to be merged together rather than strictly serializing access to a file. The term "check out" is still used, though, to indicate the process of copying a file from the repository into your working directory.
