Question
Follow-up to "Don't display ^M (carriage return) in git grep output".
In my MinTTY (Cygwin on Windows), git grep displays weird characters instead of accents:
Upon verification, it seems that the filetype is:
ISO-8859 text, with very long lines, with CRLF line terminators
While my MinTTY is set up as UTF-8:
# Text
Font=Powerline Consolas
FontHeight=9
BoldAsFont=yes
BoldAsColour=yes
AllowBlinking=yes
Locale=C
Charset=UTF-8
# Terminal
Term=xterm-256color
Of course, when grepping through files from different repos, we never know which encoding each file uses.
Is there a way to make git grep behave better?
PS- (Side question) What's the color spec for those accents (here displayed in yellow on blue)?
Answer 1:
git grep, much like grep, displays the contents of the file as it would be in the working tree, without any transformation. Unlike grep, though, it pipes its output through less. less honors your environment's locale settings (e.g., the LC_* variables) and renders data accordingly.
If your environment reports UTF-8 and the data is not UTF-8, less escapes the invalid bytes the way you're seeing here. The usual alternatives are a replacement character or nothing at all, neither of which is very useful when viewing binary files.
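To make the mismatch concrete, here is a minimal Python sketch (not part of the original answer). It assumes a sample word, "café", stored as ISO-8859-1, and shows that its bytes are not valid UTF-8. The helper name is_valid_utf8 is hypothetical, and the escaped form only approximates what a pager might show.

```python
# "café" stored as ISO-8859-1: the accented letter is the single byte 0xE9.
latin1_bytes = "café".encode("iso-8859-1")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# A lone 0xE9 byte is not a complete UTF-8 sequence, so strict decoding fails.
valid = is_valid_utf8(latin1_bytes)

# Two common fallbacks a viewer can choose between:
# a replacement character (lossy) or an escape that preserves the byte value.
replaced = latin1_bytes.decode("utf-8", errors="replace")
escaped = latin1_bytes.decode("utf-8", errors="backslashreplace")

print(valid, repr(replaced), repr(escaped))
```

The escaped form keeps the original byte value visible, which is why it is more useful than a replacement character when the input might be binary.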
less has no clue which encoding is being used, and different encodings map the same byte to different Unicode characters and hence to different UTF-8 sequences, so there's no way for it to convert the data automatically. less doesn't even know whether the file is text or binary. file makes a guess about what kind of text is in the file, but it doesn't know for certain, and in the general case distinguishing between single-byte encodings requires extensive linguistic knowledge.
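The ambiguity between single-byte encodings can be illustrated with a short Python sketch. The three candidate encodings are chosen here only for illustration; nothing in the question says which ones are actually in play.

```python
# One byte, three plausible single-byte encodings (illustrative choices).
raw = b"\xe9"
candidates = ["iso-8859-1", "iso-8859-7", "koi8-r"]

# Every candidate decodes the byte without error, so the bytes alone
# cannot reveal which interpretation the author intended.
decoded = {enc: raw.decode(enc) for enc in candidates}

for enc, ch in decoded.items():
    # Each interpretation re-encodes to a different UTF-8 sequence.
    print(enc, ch, ch.encode("utf-8"))
```

All three decodings succeed and produce different characters, so picking the right one requires knowledge from outside the file, such as the language the text is written in.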
So your answer is, no, in the general case, this is not possible.
Source: https://stackoverflow.com/questions/59398963/could-git-correctly-display-iso-latin-1-accents-in-a-utf-8-terminal