How to stop git from breaking encoding on checkout

Posted by 心不动则不痛 on 2019-12-10 18:05:25

Question


I recently added a .gitattributes file to a C# repository with the following settings:

*            text=auto
*.cs         text diff=csharp

I renormalized the repository following these instructions from GitHub and it seemed to work OK.

The problem is that when I check out some files (not all of them), I see lots of weird characters mixed in with the actual code. It seems to happen when Git runs the files through the LF-to-CRLF conversion specified by the .gitattributes file above.

According to Notepad++ the files that get messed up are using UCS-2 Little Endian or UCS-2 Big Endian encoding. The files that seem to work OK are either ANSI or UTF-8 encoded.

For reference my git version is 1.8.0.msysgit.0 and my OS is Windows 8.

Any ideas how I can fix this? Would changing the encoding of the files be enough?


Answer 1:


This happens if you use an encoding where every character is two bytes, such as UCS-2.
In big-endian byte order, CRLF is then encoded as \0\r\0\n.

Git thinks it's a single-byte encoding, so it turns that into \0\r\0\r\n.
This shifts the next line by one byte, so every other line ends up full of Chinese characters (because the \0 becomes the low-order byte rather than the high-order byte).
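The byte shift can be reproduced outside Git. The following is a minimal Python sketch, not Git's actual conversion code: `naive_lf_to_crlf` is a hypothetical stand-in for a byte-oriented LF-to-CRLF pass, and the two-line sample text is made up for illustration.

```python
# Hypothetical stand-in for a byte-oriented LF -> CRLF conversion.
# This is NOT Git's implementation; it only mimics the effect described above.
def naive_lf_to_crlf(data: bytes) -> bytes:
    """Insert a CR byte before every LF byte, ignoring the text encoding."""
    return data.replace(b"\n", b"\r\n")


# Two short lines with CRLF endings, stored as big-endian two-byte characters.
original = "ab\r\ncd\r\n".encode("utf-16-be")

# Each 0x0A byte gains a single 0x0D byte in front of it, so the data after
# the first line break is shifted by one byte.
converted = naive_lf_to_crlf(original)

# Decoding the result as UTF-16 BE again shows "cd" turned into CJK characters,
# because each \0 byte now sits in the low-order position.
garbled = converted.decode("utf-16-be")
```

The second inserted byte pushes the data back into alignment, which is why only every other line looks corrupted.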

You can convert files to UTF-8 using this LINQPad script:

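// Root folder to scan (the path is elided in the original answer).
const string path = @"C:\...";
foreach (var file in Directory.EnumerateFiles(path, "*", SearchOption.AllDirectories))
{
    // Only convert these extensions; adjust the list for your repository (e.g. ".cs").
    if (!new[] { ".html", ".js" }.Contains(Path.GetExtension(file)))
        continue;
    // ReadAllLines picks the source encoding from the byte order mark;
    // the file is rewritten as UTF-8 with a BOM and CRLF line endings.
    File.WriteAllText(file, String.Join("\r\n", File.ReadAllLines(file)), new UTF8Encoding(encoderShouldEmitUTF8Identifier: true));
    file.Dump(); // LINQPad helper: print each converted path
}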

This will not fix files that are already broken; you can fix those by replacing \r\n with \n in a hex editor. I don't have a LINQPad script for that, since there's no simple Replace() method for byte[]s.
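For what it's worth, the hex-editor replacement can also be scripted in a language whose byte strings do have a replace method. Here is a rough Python sketch under that assumption; `undo_crlf_insertion` and `repair_file` are hypothetical names, and the approach assumes the damaged file never legitimately contained the byte pair 0D 0A (which could occur, for example, with the character U+0D0A), so back the files up first.

```python
from pathlib import Path


def undo_crlf_insertion(data: bytes) -> bytes:
    """Drop the CR byte found directly in front of each LF byte.

    Assumption: every 0D 0A byte pair in the data comes from the bad
    conversion, which holds for typical source code but not for all text.
    """
    return data.replace(b"\r\n", b"\n")


def repair_file(path: str) -> None:
    """Rewrite one damaged file in place (keep a backup before running this)."""
    target = Path(path)
    target.write_bytes(undo_crlf_insertion(target.read_bytes()))
```

The same byte-level replacement works for both byte orders, since in either case the extra CR byte sits directly in front of the LF byte.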




Answer 2:


To fix this, either convert the encoding of the files (UTF-8 should be OK) or disable the automatic line-ending conversion (set git config core.autocrlf false and remove the text settings from your .gitattributes file).
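Before picking either option, it may help to know which files are affected. The sketch below is one possible way to list them in Python; `utf16_bom_kind` and `find_utf16_files` are hypothetical helpers, and the check assumes the files start with a byte order mark, so two-byte files saved without one would be missed.

```python
import codecs
from pathlib import Path


def utf16_bom_kind(data: bytes):
    """Return 'UTF-16 LE' or 'UTF-16 BE' if data starts with that BOM, else None.

    Limitation: a UTF-32 LE BOM also begins with FF FE, so such a file
    would be reported as UTF-16 LE by this simple check.
    """
    if data.startswith(codecs.BOM_UTF16_LE):
        return "UTF-16 LE"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "UTF-16 BE"
    return None


def find_utf16_files(root: str, suffix: str = ".cs"):
    """Yield (path, kind) for files under root whose first two bytes are a UTF-16 BOM."""
    for path in sorted(Path(root).rglob("*" + suffix)):
        if not path.is_file():
            continue
        with path.open("rb") as handle:
            kind = utf16_bom_kind(handle.read(2))
        if kind is not None:
            yield path, kind
```

Calling `list(find_utf16_files("."))` from the repository root would then give the candidates for conversion or for exclusion from line-ending handling.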



Source: https://stackoverflow.com/questions/13704936/how-to-stop-git-from-breaking-encoding-on-checkout
