Get encoding of a file in Windows

忘了有多久 2020-11-22 12:16

This isn't really a programming question, but is there a command line or Windows tool (Windows 7) to get the current encoding of a text file? Sure I can write a little C# app b

12 answers
  •  花落未央
    2020-11-22 12:33

    Here's my take on how to detect the Unicode family of text encodings via the BOM (byte order mark). The accuracy of this method is low: it only works on text files (specifically Unicode files that start with a BOM), and it falls back to ascii when no BOM is present. Most text editors apply a similar fallback; if you want to match the HTTP/web ecosystem, the default should be UTF8 instead.

    Update 2018: I no longer recommend this method. I recommend using file.exe from Git or the *nix tools, as suggested by @Sybren, and I show how to do that via PowerShell in a later answer.

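    # from https://gist.github.com/zommarin/1480974
    function Get-FileEncoding($Path) {
        # Read at most the first 4 bytes (Windows PowerShell syntax; PowerShell 6+ uses -AsByteStream instead of -Encoding byte)
        $bytes = [byte[]](Get-Content $Path -Encoding byte -ReadCount 4 -TotalCount 4)
    
        # Empty file: there is nothing to inspect
        if(!$bytes) { return 'utf8' }
    
        # Match the leading bytes (as hex) against known BOMs; the 4-byte UTF-32 BOMs come first so fffe0000 is not mistaken for fffe
        switch -regex ('{0:x2}{1:x2}{2:x2}{3:x2}' -f $bytes[0],$bytes[1],$bytes[2],$bytes[3]) {
            '^efbbbf'   { return 'utf8' }
            '^2b2f76'   { return 'utf7' }
            '^fffe0000' { return 'utf32' }
            '^0000feff' { return 'bigendianutf32' }
            '^fffe'     { return 'unicode' }
            '^feff'     { return 'bigendianunicode' }
            default     { return 'ascii' }   # no BOM recognized
        }
    }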
    
    dir ~\Documents\WindowsPowershell -File | 
        select Name,@{Name='Encoding';Expression={Get-FileEncoding $_.FullName}} | 
        ft -AutoSize
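
    For comparison, the file.exe approach mentioned in the update could look roughly like the sketch below. It is not the author's later answer, only an illustration. It assumes Git for Windows is installed in its default location (the path in $fileExe is an assumption, adjust it for your machine), and the function name Get-FileEncodingViaFile is made up for this example.

    ```powershell
    # Assumption: default Git for Windows install path; change this if Git lives elsewhere.
    $fileExe = 'C:\Program Files\Git\usr\bin\file.exe'

    # Hypothetical helper name, not part of any module.
    function Get-FileEncodingViaFile($Path) {
        # --brief omits the file name from the output;
        # --mime-encoding prints only the detected character set.
        & $fileExe --brief --mime-encoding $Path
    }

    Get-ChildItem ~\Documents\WindowsPowershell -File |
        Select-Object Name, @{Name='Encoding'; Expression={ Get-FileEncodingViaFile $_.FullName }} |
        Format-Table -AutoSize
    ```

    Unlike the BOM check, file inspects the content itself, so it can report values such as utf-8 or us-ascii for files that have no BOM.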
    

    Recommendation: the BOM check above can work reasonably well if dir, ls, or Get-ChildItem is restricted to known text files and you are only looking for "bad encodings" produced by a known list of tools (e.g. SQL Management Studio defaults to UTF16, which broke Git auto-cr-lf for Windows, which was the default for many years).
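
    A minimal sketch of that kind of scan, reusing the Get-FileEncoding function defined above. The extension list and the set of accepted encodings are hypothetical placeholders, not values from the original answer.

    ```powershell
    # Hypothetical values: adjust both lists to your own project.
    $textExtensions = '.ps1', '.psm1', '.sql', '.txt'
    $accepted       = 'utf8', 'ascii'

    Get-ChildItem . -File -Recurse |
        # Only look at files we already know to be text
        Where-Object { $textExtensions -contains $_.Extension } |
        Select-Object FullName, @{Name='Encoding'; Expression={ Get-FileEncoding $_.FullName }} |
        # Keep only the files whose detected encoding is not on the accepted list
        Where-Object { $accepted -notcontains $_.Encoding }
    ```

    With these placeholder values, a .sql file saved as UTF16 would be listed with Encoding set to unicode, while BOM-less and UTF8 files are filtered out.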
