Get encoding of a file in Windows

忘了有多久 2020-11-22 12:16

This isn't really a programming question: is there a command-line or Windows tool (Windows 7) to get the current encoding of a text file? Sure I can write a little C# app b

12 Answers
  • 2020-11-22 12:33

    Here's my take on how to detect the Unicode family of text encodings via the BOM. The accuracy of this method is low: it only works on text files (specifically Unicode files that carry a BOM), and it falls back to ascii when no BOM is present, like most text editors. (If you want to match the HTTP/web ecosystem, the fallback would be UTF8 instead.)

    Update 2018: I no longer recommend this method. I recommend using file.exe from Git or *nix tools, as recommended by @Sybren, and I show how to do that via PowerShell in a later answer.

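    # from https://gist.github.com/zommarin/1480974
    function Get-FileEncoding($Path) {
        # Read only the first four bytes, enough to hold any BOM checked below.
        # (-Encoding byte is Windows PowerShell syntax.)
        $bytes = [byte[]](Get-Content $Path -Encoding byte -ReadCount 4 -TotalCount 4)
    
        # Empty file: nothing to inspect, so report utf8.
        if(!$bytes) { return 'utf8' }
    
        # Match the bytes, formatted as hex, against known BOMs. The 4-byte
        # UTF-32 LE pattern must come before '^fffe', which is its prefix.
        switch -regex ('{0:x2}{1:x2}{2:x2}{3:x2}' -f $bytes[0],$bytes[1],$bytes[2],$bytes[3]) {
            '^efbbbf'   { return 'utf8' }
            '^2b2f76'   { return 'utf7' }
            '^fffe0000' { return 'utf32' }
            '^0000feff' { return 'bigendianutf32' }
            '^fffe'     { return 'unicode' }
            '^feff'     { return 'bigendianunicode' }
            default     { return 'ascii' }
        }
    }
    
    # List every file in the profile folder with its detected encoding.
    dir ~\Documents\WindowsPowershell -File | 
        select Name,@{Name='Encoding';Expression={Get-FileEncoding $_.FullName}} | 
        ft -AutoSize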
    

    Recommendation: This can work reasonably well if the dir, ls, or Get-ChildItem call only checks known text files, and when you're only looking for "bad encodings" from a known list of tools. (For example, SQL Management Studio defaults to UTF-16, which broke Git's auto-CRLF handling on Windows, itself the default for many years.)
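
    Because the BOM check above cannot tell BOM-less UTF-8 apart from ASCII or a legacy code page, one possible complement is to attempt a strict UTF-8 decode of the bytes. The following is only a sketch: the function name Get-BomlessEncodingGuess and its return labels are hypothetical and not part of the gist, and an 'unknown' result simply means the bytes are not valid UTF-8.

```powershell
# Hypothetical helper, not part of the gist above: guesses the encoding of
# BOM-less content by attempting a strict UTF-8 decode.
function Get-BomlessEncodingGuess {
    param([byte[]]$Bytes)

    if ($null -eq $Bytes -or $Bytes.Length -eq 0) { return 'empty' }

    # Pure ASCII: no byte has the high bit set.
    $hasHighByte = $false
    foreach ($b in $Bytes) {
        if ($b -ge 0x80) { $hasHighByte = $true; break }
    }
    if (-not $hasHighByte) { return 'ascii' }

    # Second constructor argument $true: throw on invalid byte sequences
    # instead of silently substituting a replacement character.
    $strictUtf8 = New-Object System.Text.UTF8Encoding($false, $true)
    try {
        [void]$strictUtf8.GetString($Bytes)
        return 'utf8'
    } catch {
        # Not valid UTF-8; probably a legacy code page, which cannot be
        # identified reliably from the bytes alone.
        return 'unknown'
    }
}
```

    It could be called as Get-BomlessEncodingGuess -Bytes ([System.IO.File]::ReadAllBytes($fullPath)), where $fullPath is an absolute path. This reads the whole file into memory, so it is better suited to small text files.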

  • 2020-11-22 12:33

    A simple solution might be opening the file in Firefox.

    1. Drag and drop the file into Firefox
    2. Right click on the page
    3. Select "View Page Info"

    and the text encoding will appear on the "Page Info" window.

    Note: If the file is not in txt format, just rename it to txt and try again.

    P.S. For more info see this article.

  • 2020-11-22 12:36

    The only way that I have found to do this is with Vim or Notepad++.

  • 2020-11-22 12:38

    I wrote the #4 answer (at time of writing). But lately I have Git installed on all my computers, so now I use @Sybren's solution. Here is a new answer that makes that solution handy from PowerShell (without putting all of git/usr/bin in the PATH, which is too much clutter for me).

    Add this to your profile.ps1:

    $global:gitbin = 'C:\Program Files\Git\usr\bin'
    Set-Alias file.exe $gitbin\file.exe
    

    Then use it like this: file.exe --mime-encoding *. You must include .exe in the command for the PowerShell alias to work.

    If you haven't customized your PowerShell profile.ps1 yet, I suggest you start with mine: https://gist.github.com/yzorg/8215221/8e38fd722a3dfc526bbe4668d1f3b08eb7c08be0 and save it to ~\Documents\WindowsPowerShell. It's safe to use on a computer without Git, but it will write warnings when Git is not found.

    Including .exe in the command is also how I call C:\WINDOWS\system32\where.exe from PowerShell, along with many other OS CLI commands that are "hidden by default" by PowerShell, *shrug*.
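
    If you would rather get objects back than raw text, the same file.exe can be wrapped in a small function. This is only a sketch under assumptions: the name Get-FileMimeEncoding and its -FileExe parameter are hypothetical, and the default path assumes a standard Git for Windows install. It relies on file's --brief option, which leaves the file name out of the output so nothing has to be parsed.

```powershell
# Hypothetical wrapper around file.exe; the default path assumes a standard
# Git for Windows install, as in the alias above.
function Get-FileMimeEncoding {
    param(
        [Parameter(Mandatory = $true)] [string] $Path,
        $FileExe = 'C:\Program Files\Git\usr\bin\file.exe'
    )

    # --brief suppresses the file name, leaving only the encoding (e.g. utf-8).
    $encoding = & $FileExe --brief --mime-encoding $Path

    # Return one object per file so the result can be sorted or filtered.
    New-Object PSObject -Property @{
        Name     = Split-Path $Path -Leaf
        Encoding = "$encoding".Trim()
    }
}
```

    Used like: dir -File | foreach { Get-FileMimeEncoding $_.FullName } | ft Name, Encoding -AutoSize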

  • 2020-11-22 12:42

    If you have "git" or "Cygwin" on your Windows Machine, then go to the folder where your file is present and execute the command:

    file *
    

    This will give you the encoding details of all the files in that folder.

  • 2020-11-22 12:42

    The (Linux) command-line tool 'file' is available on Windows via GnuWin32:

    http://gnuwin32.sourceforge.net/packages/file.htm

    If you have git installed, it's located in C:\Program Files\git\usr\bin.

    Example:

        C:\Users\SH\Downloads\SquareRoot>file *
        _UpgradeReport_Files;         directory
        Debug;                        directory
        duration.h;                   ASCII C++ program text, with CRLF line terminators
        ipch;                         directory
        main.cpp;                     ASCII C program text, with CRLF line terminators
        Precision.txt;                ASCII text, with CRLF line terminators
        Release;                      directory
        Speed.txt;                    ASCII text, with CRLF line terminators
        SquareRoot.sdf;               data
        SquareRoot.sln;               UTF-8 Unicode (with BOM) text, with CRLF line terminators
        SquareRoot.sln.docstates.suo; PCX ver. 2.5 image data
        SquareRoot.suo;               CDF V2 Document, corrupt: Cannot read summary info
        SquareRoot.vcproj;            XML  document text
        SquareRoot.vcxproj;           XML document text
        SquareRoot.vcxproj.filters;   XML document text
        SquareRoot.vcxproj.user;      XML document text
        squarerootmethods.h;          ASCII C program text, with CRLF line terminators
        UpgradeLog.XML;               XML  document text
    
        C:\Users\SH\Downloads\SquareRoot>file --mime-encoding *
        _UpgradeReport_Files;         binary
        Debug;                        binary
        duration.h;                   us-ascii
        ipch;                         binary
        main.cpp;                     us-ascii
        Precision.txt;                us-ascii
        Release;                      binary
        Speed.txt;                    us-ascii
        SquareRoot.sdf;               binary
        SquareRoot.sln;               utf-8
        SquareRoot.sln.docstates.suo; binary
        SquareRoot.suo;               CDF V2 Document, corrupt: Cannot read summary infobinary
        SquareRoot.vcproj;            us-ascii
        SquareRoot.vcxproj;           utf-8
        SquareRoot.vcxproj.filters;   utf-8
        SquareRoot.vcxproj.user;      utf-8
        squarerootmethods.h;          us-ascii
        UpgradeLog.XML;               us-ascii
    