Using PowerShell to write a file in UTF-8 without the BOM

名媛妹妹 2020-11-22 06:25

Out-File seems to force the BOM when using UTF-8:

$MyFile = Get-Content $MyPath
$MyFile | Out-File -Encoding "UTF8" $MyPath

13 Answers
  • 2020-11-22 07:03
        [System.IO.FileInfo] $file = Get-Item -Path $FilePath 
        $sequenceBOM = New-Object System.Byte[] 3 
        $reader = $file.OpenRead() 
        $bytesRead = $reader.Read($sequenceBOM, 0, 3) 
        $reader.Dispose() 
        # A file saved as UTF-8 with BOM starts with the following three bytes. Hex: 0xEF 0xBB 0xBF, Decimal: 239 187 191 
        if ($bytesRead -eq 3 -and $sequenceBOM[0] -eq 239 -and $sequenceBOM[1] -eq 187 -and $sequenceBOM[2] -eq 191) 
        { 
            $utf8NoBomEncoding = New-Object System.Text.UTF8Encoding($False) 
            [System.IO.File]::WriteAllLines($FilePath, (Get-Content $FilePath), $utf8NoBomEncoding) 
            Write-Host "Remove UTF-8 BOM successfully" 
        } 
        Else 
        { 
            Write-Warning "Not UTF-8 BOM file" 
        }  
    

    Source: How to remove UTF8 Byte Order Mark (BOM) from a file using PowerShell

  • 2020-11-22 07:07

    This script converts all .txt files in DIRECTORY1 to UTF-8 without BOM and writes the results to DIRECTORY2:

    foreach ($i in ls -name DIRECTORY1\*.txt)
    {
        $file_content = Get-Content "DIRECTORY1\$i";
        [System.IO.File]::WriteAllLines("DIRECTORY2\$i", $file_content);
    }
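
    One caveat with a script like this: .NET methods such as [System.IO.File]::WriteAllLines resolve relative paths against the process working directory, which is not necessarily PowerShell's current location. A minimal sketch that resolves full paths first, assuming both directories already exist. The function name Convert-TxtToUtf8NoBom and its parameter names are hypothetical, chosen only for illustration:

    ```powershell
    function Convert-TxtToUtf8NoBom {
        param(
            [Parameter(Mandatory)] [string] $SourceDir,
            [Parameter(Mandatory)] [string] $TargetDir
        )

        # Resolve to absolute file-system paths so that .NET does not
        # fall back to the process working directory.
        $sourceFull = (Resolve-Path -LiteralPath $SourceDir).ProviderPath
        $targetFull = (Resolve-Path -LiteralPath $TargetDir).ProviderPath

        foreach ($file in Get-ChildItem -LiteralPath $sourceFull -Filter '*.txt' -File) {
            # @() keeps empty and single-line files as arrays.
            [string[]] $lines = @(Get-Content -LiteralPath $file.FullName)

            # Without an explicit encoding argument, WriteAllLines
            # writes UTF-8 without a BOM.
            [System.IO.File]::WriteAllLines((Join-Path $targetFull $file.Name), $lines)
        }
    }
    ```

    It could then be called as Convert-TxtToUtf8NoBom -SourceDir DIRECTORY1 -TargetDir DIRECTORY2.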
    
  • 2020-11-22 07:08

    This one works for me (use "Default" instead of "UTF8"):

    $MyFile = Get-Content $MyPath
    $MyFile | Out-File -Encoding "Default" $MyPath
    

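    The result has no BOM. Note that in Windows PowerShell "Default" means the system's ANSI code page rather than UTF-8, so the output is valid UTF-8 only when the content is pure ASCII.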

  • 2020-11-22 07:11

    Change multiple files by extension to UTF-8 without BOM:

    $Utf8NoBomEncoding = New-Object System.Text.UTF8Encoding($False)
    foreach($i in ls -recurse -filter "*.java") {
        $MyFile = Get-Content $i.fullname 
        [System.IO.File]::WriteAllLines($i.fullname, $MyFile, $Utf8NoBomEncoding)
    }
    
  • 2020-11-22 07:13

    I figured this wouldn't be UTF, but I just found a pretty simple solution that seems to work...

    Get-Content path/to/file.ext | out-file -encoding ASCII targetFile.ext
    

    For me this results in a UTF-8 file without BOM, regardless of the source format.
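
    This holds only when the content is pure ASCII, where ASCII and UTF-8 produce identical bytes. The ASCII encoding cannot represent other characters and substitutes a question mark for them. A small sketch illustrating the difference with the .NET encoders (the sample string is made up for illustration):

    ```powershell
    # 'caf' followed by U+00E9 (e with acute accent), built from a code
    # point so the example does not depend on the script file's encoding.
    $sample = 'caf' + [char]0xE9

    $asciiBytes = [System.Text.Encoding]::ASCII.GetBytes($sample)
    $utf8Bytes  = (New-Object System.Text.UTF8Encoding $false).GetBytes($sample)

    # Under ASCII the accented character is replaced by 63 ('?').
    $asciiBytes -join ' '   # 99 97 102 63

    # Under UTF-8 it is encoded as the two bytes 195 169.
    $utf8Bytes -join ' '    # 99 97 102 195 169
    ```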

  • 2020-11-22 07:15

    When using Set-Content instead of Out-File, you can specify the encoding Byte, which writes a byte array to a file (this applies to Windows PowerShell; PowerShell 6 and later replace -Encoding Byte with -AsByteStream). Combined with a custom UTF-8 encoding that does not emit the BOM, this gives the desired result:

    # This variable can be reused
    $utf8 = New-Object System.Text.UTF8Encoding $false
    
    $MyFile = Get-Content $MyPath -Raw
    Set-Content -Value $utf8.GetBytes($MyFile) -Encoding Byte -Path $MyPath
    

    The difference from using [IO.File]::WriteAllLines() or similar is that this should work with any type of item and path, not only actual file paths.
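
    The "custom UTF8 encoding which does not emit the BOM" is the UTF8Encoding instance constructed with $false. Whether an encoding object emits a BOM can be checked through its preamble. A minimal sketch (the variable names and the temporary file are illustrative only):

    ```powershell
    $withBom    = New-Object System.Text.UTF8Encoding $true
    $withoutBom = New-Object System.Text.UTF8Encoding $false

    # GetPreamble() returns the bytes the encoding writes at the start of a file.
    ($withBom.GetPreamble() | ForEach-Object { $_.ToString('X2') }) -join ' '   # EF BB BF
    $withoutBom.GetPreamble().Length                                             # 0

    # Writing five ASCII characters with the BOM-less encoding
    # produces exactly five bytes on disk.
    $path = [System.IO.Path]::GetTempFileName()
    [System.IO.File]::WriteAllText($path, 'hello', $withoutBom)
    $written = [System.IO.File]::ReadAllBytes($path)
    $written.Length                                                              # 5
    Remove-Item -LiteralPath $path
    ```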
