Out-File
seems to force the BOM when using UTF-8:
$MyFile = Get-Content $MyPath
$MyFile | Out-File -Encoding \"UTF8\" $MyPath
[System.IO.FileInfo] $file = Get-Item -Path $FilePath
$sequenceBOM = New-Object System.Byte[] 3
$reader = $file.OpenRead()
$bytesRead = $reader.Read($sequenceBOM, 0, 3)
$reader.Dispose()
#A UTF-8+BOM string will start with the three following bytes. Hex: 0xEF0xBB0xBF, Decimal: 239 187 191
if ($bytesRead -eq 3 -and $sequenceBOM[0] -eq 239 -and $sequenceBOM[1] -eq 187 -and $sequenceBOM[2] -eq 191)
{
$utf8NoBomEncoding = New-Object System.Text.UTF8Encoding($False)
[System.IO.File]::WriteAllLines($FilePath, (Get-Content $FilePath), $utf8NoBomEncoding)
Write-Host "Remove UTF-8 BOM successfully"
}
Else
{
Write-Warning "Not UTF-8 BOM file"
}
Source How to remove UTF8 Byte Order Mark (BOM) from a file using PowerShell
This script will convert, to UTF-8 without BOM, all .txt files in DIRECTORY1 and output them to DIRECTORY2
foreach ($i in ls -name DIRECTORY1\*.txt)
{
$file_content = Get-Content "DIRECTORY1\$i";
[System.IO.File]::WriteAllLines("DIRECTORY2\$i", $file_content);
}
This one works for me (use "Default" instead of "UTF8"):
$MyFile = Get-Content $MyPath
$MyFile | Out-File -Encoding "Default" $MyPath
The result is ASCII without BOM.
Change multiple files by extension to UTF-8 without BOM:
$Utf8NoBomEncoding = New-Object System.Text.UTF8Encoding($False)
foreach($i in ls -recurse -filter "*.java") {
$MyFile = Get-Content $i.fullname
[System.IO.File]::WriteAllLines($i.fullname, $MyFile, $Utf8NoBomEncoding)
}
I figured this wouldn't be UTF, but I just found a pretty simple solution that seems to work...
Get-Content path/to/file.ext | out-file -encoding ASCII targetFile.ext
For me this results in a utf-8 without bom file regardless of the source format.
When using Set-Content
instead of Out-File
, you can specify the encoding Byte
, which can be used to write a byte array to a file. This in combination with a custom UTF8 encoding which does not emit the BOM gives the desired result:
# This variable can be reused
$utf8 = New-Object System.Text.UTF8Encoding $false
$MyFile = Get-Content $MyPath -Raw
Set-Content -Value $utf8.GetBytes($MyFile) -Encoding Byte -Path $MyPath
The difference to using [IO.File]::WriteAllLines()
or similar is that it should work fine with any type of item and path, not only actual file paths.