Replace CRLF using powershell

元气小坏坏 提交于 2019-11-26 20:17:20
Andrew Savinykh

You have not specified the version, I'm assuming you are using Powershell v3.

Try this:

$path = "C:\Users\abc\Desktop\File\abc.txt"
(Get-Content $path -Raw).Replace("`r`n","`n") | Set-Content $path -Force

Editor's note: As mike z points out in the comments, Set-Content appends a trailing CRLF, which is undesired. Verify with: 'hi' > t.txt; (Get-Content -Raw t.txt).Replace("`r`n","`n") | Set-Content t.txt; (Get-Content -Raw t.txt).EndsWith("`r`n"), which yields $True.

Note this loads the whole file in memory, so you might want a different solution if you want to process huge files.

UPDATE

This might work for v2 (sorry nowhere to test):

$in = "C:\Users\abc\Desktop\File\abc.txt"
$out = "C:\Users\abc\Desktop\File\abc-out.txt"
(Get-Content $in) -join "`n" > $out

Editor's note: Note that this solution (now) writes to a different file and is therefore not equivalent to the (still flawed) v3 solution. (A different file is targeted to avoid the pitfall Ansgar Wiechers points out in the comments: using > truncates the target file before execution begins). More importantly, though: this solution too appends a trailing CRLF, which is undesired. Verify with 'hi' > t.txt; (Get-Content t.txt) -join "`n" > t.NEW.txt; [io.file]::ReadAllText((Convert-Path t.NEW.txt)).endswith("`r`n"), which yields $True.

Same reservation about being loaded to memory though.

This is a state-of-the-union answer as of Windows PowerShell v5.1 / PowerShell Core v6.2.0:

  • Andrew Savinykh's ill-fated answer, despite being the accepted one, is, as of this writing, fundamentally flawed (I do hope it gets fixed - there's enough information in the comments - and in the edit history - to do so).

  • Ansgar Wiecher's helpful answer works well, but requires direct use of the .NET Framework (and reads the entire file into memory, though that could be changed). Direct use of the .NET Framework is not a problem per se, but is harder to master for novices and hard to remember in general.

  • A future version of PowerShell Core will have a
    Convert-TextFile cmdlet with a -LineEnding parameter to allow in-place updating of text files with a specific newline style, as being discussed on GitHub.

In PSv5+, PowerShell-native solutions are now possible, because Set-Content now supports the -NoNewline switch, which prevents undesired appending of a platform-native newline[1] :

# Convert CRLFs to LFs only.
# Note:
#  * (...) around Get-Content ensures that $file is read *in full*
#    up front, so that it is possible to write back the transformed content
#    to the same file.
#  * + "`n" ensures that the file has a *trailing LF*, which Unix platforms
#     expect.
((Get-Content $file) -join "`n") + "`n" | Set-Content -NoNewline $file

The above relies on Get-Content's ability to read a text file that uses any combination of CR-only, CRLF, and LF-only newlines line by line.

Caveats:

  • You need to specify the output encoding to match the input file's in order to recreate it with the same encoding. The command above does NOT specify an output encoding; to do so, use -Encoding; without -Encoding:

    • In Windows PowerShell, you'll get "ANSI" encoding, your system's single-byte, 8-bit legacy encoding, such as Windows-1252 on US-English systems.
    • In PowerShell Core, you'll get UTF-8 encoding without a BOM.
  • The input file's content as well as its transformed copy must fit into memory as a whole, which can be problematic with large input files.

  • There's a risk of file corruption, if the process of writing back to the input file gets interrupted.


[1] In fact, if there are multiple strings to write, -NoNewline also doesn't place a newline between them; in the case at hand, however, this is irrelevant, because only one string is written.

Alternative solution that won't append a spurious CR-LF:

$original_file ='C:\Users\abc\Desktop\File\abc.txt'
$text = [IO.File]::ReadAllText($original_file) -replace "`r`n", "`n"
[IO.File]::WriteAllText($original_file, $text)

Adding another version based on example above by @ricky89 and @mklement0 with few improvements:

Script to process:

  • *.txt files in the current folder
  • replace LF with CRLF (Unix to Windows line-endings)
  • save resulting files to CR-to-CRLF subfolder
  • tested on 100MB+ files, PS v5;

LF-to-CRLF.ps1:

# get current dir
$currentDirectory = Split-Path $MyInvocation.MyCommand.Path -Parent

# create subdir CR-to-CRLF for new files
$outDir = $(Join-Path $currentDirectory "CR-to-CRLF")
New-Item -ItemType Directory -Force -Path $outDir | Out-Null

# get all .txt files
Get-ChildItem $currentDirectory -Force | Where-Object {$_.extension -eq ".txt"} | ForEach-Object {
  $file = New-Object System.IO.StreamReader -Arg $_.FullName
  # Resulting file will be in CR-to-CRLF subdir
  $outstream = [System.IO.StreamWriter] $(Join-Path  $outDir $($_.BaseName + $_.Extension))
  $count = 0 
  # read line by line, replace CR with CRLF in each by saving it with $outstream.WriteLine
  while ($line = $file.ReadLine()) {
        $count += 1
        $outstream.WriteLine($line)
    }
  $file.close()
  $outstream.close()
  Write-Host ("$_`: " + $count + ' lines processed.')
}
ricky89

The following will be able to process very large files quickly.

$file = New-Object System.IO.StreamReader -Arg "file1.txt"
$outstream = [System.IO.StreamWriter] "file2.txt"
$count = 0 

while ($line = $file.ReadLine()) {
      $count += 1
      $s = $line -replace "`n", "`r`n"
      $outstream.WriteLine($s)
  }

$file.close()
$outstream.close()

Write-Host ([string] $count + ' lines have been processed.')
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!