Piping Text To An External Program Appends A Trailing Newline

Posted by 馋奶兔 on 2020-05-11 06:29:05

Question


I have been comparing hash values between multiple systems and was surprised to find that PowerShell's hash values are different from those of other terminals.

Linux terminals (Cygwin, Bash for Windows, etc.) and the Windows Command Prompt all show the same hash, whereas PowerShell shows a different hash value.

This was tested using SHA256, but I found the same issue when using other algorithms such as MD5.

Encoding Update:

I tried changing the PowerShell encoding, but it did not have any effect on the returned hash values.

[Console]::OutputEncoding.BodyName 
iso-8859-1
[Console]::OutputEncoding = [Text.UTF8Encoding]::UTF8
utf-8

GitHub PowerShell Issue

https://github.com/PowerShell/PowerShell/issues/5974


Answer 1:


tl;dr:

The key is to avoid PowerShell's pipeline in favor of the native shell's, so as to prevent implicit addition of a trailing newline:

  • If you're running your command on a Unix-like platform (using PowerShell Core):
sh -c "printf %s 'string' | openssl dgst -sha256 -hmac authcode"

printf %s is the portable alternative to echo -n. If the string contains ' characters, double them or use `"...`" quoting instead.

  • In case you need to do this on Windows via cmd.exe, things get even trickier, because cmd.exe doesn't directly support echoing without a trailing newline:
cmd /c "<NUL set /p =`"string`"| openssl dgst -sha256 -hmac authcode"

Note that there must be no space before | for this to work. For an explanation and the limitations of this solution, see this answer of mine.
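To make the point about printf %s concrete, here is a minimal sketch for a POSIX sh that counts the bytes each variant writes. byte_count is a hypothetical helper name introduced only for this illustration; it is not part of the original answer.

```shell
# Count the bytes a command writes to stdout.
# byte_count is a hypothetical helper used only in this sketch.
byte_count() {
    # Some wc implementations pad the number with spaces; tr removes them.
    "$@" | wc -c | tr -d ' '
}

# printf %s writes the argument and nothing else.
byte_count printf %s 'string'        # prints 6

# A trailing LF (what a Unix pipeline in PowerShell would add) is one extra byte.
byte_count printf '%s\n' 'string'    # prints 7

# A trailing CRLF (what a Windows pipeline in PowerShell would add) is two extra bytes.
byte_count printf '%s\r\n' 'string'  # prints 8
```

Any hash or HMAC computed over these three inputs will differ, because the digest covers every byte that the utility reads from stdin.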

Encoding issues would only arise if the string contained non-ASCII characters and you're running in Windows PowerShell; in that event, first set $OutputEncoding to the encoding that the target utility expects, typically UTF-8: $OutputEncoding = [Text.Utf8Encoding]::new()


  • PowerShell, as of Windows PowerShell v5.1 / PowerShell Core v6.0.0, invariably appends a trailing newline when you send a string without one via the pipeline to an external utility, which is the reason for the difference you're observing (that trailing newline will be a LF only on Unix platforms, and a CRLF sequence on Windows).

    • You can keep track of efforts to address this problem in the GitHub issue opened by the OP (linked in the question: https://github.com/PowerShell/PowerShell/issues/5974).
  • Additionally, PowerShell's pipeline is invariably text-based when it comes to piping data to external programs; the internally UTF-16LE-based PowerShell (.NET) strings are transcoded based on the encoding stored in the automatic $OutputEncoding variable, which defaults to ASCII-only encoding in Windows PowerShell, and to UTF-8 encoding in PowerShell Core (both on Windows and on Unix-like platforms).

    • In PowerShell Core, a change is being discussed for piping raw byte streams between external programs.
  • The fact that echo -n in PowerShell does not produce a string without a trailing newline is therefore incidental to your problem; for the sake of completeness, here's an explanation:

    • echo is an alias for PowerShell's Write-Output cmdlet, which - in the context of piping to external programs - writes text to the standard input of the program in the next pipeline segment (similar to Bash / cmd.exe's echo).
    • -n is interpreted as an (unambiguous) abbreviation for Write-Output's -NoEnumerate switch.
    • -NoEnumerate only applies when writing multiple objects, so it has no effect here.
    • Therefore, in short: in PowerShell, echo -n "string" is the same as Write-Output -NoEnumerate "string", which - because only a single string is output - is the same as Write-Output "string", which, in turn, is the same as just using "string", relying on PowerShell's implicit output behavior.
    • Write-Output has no option to suppress a trailing newline, and even if it did, using a pipeline to pipe to an external program would add it back in.
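The effect of the implicitly appended newline on a digest can be reproduced in a plain POSIX shell, without PowerShell. The following is a sketch under stated assumptions: sha256_hex is a hypothetical helper that uses whichever of sha256sum, shasum, or openssl is installed, and it computes a plain SHA-256 rather than the HMAC from the question, since the newline changes either result in the same way.

```shell
# Read stdin and print its SHA-256 digest as hex.
# sha256_hex is a hypothetical helper; it picks whichever tool is available.
sha256_hex() {
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum | cut -d ' ' -f 1
    elif command -v shasum >/dev/null 2>&1; then
        shasum -a 256 | cut -d ' ' -f 1
    else
        # openssl prints "...(stdin)= <hex>"; keep only the hex part.
        openssl dgst -sha256 | sed 's/^.*= *//'
    fi
}

# The string exactly as typed, with no trailing newline.
bare=$(printf %s 'string' | sha256_hex)

# The string plus LF, as PowerShell's pipeline would send it on Unix.
with_lf=$(printf '%s\n' 'string' | sha256_hex)

# The string plus CRLF, as PowerShell's pipeline would send it on Windows.
with_crlf=$(printf '%s\r\n' 'string' | sha256_hex)

printf 'bare:      %s\n' "$bare"
printf 'with LF:   %s\n' "$with_lf"
printf 'with CRLF: %s\n' "$with_crlf"
```

The three digests are all different, which matches the mismatch described in the question: only the first one corresponds to what echo -n or printf %s produces in Bash.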



Answer 2:


Linux terminals and PowerShell use different encodings, so the actual bytes produced by echo -n "string" are different. I tried it in my Linux Mint terminal and in Windows 10 PowerShell. Here is what I got:

Linux Mint:

73 74 72 69 6E 67

Windows 10:

FF FE 73 00 74 00 72 00 69 00 6E 00 67 00 0D 00 0A 00

It seems that Linux terminals use UTF-8 and Windows PowerShell uses UTF-16 with a BOM. Also, in PowerShell you cannot use the '-n' parameter with echo, so echo places the newline characters \r\n (0D 00 0A 00) at the end of the "string".

Edit: As mklement0 said below, Windows PowerShell uses ASCII by default when piping.
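Given that edit, the bytes an external program would receive through a Windows PowerShell pipe are presumably the ASCII characters followed by CRLF, not the UTF-16 dump above. A minimal sketch for a POSIX shell that reproduces both byte sequences for comparison follows; hex_bytes is a hypothetical helper name, and the CRLF case simulates the pipeline behavior described in the first answer rather than invoking PowerShell itself.

```shell
# Print stdin as space-separated lowercase hex bytes on one line.
# hex_bytes is a hypothetical helper used only in this sketch.
hex_bytes() {
    # od's column padding differs between implementations, so normalize it:
    # join lines, squeeze repeated spaces, trim both ends.
    od -An -tx1 | tr '\n' ' ' | tr -s ' ' | sed 's/^ *//; s/ *$//'
}

# What Bash's echo -n "string" (or printf %s) sends.
printf %s 'string' | hex_bytes
echo
# prints: 73 74 72 69 6e 67

# Simulated Windows PowerShell pipeline input: ASCII text plus CRLF.
printf '%s\r\n' 'string' | hex_bytes
echo
# prints: 73 74 72 69 6e 67 0d 0a
```

The two trailing bytes 0d 0a are the only difference, and they are enough to change any hash computed over the input.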



Source: https://stackoverflow.com/questions/48371447/piping-text-to-an-external-program-appends-a-trailing-newline