Question
I am writing a script that copies a file, renames it, and then removes certain special characters from its contents. One of these characters is some sort of apostrophe that I cannot type on my keyboard. I can copy and paste it, but the replace function doesn't work on it.
The script opens the file, searches for the strange apostrophe ’, and replaces it with nothing. I'd like to replace it with a normal apostrophe instead, but I don't know how that is done, and at the moment the biggest problem is that I can't get the script to "see" this strange apostrophe, which ends up in the autogenerated file I'm modifying. Any help much appreciated. Thanks :)
Apostrophe in file: ’
Normal Apostrophe: '
This is a chunk of the batch that I've isolated to test with.
@echo off
set YYMMDD=%DATE:~-2,2%%DATE:~-7,2%%DATE:~-10,2%
set DDMMYYYY=%DATE:~-10,2%%DATE:~-7,2%%DATE:~-4,4%
set YYYY-MM-DD=%DATE:~-4,4%-%DATE:~-7,2%-%DATE:~-10,2%
powershell -Command "(gc 'C:\LOCATION\Client_List_%DDMMYYYY%.csv') -replace '’', '' | Out-File 'C:\LOCATION\Client_List_%DDMMYYYY%.csv'"
Echo Done
Answer 1:
set "fileIn=C:\LOCATION\Client_List_%DDMMYYYY%.csv"
set "fileOu=C:\LOCATION\Client_List_%DDMMYYYY%.csv"
powershell -c "(gc '%fileIn%').Replace('‘‘','').Replace('’’','')|Out-File '%fileOu%'"
That strange apostrophe ’ is U+2019 Right Single Quotation Mark, supposedly a closing quote. It could be paired with a different opening quote. In the above example, ‘ is U+2018 Left Single Quotation Mark.
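Since the goal in the question was to substitute a normal apostrophe rather than delete the character, here is a minimal sketch of that substitution. It refers to both characters by code point, so the script text stays pure ASCII. The sample string and the variable names are made up for illustration.

```powershell
# Refer to both characters by code point so no non-ASCII text appears in the script.
$rightSingleQuote = [char]0x2019   # the "strange" apostrophe from the file
$apostrophe       = [char]0x27     # the normal apostrophe

# Hypothetical sample line standing in for a row of the CSV file.
$sample = 'O' + $rightSingleQuote + 'Brien,D' + $rightSingleQuote + 'Arcy'

# String.Replace(char, char) swaps every occurrence of the first character for the second.
$fixed = $sample.Replace($rightSingleQuote, $apostrophe)
$fixed
```

The same call should also work on the array of lines that gc returns, as in the one-liner above, assuming PowerShell 3.0 or later so that the method is applied to each line.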
Get-Help 'about_Quoting_Rules' says:
Quotation marks are used to specify a literal string. You can enclose a string in single quotation marks (') or double quotation marks (").
In fact, PowerShell accepts two different sets of quotes:
- double quotation marks: " “ ” „
- single quotation marks: ' ‘ ’ ‚ ‛
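One way to check this claim without typing any of the characters is to build a tiny string literal around each candidate delimiter and let PowerShell evaluate it. This is only a sketch: the variable names are illustrative, and Invoke-Expression is used here purely as a test harness.

```powershell
# Code points that PowerShell is said to accept as single-quote delimiters.
$singleQuotes = 0x27, 0x2018, 0x2019, 0x201A, 0x201B

$results = foreach ($codePoint in $singleQuotes) {
    $quote  = [string][char]$codePoint
    # Source text of a literal such as 'abc', delimited by the candidate character.
    $source = $quote + 'abc' + $quote
    # If the character is a valid delimiter, evaluating the literal yields abc.
    $value  = Invoke-Expression $source
    'U+{0:X4} -> {1}' -f $codePoint, $value
}
$results
```

If a character were not treated as a quote, PowerShell would try to run the text as a command and Invoke-Expression would report an error instead of returning abc.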
AFAIK, all of those quotation marks are present in most Windows ANSI code pages (1252, 1250, 1257, 1253, 1251, 1254, 1255, 1256, 1258), so they may be used literally in an ANSI-saved .bat script. The exception is the last one, ‛ U+201B Single High-Reversed-9 Quotation Mark. In that case, use $([char]0x201B) instead of '‛‛', as follows:
rem cast [char] to `[string]` ↓↓↓↓↓↓↓↓
powershell -c "(gc '%fileIn%').Replace( [string]$([char]0x201B) , '')"
rem ↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑
or as follows:
rem [char] can't be empty so specify `[string]` ↓↓↓↓↓↓↓↓
powershell -c "(gc '%fileIn%').Replace( $([char]0x201B) , [string]'')"
rem ↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑
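A related option, sketched here under the assumption that all four typographic single quotes should become a plain apostrophe, is the -replace operator with \uXXXX escapes. The .NET regex engine resolves the escapes itself, so the pattern is pure ASCII and no [char] cast is involved. The sample text is made up.

```powershell
# Character class covering U+2018, U+2019, U+201A and U+201B, written with regex escapes.
$pattern = '[\u2018\u2019\u201A\u201B]'

# Hypothetical sample text, built from code points to keep this script ASCII-only.
$sample = [string][char]0x2018 + 'quoted' + [char]0x2019 + ' and ' + [char]0x201B + 'odd'

# Every match is replaced by a normal apostrophe (U+0027).
$normalized = $sample -replace $pattern, "'"
$normalized
```

Because -replace works with a regular expression, this handles several characters in one pass, whereas the .Replace() method needs one call per character.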
Analysis and explanation
The next PowerShell code snippet shows an excerpt from the Unicode database (character names ending with Quotation Mark or containing Apostrophe):
PS D:> 0x22,0x27,0x00AB,0x00BB,0x2018,0x2019,0x201A,0x201B,0x201C,0x201D,0x201E,0x201F,
0x2039,0x203A,0x2E42,0x301D,0x301E,0x301F,0x055A | Get-CharInfo | Format-Table -AutoSize
Char CodePoint Category                Description
---- --------- --------                -----------
"    U+0022    OtherPunctuation        Quotation Mark
'    U+0027    OtherPunctuation        Apostrophe
«    U+00AB    InitialQuotePunctuation Left-Pointing Double Angle Quotation Mark
»    U+00BB    FinalQuotePunctuation   Right-Pointing Double Angle Quotation Mark
‘    U+2018    InitialQuotePunctuation Left Single Quotation Mark
’    U+2019    FinalQuotePunctuation   Right Single Quotation Mark
‚    U+201A    OpenPunctuation         Single Low-9 Quotation Mark
‛    U+201B    InitialQuotePunctuation Single High-Reversed-9 Quotation Mark
“    U+201C    InitialQuotePunctuation Left Double Quotation Mark
”    U+201D    FinalQuotePunctuation   Right Double Quotation Mark
„    U+201E    OpenPunctuation         Double Low-9 Quotation Mark
‟    U+201F    InitialQuotePunctuation Double High-Reversed-9 Quotation Mark
‹    U+2039    InitialQuotePunctuation Single Left-Pointing Angle Quotation Mark
›    U+203A    FinalQuotePunctuation   Single Right-Pointing Angle Quotation Mark
⹂    U+2E42    OtherNotAssigned        Undefined
〝    U+301D    OpenPunctuation         Reversed Double Prime Quotation Mark
〞    U+301E    ClosePunctuation        Double Prime Quotation Mark
〟    U+301F    ClosePunctuation        Low Double Prime Quotation Mark
՚    U+055A    OtherPunctuation        Armenian Apostrophe
(Output from a modified Get-CharInfo cmdlet.) The original Get-CharInfo module is downloadable from http://poshcode.org/5234.
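If the Get-CharInfo module is not at hand, the Category column can be approximated with the .NET class System.Globalization.CharUnicodeInfo. The following is a rough stand-in rather than the cmdlet used above; it does not produce the Description column, and the property names are chosen only to mirror the table headers.

```powershell
# A few of the code points from the table above.
$codePoints = 0x27, 0x2018, 0x2019, 0x201A, 0x201B

$rows = foreach ($codePoint in $codePoints) {
    $char = [char]$codePoint
    [pscustomobject]@{
        Char      = $char
        CodePoint = 'U+{0:X4}' -f $codePoint
        # General category as reported by the installed .NET runtime.
        Category  = [System.Globalization.CharUnicodeInfo]::GetUnicodeCategory($char)
    }
}
$rows | Format-Table -AutoSize
```

The categories come from the Unicode data bundled with the .NET runtime, so newer characters may be reported differently depending on the version in use.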
The next PowerShell script completes the above results by showing some valid (and, in my locale, invalid) combinations of quotes:
$arrSingleQuotes =
''' U+0027 Apostrophe ''' ,
‘‘‘ U+2018 Left Single Quotation Mark ‘‘‘ ,
’’’ U+2019 Right Single Quotation Mark ’’’ ,
‚‚‚ U+201A Single Low-9 Quotation Mark ‚‚‚ ,
‛‛‛ U+201B Single High-Reversed-9 Quotation Mark ‛‛‛ ,
‘‘‘ U+2018 (Left/Right) Single Quotation Mark U+2019 ’’’ ,
’’’ U+2019 (Right/Left) Single Quotation Mark U+2018 ‘‘‘
'$arrSingleQuotes (any combination)'
$arrSingleQuotes
$arrDoubleQoutes =
""" U+0022 Quotation Mark """ ,
“““ U+201C Left Double Quotation Mark “““ ,
””” U+201D Right Double Quotation Mark ””” ,
„„„ U+201E Double Low-9 Quotation Mark „„„ ,
“““ U+201C (Left/Right) Double Quotation Mark U+201D ””” ,
””” U+201D (Right/Left) Double Quotation Mark U+201C “““
'$arrDoubleQoutes (any combination)'
$arrDoubleQoutes
$noQuotes = @"
« U+00AB Left-Pointing Double Angle Quotation Mark
» U+00BB Right-Pointing Double Angle Quotation Mark
‟ U+201F Double High-Reversed-9 Quotation Mark
⹂ U+2E42 DOUBLE LOW-REVERSED-9 QUOTATION MARK
‹ U+2039 Single Left-Pointing Angle Quotation Mark
› U+203A Single Right-Pointing Angle Quotation Mark
〝 U+301D Reversed Double Prime Quotation Mark
〞U+301E Double Prime Quotation Mark
〟U+301F Low Double Prime Quotation Mark
՚ U+055A Armenian Apostrophe
"@
'$noQuotes'
$noQuotes
Output:
PS D:> D:\PShell\SO\41488245_quotes.ps1
$arrSingleQuotes (any combination)
' U+0027 Apostrophe '
‘ U+2018 Left Single Quotation Mark ‘
’ U+2019 Right Single Quotation Mark ’
‚ U+201A Single Low-9 Quotation Mark ‚
‛ U+201B Single High-Reversed-9 Quotation Mark ‛
‘ U+2018 (Left/Right) Single Quotation Mark U+2019 ’
’ U+2019 (Right/Left) Single Quotation Mark U+2018 ‘
$arrDoubleQoutes (any combination)
" U+0022 Quotation Mark "
“ U+201C Left Double Quotation Mark “
” U+201D Right Double Quotation Mark ”
„ U+201E Double Low-9 Quotation Mark „
“ U+201C (Left/Right) Double Quotation Mark U+201D ”
” U+201D (Right/Left) Double Quotation Mark U+201C “
$noQuotes
« U+00AB Left-Pointing Double Angle Quotation Mark
» U+00BB Right-Pointing Double Angle Quotation Mark
‟ U+201F Double High-Reversed-9 Quotation Mark
⹂ U+2E42 DOUBLE LOW-REVERSED-9 QUOTATION MARK
‹ U+2039 Single Left-Pointing Angle Quotation Mark
› U+203A Single Right-Pointing Angle Quotation Mark
〝 U+301D Reversed Double Prime Quotation Mark
〞U+301E Double Prime Quotation Mark
〟U+301F Low Double Prime Quotation Mark
՚ U+055A Armenian Apostrophe
Note that ⹂ U+2E42 DOUBLE LOW-REVERSED-9 QUOTATION MARK is present in the Unicode database and is properly rendered in PowerShell ISE.
Addendum: I found more quotation mark candidates (only the result obtained from the Excerpt_From_UnicodeDataTxt.ps1 script is shown):
PS > $x = .\tests\Excerpt_From_UnicodeDataTxt.ps1 -SearchString "Quotation|Apostrophe" |
Where-Object {$_.Category -match 'Punctuation'}
PS > $x.Count
23
PS > $x
Char CodePoint Category                   Description
---- --------- --------                   -----------
"    U+0022    Po-OtherPunctuation        Quotation Mark
'    U+0027    Po-OtherPunctuation        Apostrophe
«    U+00AB    Pi-InitialQuotePunctuation Left-Pointing Double Angle Quotation Mark
»    U+00BB    Pf-FinalQuotePunctuation   Right-Pointing Double Angle Quotation Mark
՚    U+055A    Po-OtherPunctuation        Armenian Apostrophe
‘    U+2018    Pi-InitialQuotePunctuation Left Single Quotation Mark
’    U+2019    Pf-FinalQuotePunctuation   Right Single Quotation Mark
‚    U+201A    Ps-OpenPunctuation         Single Low-9 Quotation Mark
‛    U+201B    Pi-InitialQuotePunctuation Single High-Reversed-9 Quotation Mark
“    U+201C    Pi-InitialQuotePunctuation Left Double Quotation Mark
”    U+201D    Pf-FinalQuotePunctuation   Right Double Quotation Mark
„    U+201E    Ps-OpenPunctuation         Double Low-9 Quotation Mark
‟    U+201F    Pi-InitialQuotePunctuation Double High-Reversed-9 Quotation Mark
‹    U+2039    Pi-InitialQuotePunctuation Single Left-Pointing Angle Quotation Mark
›    U+203A    Pf-FinalQuotePunctuation   Single Right-Pointing Angle Quotation Mark
❮    U+276E    Ps-OpenPunctuation         Heavy Left-Pointing Angle Quotation Mark Ornament
❯    U+276F    Pe-ClosePunctuation        Heavy Right-Pointing Angle Quotation Mark Ornament
⹂    U+2E42    Ps-OpenPunctuation         Undefined
〝    U+301D    Ps-OpenPunctuation         Reversed Double Prime Quotation Mark
〞    U+301E    Pe-ClosePunctuation        Double Prime Quotation Mark
〟    U+301F    Pe-ClosePunctuation        Low Double Prime Quotation Mark
＂    U+FF02    Po-OtherPunctuation        Fullwidth Quotation Mark
＇    U+FF07    Po-OtherPunctuation        Fullwidth Apostrophe
Answer 2:
I think it's a weird backtick character. At least that's what it's acting like.
If I do this:
$text = "Weird ’ Normal ' Backtick ` Weird ’ "
$text.Replace("’","")
It gives me this:
Weird Normal ' Backtick Weird
So does this work?
powershell -Command "(gc 'C:\LOCATION\Client_List_%DDMMYYYY%.csv').replace('’’', '') |
Out-File 'C:\LOCATION\Client_List_%DDMMYYYY%.csv'"
Doubling a normal backtick makes the script take the character literally. Doubling the weird apostrophe seems to do the same thing; at least, that works in my testing.
Source: https://stackoverflow.com/questions/41488245/cmd-to-powershell-replace-special-character