Question
As a follow-up, suggested by Doug, to my previous question on anonymizing a file (PowerShell - Find and replace multiple patterns to anonymize file), I need to save all hash table values in a single file, "tmp.txt", for further processing. For example, after processing an input file containing a string like:
<requestId>qwerty-qwer12-qwer56</requestId>
the tmp.txt file contains:
qwerty-qwer12-qwer56 : RequestId-1
and this is exactly what I want. The problem is that when the input contains many strings, tmp.txt ends up with more pairs than it should. In my example below, tmp.txt should contain 4 "RequestId-x" lines, but there are 6. Also, when there are 2 or more matches on the same line, only the first is updated/replaced. Any idea where these extra lines come from? And why doesn't the script keep checking until the end of the line?
Here is my test code:
$log = "C:\log.txt"
$tmp = "C:\tmp.txt"
Clear-Content $log
Clear-Content $tmp
@'
<requestId>qwerty-qwer12-qwer56</requestId>qwertykeyId>Qwd84lPhjutf7Nmwr56hJndcsjy34imNQwd84lPhjutZ7Nmwr56hJndcsjy34imNPozDr5</ABC reportId>poGd56Hnm9q3Dfer6Jh</msg:reportId>
<requestId>zxcvbn-zxcv12-zxcv56</requestId>
<requestId>qwerty-qwer12-qwer56</requestId>abcde reportId>plmkjh8765FGH4rt6As</msg:reportId>
<requestId>1234qw-12qw12-12qw56</requestId>
keyId>Qwd84lPhjutf7Nmwr56hJndcsjy34imNQwd84lPhjutZ7Nmwr56hJndcsjy34imNPozDr5</
keyId>Qwd84lPhjutf7Nmwr56hJndcsjy34imNQwd84lPhjutZ7Nmwr56hJndcsjy34imNPozDr5</
keyId>Zdjgi76Gho3sQw0ib5Mjk3sDyoq9zmGdZdjgi76Gho3sQw0ib5Mjk3sDyoq9zmGdLkJpQw</
reportId>plmkjh8765FGH4rt6As</msg:reportId>
reportId>plmkjh8765FGH4rt6As</msg:reportId>
reportId>poGd56Hnm9q3Dfer6Jh</msg:reportId>
'@ | Set-Content $log -Encoding UTF8
$requestId = @{
    Count   = 1
    Matches = @()
}
$keyId = @{
    Count   = 1
    Matches = @()
}
$reportId = @{
    Count   = 1
    Matches = @()
}
$output = switch -Regex -File $log {
    '(\w{6}-\w{6}-\w{6})' {
        if(!$requestId.matches.($matches.1))
        {
            $req = $requestId.matches += @{$matches.1 = "RequestId-$($requestId.count)"}
            $requestId.count++
            $req.keys | %{ Add-Content $tmp "$_ : $($req.$_)" }
        }
        $_ -replace $matches.1,$requestId.matches.($matches.1)
    }
    'keyId>(\w{70})</' {
        if(!$keyId.matches.($matches.1))
        {
            $kid = $keyId.matches += @{$matches.1 = "keyId-$($keyId.count)"}
            $keyId.count++
            $kid.keys | %{ Add-Content $tmp "$_ : $($kid.$_)" }
        }
        $_ -replace $matches.1,$keyId.matches.($matches.1)
    }
    'reportId>(\w{19})</msg:reportId>' {
        if(!$reportId.matches.($matches.1))
        {
            $repid = $reportId.matches += @{$matches.1 = "Report-$($reportId.count)"}
            $reportId.count++
            $repid.keys | %{ Add-Content $tmp "$_ : $($repid.$_)" }
        }
        $_ -replace $matches.1,$reportId.matches.($matches.1)
    }
    default {$_}
}
$output | Set-Content $log -Encoding UTF8
Get-Content $log
Get-Content $tmp
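Both symptoms can be reproduced in isolation. The sketch below is a possible explanation rather than a confirmed diagnosis: it assumes the extra lines come from the chained assignment (`$req = $requestId.matches += ...`), whose value is the whole array after the append, and that the per-line behaviour comes from `switch` running every clause whose pattern matches, each against the original line. The names `$store`, `$first`, `$second`, `$entry` and `$hits`, and the sample strings, are made up for illustration.

```powershell
# Same shape as the stores in the script: a counter plus an array of one-entry hashtables.
$store = @{
    Count   = 1
    Matches = @()
}

# A chained assignment hands the result of '+=' to the left-hand variable.
# That result is the whole array after the append, not only the new entry.
$first  = $store.Matches += @{ 'aaa' = 'Id-1' }
$second = $store.Matches += @{ 'bbb' = 'Id-2' }
"first holds $(@($first).Count) entry, second holds $(@($second).Count) entries"

# Keeping the new entry in its own variable writes each pair only once.
$entry = @{ 'ccc' = 'Id-3' }
$store.Matches += $entry
$entry.Keys | ForEach-Object { "$_ : $($entry.$_)" }

# Without 'break' or 'continue', switch runs every matching clause,
# and each clause works on the unmodified input line.
$hits = switch -Regex ('id=1 key=2') {
    'id=(\d)'  { $_ -replace $matches.1, 'X' }
    'key=(\d)' { $_ -replace $matches.1, 'Y' }
}
$hits
```

Under these assumptions, three distinct request IDs would produce 1 + 2 + 3 = 6 lines in tmp.txt, and a line matching several clauses would be emitted once per clause with a single replacement each time.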
Answer 1:
If you don't care about the order in which they were found, which I assume you don't if you don't want duplicates, just export them all at the end. I would still keep them in "object" form so you can easily import and export them. CSV would be an ideal format for the data.
$requestId,$keyid,$reportid | Foreach-Object {
    foreach($key in $_.matches.keys)
    {
        [PSCustomObject]@{
            Original    = $key
            Replacement = $_.matches.$key
        }
    }
}
The data output to the console for this example:
Original                                                               Replacement
--------                                                               -----------
qwerty-qwer12-qwer56                                                   RequestId-1
zxcvbn-zxcv12-zxcv56                                                   RequestId-2
1234qw-12qw12-12qw56                                                   RequestId-3
Qwd84lPhjutf7Nmwr56hJndcsjy34imNQwd84lPhjutZ7Nmwr56hJndcsjy34imNPozDr5 keyId-1
Zdjgi76Gho3sQw0ib5Mjk3sDyoq9zmGdZdjgi76Gho3sQw0ib5Mjk3sDyoq9zmGdLkJpQw keyId-2
poGd56Hnm9q3Dfer6Jh                                                    Report-1
plmkjh8765FGH4rt6As                                                    Report-2
To save it to the file, just pipe it into Export-Csv:
$requestId,$keyid,$reportid | Foreach-Object {
    foreach($key in $_.matches.keys)
    {
        [PSCustomObject]@{
            Original    = $key
            Replacement = $_.matches.$key
        }
    }
} | Export-Csv $tmp -NoTypeInformation
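Reading the exported file back for further processing could look like the sketch below. It is a minimal round trip, not part of the original answer: `$mappings` is a hypothetical stand-in for the pairs collected by the script (a single ordered hashtable, for brevity), and `$csvPath` and `$lookup` are assumed names. It writes to the temp folder instead of C:\tmp.txt.

```powershell
# Hypothetical sample mappings standing in for the pairs collected by the script.
$mappings = [ordered]@{
    'qwerty-qwer12-qwer56' = 'RequestId-1'
    'zxcvbn-zxcv12-zxcv56' = 'RequestId-2'
}

# Assumed scratch location; the script in the question uses C:\tmp.txt instead.
$csvPath = Join-Path ([System.IO.Path]::GetTempPath()) 'anonymize-map.csv'

# Export one object per pair, with the same column names as above.
$mappings.GetEnumerator() | ForEach-Object {
    [PSCustomObject]@{
        Original    = $_.Key
        Replacement = $_.Value
    }
} | Export-Csv $csvPath -NoTypeInformation

# Import the rows and rebuild a hashtable keyed by the original value.
$lookup = @{}
Import-Csv $csvPath | ForEach-Object { $lookup[$_.Original] = $_.Replacement }

# Look up the replacement for a known original value.
$lookup['zxcvbn-zxcv12-zxcv56']
```

Import-Csv returns every field as a string, which is all a lookup table like this needs.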
Source: https://stackoverflow.com/questions/64901869/powershell-store-hash-table-in-file-and-read-its-content