Question
I am trying to find a PowerShell command line that reads a text file, removes every line that occurs more than once (2+), and retains none of the duplicated lines. I haven't been able to find an answer on Stack Overflow or anywhere else. Every example I have found so far removes the extra copies of a duplicated line but still retains one copy.
Is this possible with PowerShell 2.0?
PowerShell Command Example:
Get-Content "C:\Temp\OriginalFile.txt" | select -Unique | Out-File "C:\Temp\ResultFile.txt"
OriginalFile.txt
1
1
1
2
2
3
4
ResultFile.txt (Actual)
1
2
3
4
ResultFile.txt (Desired)
3
4
Answer 1:
PSv2:
$f = 'C:\Temp\OriginalFile.txt'
Get-Content $f | Group-Object | ? { $_.Count -eq 1 } | Select-Object -ExpandProperty Name
PSv3+ allows for a cleaner and more concise solution:
Get-Content $f | Group-Object | ? Count -eq 1 | % Name
For brevity, the commands use the built-in aliases ? (for Where-Object) and % (for ForEach-Object).
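Spelled out without aliases, and run against the sample lines from the question instead of a file, the PSv2 command might look like the sketch below. The $lines and $result variables are illustrative names, not part of the original answer.

```powershell
# Sample data from the question's OriginalFile.txt, held in memory
# so the snippet runs without touching the file system.
$lines = '1', '1', '1', '2', '2', '3', '4'

# Same pipeline as the PSv2 command, with full cmdlet names instead of aliases.
$result = @(
    $lines |
        Group-Object |
        Where-Object { $_.Count -eq 1 } |
        Select-Object -ExpandProperty Name
)

$result   # outputs 3 and 4, the only lines that occur exactly once
```

To work on the files from the question, replace $lines with Get-Content "C:\Temp\OriginalFile.txt" and pipe the result to Out-File "C:\Temp\ResultFile.txt".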
Neither Select-Object -Unique nor Get-Unique seems to allow restricting the output to singletons in the input (the standard Unix utility uniq has such a feature built in: uniq -u), so a different approach is needed.
The above Group-Object-based solution may not be efficient, but it is convenient:
Group-Object groups the lines by identical content, yielding one object per group.
? { $_.Count -eq 1 } then filters the groups down to those that have just 1 member, in effect weeding out all duplicate lines.
Select-Object -ExpandProperty Name then transforms the filtered group objects back to the input lines they represent.
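If Group-Object turns out to be too slow for a large file, one possible alternative (not part of the original answer) is to count the lines in a hashtable and then make a second pass over the input. This is only a sketch, written to avoid PSv3+ syntax; $lines, $counts and $singletons are illustrative names. Like Group-Object by default, a hashtable created with @{} compares string keys case-insensitively.

```powershell
# Sample lines from the question; in practice these would come from
# Get-Content "C:\Temp\OriginalFile.txt".
$lines = '1', '1', '1', '2', '2', '3', '4'

# First pass: count how often each line occurs.
$counts = @{}
foreach ($line in $lines) {
    if ($counts.ContainsKey($line)) {
        $counts[$line] = $counts[$line] + 1
    } else {
        $counts[$line] = 1
    }
}

# Second pass: keep only the lines that were seen exactly once,
# in the order in which they appear in the input.
$singletons = @($lines | Where-Object { $counts[$_] -eq 1 })

$singletons   # outputs 3 and 4
```

Whether this is actually faster than the Group-Object pipeline depends on the input and the PowerShell version, so it is worth measuring both with Measure-Command before switching.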
Source: https://stackoverflow.com/questions/41833926/powershell-removing-all-duplicate-entries