PowerShell - Removing all duplicate entries

Submitted by 送分小仙女 on 2020-01-03 02:21:15

Question


I am trying to find a PowerShell command line that reads a text file and removes every line that occurs more than once, keeping none of the copies. I haven't been able to find an answer on Stack Overflow or anywhere else. Every example I have found so far removes the extra copies of a duplicated line but still retains one of them.

Is this possible in PowerShell 2.0?

PowerShell Command Example:

Get-Content "C:\Temp\OriginalFile.txt" | select -unique | Out-File "C:\Temp\ResultFile.txt"

OriginalFile.txt

1
1
1
2
2
3
4

ResultFile.txt (Actual)

1
2
3
4

ResultFile.txt (Desired)

3
4

Answer 1:


PSv2:

$f = 'C:\Temp\OriginalFile.txt'

Get-Content $f | Group-Object | ? { $_.Count -eq 1 } | Select-Object -ExpandProperty Name

PSv3+ allows for a cleaner and more concise solution:

Get-Content $f | Group-Object | ? Count -eq 1 | % Name

For brevity, the commands use built-in aliases ? (for Where-Object) and % (for ForEach-Object).
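
In a script that others will read, the same pipeline can be spelled out with full cmdlet names. The following is only a sketch: the function name Get-SingletonLine is hypothetical (it is not part of the original answer), and the script-block syntax is used so that it should also run on PSv2.

```powershell
# Hypothetical helper name, chosen for illustration only.
function Get-SingletonLine {
    param([Parameter(Mandatory = $true)][string] $Path)

    Get-Content -Path $Path |
        Group-Object |                       # one group per distinct line
        Where-Object { $_.Count -eq 1 } |    # keep groups with a single member
        ForEach-Object { $_.Name }           # emit the line text itself
}

# Usage with the paths from the question:
# Get-SingletonLine -Path 'C:\Temp\OriginalFile.txt' | Out-File 'C:\Temp\ResultFile.txt'
```

Note that Group-Object compares strings case-insensitively by default; if lines that differ only in case should count as distinct, add its -CaseSensitive switch.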

Neither Select-Object -Unique nor Get-Unique seems to allow restricting the output to the singletons in the input (the standard Unix utility uniq has such a feature built in: uniq -u), so a different approach is needed.

The above Group-Object-based solution may not be efficient, but it is convenient:

  • Group-Object groups the lines by identical content, yielding one object per group.

  • ? { $_.Count -eq 1 } then filters the groups down to those that have just 1 member, in effect weeding out all duplicated lines.

  • Select-Object -ExpandProperty Name then transforms the filtered group objects back into the input lines they represent.
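
The three steps above can be followed one at a time with the sample data from the question. This is a sketch for illustration: the lines are inlined as an array instead of being read from a file, and the variable names are made up here.

```powershell
# Sample input from the question, inlined so the sketch runs without a file.
$lines = '1', '1', '1', '2', '2', '3', '4'

# Step 1: one group object per distinct line, carrying Count and Name properties.
$groups = $lines | Group-Object

# Step 2: keep only the groups that have exactly one member.
$singletons = $groups | Where-Object { $_.Count -eq 1 }

# Step 3: turn the surviving group objects back into plain strings.
$result = $singletons | Select-Object -ExpandProperty Name

$result   # outputs 3 and 4 for this input
```

Inspecting $groups at the prompt shows the Count and Name of each group, which makes it easy to check why a given line was kept or dropped.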



Source: https://stackoverflow.com/questions/41833926/powershell-removing-all-duplicate-entries
