I have several thousand duplicate files (jar files as an example) that I'd like to use PowerShell to delete.
Even though the question is old, I recently needed to clean up duplicate files based on their content. The idea is simple, but the pipeline to implement it is not entirely straightforward. Here is a function that accepts a path parameter and deletes the duplicates found under that path.
Function Delete-Duplicates {
    param(
        [Parameter(
            Mandatory=$True,
            ValueFromPipeline=$True,
            ValueFromPipelineByPropertyName=$True
        )]
        [string[]]$PathDuplicates
    )

    # -File keeps directories out of the pipeline; Get-FileHash would error on them
    $DuplicatePaths =
        Get-ChildItem $PathDuplicates -File |
        Get-FileHash |
        Group-Object -Property Hash |
        Where-Object -Property Count -gt 1 |
        ForEach-Object {
            # Keep the last file of each group; mark the rest for deletion
            $_.Group.Path |
            Select-Object -First ($_.Count - 1)
        }

    $TotalCount = (Get-ChildItem $PathDuplicates -File).Count
    Write-Warning ("You are going to delete {0} files out of {1} total. Please confirm the prompt." -f $DuplicatePaths.Count, $TotalCount)
    $DuplicatePaths | Remove-Item -Confirm
}
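For example, assuming a hypothetical folder C:\Temp\Jars holding the duplicate-ridden files, the call could look like this:

# Hypothetical path; substitute your own folder
Delete-Duplicates -PathDuplicates 'C:\Temp\Jars'

# The parameter also accepts pipeline input
'C:\Temp\Jars' | Delete-Duplicates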
The script
a) Lists all files under the given path (Get-ChildItem -File, so directories are skipped)
b) Retrieves a FileHash for each of them
c) Groups them by the Hash property (so all identical files end up in a single group)
d) Filters out the already-unique files (groups with a Count of 1)
e) Loops through each remaining group and lists all but the last path (demonstrated just after this list), ensuring one file of each "Hash" always stays
f) Warns before proceeding, saying how many files there are in total and how many are going to be deleted
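To make step (e) concrete, here is a small self-contained demonstration of the "keep one per group" selection, using a hand-built stand-in for what Group-Object produces (the paths are made up):

# A fake group of three identical files, mimicking Group-Object output
$group = [PSCustomObject]@{
    Count = 3
    Group = @(
        [PSCustomObject]@{ Path = 'C:\Temp\a.jar' }
        [PSCustomObject]@{ Path = 'C:\Temp\copy-of-a.jar' }
        [PSCustomObject]@{ Path = 'C:\Temp\another-copy.jar' }
    )
}

# Selects the first two paths, so 'another-copy.jar' is the one that survives
$group.Group.Path | Select-Object -First ($group.Count - 1)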
Probably not the fastest option (it hashes every file; Get-FileHash uses SHA-256 by default), but hashing guarantees that a file really is a duplicate. Works perfectly fine for me :)
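If several thousand files make the hashing too slow, a common refinement (a sketch of my own, not part of the function above) is to group by file size first, so that only files sharing a Length ever get hashed:

# Sketch: pre-filter by size before hashing; same top-level, non-recursive scope
Get-ChildItem $PathDuplicates -File |
    Group-Object -Property Length |
    Where-Object -Property Count -gt 1 |
    ForEach-Object { $_.Group } |
    Get-FileHash |
    Group-Object -Property Hash |
    Where-Object -Property Count -gt 1 |
    ForEach-Object {
        $_.Group.Path | Select-Object -First ($_.Count - 1)
    }

A file with a unique size cannot have a content duplicate, so it is skipped entirely; everything that survives the size check still gets the full hash comparison.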