问题
issue
- Outlook 2016 corrupted while I was moving items from an online archive into a pst file.
- The PST file has been recovered .... but many items (~7000) are duplicated 5 times
- There are a range of item types, standard messages, meeting requests etc
what I tried
I looked at existing solutions and tools, including:
- duplicate removal tools - none of which were free other than a trial option to remove 10 items at a time.
- A variety of code solutions including:
Jacob Hilderbrand's effort which runs from Excel
Macro in Outlook to delete duplicate emails-
I decided to go the code route as it was relatively simple and to gain more control over how the duplicates were reported.
I will post my self solution below as it may help others.
I would like to see other potential approaches (perhaps powershell) to fixing this problem which may be better than mine.
回答1:
The approach below:
- Provides users with a prompt to select the folder to process
- Checks duplicates on the base of Subject, Sender, CreationTime and Size
- Moved (rather than delete) any duplicates into a sub-folder (removed items) of the folder being processed.
- Create a CSV file - stored under the path in
StrPath
to create a external reference to Outlook of the emails that have been moved.
Updated: Checking for size surprisingly missed a number of dupes, even for otherwise identical mail items. I have changed the test to subject
and body
Tested on Outlook 2016
Const strPath = "c:\temp\deleted msg.csv"
Sub DeleteDuplicateEmails()
Dim lngCnt As Long
Dim objMail As Object
Dim objFSO As Object
Dim objTF As Object
Dim objDic As Object
Dim objItem As Object
Dim olApp As Outlook.Application
Dim olNS As NameSpace
Dim olFolder As Folder
Dim olFolder2 As Folder
Dim strCheck As String
Set objDic = CreateObject("scripting.dictionary")
Set objFSO = CreateObject("scripting.filesystemobject")
Set objTF = objFSO.CreateTextFile(strPath)
objTF.WriteLine "Subject"
Set olApp = Outlook.Application
Set olNS = olApp.GetNamespace("MAPI")
Set olFolder = olNS.PickFolder
If olFolder Is Nothing Then Exit Sub
On Error Resume Next
Set olFolder2 = olFolder.Folders("removed items")
On Error GoTo 0
If olFolder2 Is Nothing Then Set olFolder2 = olFolder.Folders.Add("removed items")
For lngCnt = olFolder.Items.Count To 1 Step -1
Set objItem = olFolder.Items(lngCnt)
strCheck = objItem.Subject & "," & objItem.Body & ","
strCheck = Replace(strCheck, ", ", Chr(32))
If objDic.Exists(strCheck) Then
objItem.Move olFolder2
objTF.WriteLine Replace(objItem.Subject, ", ", Chr(32))
Else
objDic.Add strCheck, True
End If
Next
If objTF.Line > 2 Then
MsgBox "duplicate items were removed to ""removed items""", vbCritical, "See " & strPath & " for details"
Else
MsgBox "No duplicates found"
End If
End Sub
回答2:
Here's a script that takes advantage of sorting emails to check for duplicates much more efficiently.
There's no need to maintain a giant dictionary of every email you've seen if you are processing emails in a deterministic order (e.g. received date). Once the date changes, you know you'll never see another email with the prior date, therefore, they won't be duplicates, so you can clear your dictionary on each date change.
This script also takes into account the fact that some items use an HTMLBody for the full message definition, and others don't have that property.
Sub DeleteDuplicateEmails()
Dim allMails As Outlook.Items
Dim objMail As Object, objDic As Object, objLastMail As Object
Dim olFolder As Folder, olDuplicatesFolder As Folder
Dim strCheck As String
Dim received As Date, lastReceived As Date
Set objDic = CreateObject("scripting.dictionary")
With Outlook.Application.GetNamespace("MAPI")
Set olFolder = .PickFolder
End With
If olFolder Is Nothing Then Exit Sub
On Error Resume Next
Set olDuplicatesFolder = olFolder.Folders("Duplicates")
On Error GoTo 0
If olDuplicatesFolder Is Nothing Then Set olDuplicatesFolder = olFolder.Folders.Add("Duplicates")
Debug.Print "Sorting " & olFolder.Name & " by ReceivedTime..."
Set allMails = olFolder.Items
allMails.Sort "[ReceivedTime]", True
Dim totalCount As Long, index As Long
totalCount = allMails.count
Debug.Print totalCount & " Items to Process..."
lastReceived = "1/1/1987"
For index = totalCount - 1 To 1 Step -1
Set objMail = allMails(index)
received = objMail.ReceivedTime
If received < lastReceived Then
Debug.Print "Error: Expected emails to be in order of date recieved. Previous mail was " & lastReceived _
& " current is " & received
Exit Sub
ElseIf received = lastReceived Then
' Might be a duplicate track mail contents until this recieved time changes.
' Add the last mail to the dictionary if it hasn't been tracked yet
If Not objLastMail Is Nothing Then
Debug.Print "Found multiple emais recieved at " & lastReceived & ", checking for duplicates..."
objDic.Add GetMailKey(objLastMail), True
End If
' Now check the current mail item to see if it's a duplicate
strCheck = GetMailKey(objMail)
If objDic.Exists(strCheck) Then
Debug.Print "Found Duplicate: """ & objMail.Subject & " on " & lastReceived
objMail.Move olDuplicatesFolder
DoEvents
Else
objDic.Add strCheck, True
End If
' No need to track the last mail, since we have it in the dictionary
Set objLastMail = Nothing
Else
' This can't be a duplicate, it has a different date, reset our dictionary
objDic.RemoveAll
lastReceived = received
' Keep track of this mail in case we end up needing to build a dictionary
Set objLastMail = objMail
End If
' Progress update
If index Mod 10 = 0 Then
Debug.Print index & " Remaining..."
End If
DoEvents
Next
Debug.Print "Finished moving Duplicate Emails"
End Sub
And the helper function referenced above for "uniquely identifying" an email. Adapt as needed, but I think if the subject and full body are the same, there's no point in checking anything else. Also works for calendar invites, etc.:
Function GetMailKey(ByRef objMail As Object) As String
On Error GoTo NoHTML
GetMailKey = objMail.Subject & objMail.HTMLBody
Exit Function
BodyKey:
On Error GoTo 0
GetMailKey = objMail.Subject & objMail.Body
Exit Function
NoHTML:
Err.Clear
Resume BodyKey
End Function
来源:https://stackoverflow.com/questions/34669174/remove-duplicate-outlook-items-from-a-folder