Question
#!/usr/bin/env python2.7
import vobject

abfile = '/foo/bar/directory/file.vcf'  # ab stands for address book
ablist = []
with open(abfile) as source_file:
    for vcard in vobject.readComponents(source_file):
        ablist.append(vcard)
print ablist[0] == ablist[1]
The code above should print True, but it prints False: the two vcards are treated as different even though their contents are identical. My ultimate goal is to remove duplicates from the vcard file. Bonus points: is there a way to make the comparison work with one of the fast ways to uniquify a list in Python, such as:
set(ablist)
to remove duplicates (for example, by converting the vcards to strings somehow)? In the code above, len(set(ablist)) returns 2 rather than the expected 1.
In contrast, if instead of comparing the whole vcard we compare just one of its components, as in:
print ablist[0].fn==ablist[1].fn
then we do see the expected behavior and get True.
Here are the contents of the file used in the test (just two identical vcards):
BEGIN:VCARD
VERSION:3.0
FN:Foo_bar1
N:;Foo_bar1;;;
EMAIL;TYPE=INTERNET:foobar1@foo.bar.com
END:VCARD
BEGIN:VCARD
VERSION:3.0
FN:Foo_bar1
N:;Foo_bar1;;;
EMAIL;TYPE=INTERNET:foobar1@foo.bar.com
END:VCARD
Answer 1:
@Brian Barcelona, concerning your answer, just to let you know, instead of:
ablist = []
with open(abfile) as source_file:
    for vcard in vobject.readComponents(source_file):
        ablist.append(vcard)
You could do:
with open(abfile) as source_file:
    ablist = list(vobject.readComponents(source_file))
By the way, I looked at the source code of this module, and your solution is not guaranteed to work: two vcards could contain the same components but in a different order. I think the best approach is to compare each relevant component yourself.
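One way to act on the ordering concern is to compare a normalized form of each card rather than the card object itself. The sketch below is only an illustration under stated assumptions: it operates on the text of a vCard (for example, the string that serialize() produces) rather than on vobject objects, the helper name canonical_key is hypothetical and not part of vobject, and it treats two cards as equal when they contain the same content lines in any order. It does not normalize case, parameter order, or equivalent encodings.

```python
def canonical_key(vcard_text):
    """Return a hashable key that ignores the order of a vCard's lines.

    Hypothetical helper for illustration; it is not part of vobject.
    """
    unfolded = []
    for raw in vcard_text.splitlines():
        if raw[:1] in (" ", "\t") and unfolded:
            # A line starting with whitespace continues the previous line.
            unfolded[-1] += raw[1:]
        elif raw:
            unfolded.append(raw)
    return tuple(sorted(unfolded))


# Same content lines as the test file, with FN and EMAIL swapped in card_b.
card_a = "\r\n".join([
    "BEGIN:VCARD",
    "VERSION:3.0",
    "FN:Foo_bar1",
    "N:;Foo_bar1;;;",
    "EMAIL;TYPE=INTERNET:foobar1@foo.bar.com",
    "END:VCARD",
])
card_b = "\r\n".join([
    "BEGIN:VCARD",
    "VERSION:3.0",
    "EMAIL;TYPE=INTERNET:foobar1@foo.bar.com",
    "N:;Foo_bar1;;;",
    "FN:Foo_bar1",
    "END:VCARD",
])

print(canonical_key(card_a) == canonical_key(card_b))        # True
print(len({canonical_key(card_a), canonical_key(card_b)}))   # 1
```

Because the key is a tuple of strings, it is hashable and can be used directly in a set or as a dict key.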
Answer 2:
I have found that the following works; the insight is to call serialize() on each vcard:
#!/usr/bin/env python2.7
import vobject

abfile = '/foo/bar/directory/file.vcf'  # ab stands for address book
ablist = []
with open(abfile) as source_file:
    for vcard in vobject.readComponents(source_file):
        ablist.append(vcard)
print ablist[0].serialize() == ablist[1].serialize()
However, there should be a better way to do this. Any help would be most welcome!
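Since the serialized form can be compared with ==, it could also serve as a key for set-based de-duplication, which is what the question's bonus part asks about. The following is a minimal sketch, assuming exact equality of the serialized text is an acceptable definition of a duplicate. The helper unique_by_key is hypothetical, and plain strings stand in for the vcards so that the block runs without vobject installed.

```python
def unique_by_key(items, key):
    """Return items without duplicates, keeping the first occurrence of each.

    Hypothetical helper: `key` maps an item to a hashable value that
    decides whether two items count as the same.
    """
    seen = set()
    unique = []
    for item in items:
        k = key(item)
        if k not in seen:
            seen.add(k)
            unique.append(item)
    return unique


# Stand-in data: strings play the role of serialized vcards here.
cards = ["FN:Foo_bar1", "FN:Foo_bar1", "FN:Foo_bar2"]
deduped = unique_by_key(cards, key=lambda text: text)
print(len(deduped))  # 2
```

With the ablist from the code above, the call would presumably look like unique_by_key(ablist, key=lambda vcard: vcard.serialize()). Unlike a bare set(), this keeps the original order of the cards and returns the vcard objects themselves rather than their strings.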
Source: https://stackoverflow.com/questions/41460013/to-remove-vcard-contact-duplicates-comparing-if-two-vcards-are-equal-in-vcf-fi