Compare two lists of files, ignoring file extension in one list

后端未结


I have two lists:

list1 = ['image1.png', 'image2.png', 'image3.png', 'image3.png']
list2 = ['image1.pdf', 'image2.eps', 'image3.ps']


4 Answers

无人及你

2021-01-20 22:09

def filename(name):
    return name.split('.')[0]

list2_filenames = [filename(name) for name in list2]
found_filenames = [name for name in list1 if filename(name) in list2_filenames]


借酒劲吻你

2021-01-20 22:10

from os.path import splitext

list1 = ['image1.png', 'image2.png', 'image3.png', 'image3.png', 'image4.png', 'image3.jpg']
list2 = ['image1.pdf', 'image2.eps', 'image3.ps', 'image5.doc']

# Create a lookup set of the document names sans extensions.
documents = set([splitext(filename)[0] for filename in list2])

# Compare each stripped filename in list1 to the list of stripped document filenames.
matches = [filename for filename in set(list1) if splitext(filename)[0] in documents]

print(matches)

Output (order may vary, because a set is unordered):

['image1.png', 'image2.png', 'image3.png', 'image3.jpg']

Note that splitext strips only the last extension, so files with multiple extensions such as .tar.gz would need different handling; filename.partition(".")[0] would do the trick. The trade-off is that the first dot then delimits the extension, so the base name itself can no longer contain dots.
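To make the difference concrete, here is a minimal sketch comparing the two ways of stripping an extension. The helper names stem_last_dot and stem_first_dot and the sample file names are made up for illustration; they are not part of the answer above.

```python
from os.path import splitext

def stem_last_dot(name):
    # Hypothetical helper: splitext removes only the final extension.
    return splitext(name)[0]

def stem_first_dot(name):
    # Hypothetical helper: partition cuts at the first dot in the name.
    return name.partition(".")[0]

# Sample names chosen for illustration only.
names = ["archive.tar.gz", "report.v2.pdf"]

last_dot = [stem_last_dot(n) for n in names]
first_dot = [stem_first_dot(n) for n in names]

print(last_dot)   # ['archive.tar', 'report.v2']
print(first_dot)  # ['archive', 'report']
```

Which one fits depends on the naming scheme: partition handles .tar.gz, but it also truncates a name like report.v2.pdf at the first dot.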


执笔经年

2021-01-20 22:10

You can try using set to get uniques and a list comprehension to do the comparison:

from os.path import splitext

list1 = ['image1.png', 'image2.png', 'image3.png', 'image3.png']
list2 = ['image1.pdf', 'image2.eps', 'image3.ps']
reference = set([splitext(item)[0] for item in list2])  # Strip the extension
outcome = set([item for item in list1 if splitext(item)[0] in reference])  # Compare
print(outcome)
>>> 
{'image3.png', 'image2.png', 'image1.png'}


感情败类

2021-01-20 22:13

Use a list comprehension with set:

list1 = ["image1.png", "image2.png", "image3.png", "image3.png"]
list2 = ["image1.pdf", "image2.eps", "image3.ps"]

print([x for x in set(list1) for y in set(list2) if x.split('.')[0] == y.split('.')[0]])

Output (order may vary, because a set is unordered):

['image1.png', 'image2.png', 'image3.png']
