Compare two lists of files, ignoring file extension in one list

我在风中等你 2021-01-20 21:59

I have two lists

list1 = ['image1.png', 'image2.png', 'image3.png', 'image3.png']
list2 = ['image1.pdf', 'image2.eps', 'image3.ps']

4 answers
  • 2021-01-20 22:09
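    def filename(name):
        # Keep everything before the first dot, e.g. 'image1.png' -> 'image1'.
        return name.split('.')[0]
    
    # Base names present in list2, stored in a set for fast membership tests.
    list2_filenames = {filename(name) for name in list2}
    # Entries of list1 whose base name also appears in list2 (duplicates are kept).
    found_filenames = [name for name in list1 if filename(name) in list2_filenames]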
    
  • 2021-01-20 22:10
    from os.path import splitext
    
    list1 = ['image1.png', 'image2.png', 'image3.png', 'image3.png', 'image4.png', 'image3.jpg']
    list2 = ['image1.pdf', 'image2.eps', 'image3.ps', 'image5.doc']
    
    # Create a lookup set of the document names sans extensions.
    documents = {splitext(filename)[0] for filename in list2}
    
    # Compare each stripped filename in list1 to the set of stripped document filenames.
    # set(list1) drops duplicates but does not preserve order.
    matches = [filename for filename in set(list1) if splitext(filename)[0] in documents]
    
    print(matches)
    

    Output (order may vary, because sets are unordered):

    ['image1.png', 'image2.png', 'image3.png', 'image3.jpg']
    

    Note that this would need adapting for files with multiple extensions such as .tar.gz, because splitext strips only the last extension. Using filename.partition(".")[0] instead would do the trick, but then the first dot always marks the start of the extension, so dots could no longer appear anywhere else in the filename.
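
    To make the trade-off concrete, here is a minimal sketch comparing the two ways of stripping extensions. The helper names stem_last and stem_first and the sample names are made up for illustration; they are not part of the answer above.

```python
from os.path import splitext

def stem_last(name):
    # splitext removes only the final extension.
    return splitext(name)[0]

def stem_first(name):
    # partition cuts at the first dot, so everything after it is dropped.
    return name.partition(".")[0]

# Hypothetical sample names chosen to show where the two approaches differ.
names = ["archive.tar.gz", "my.photo.png", "image1.png"]

last_stems = [stem_last(n) for n in names]
first_stems = [stem_first(n) for n in names]

print(last_stems)   # ['archive.tar', 'my.photo', 'image1']
print(first_stems)  # ['archive', 'my', 'image1']
```

    Which one fits depends on the data: splitext is the safer choice when base names may contain dots, while partition suits names where everything after the first dot counts as the extension.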

  • 2021-01-20 22:10

    You can try using set to get uniques and a list comprehension to do the comparison:

    from os.path import splitext
    
    list1 = ['image1.png', 'image2.png', 'image3.png', 'image3.png']
    list2 = ['image1.pdf', 'image2.eps', 'image3.ps']
    reference = {splitext(item)[0] for item in list2}  # Strip the extension
    outcome = {item for item in list1 if splitext(item)[0] in reference}  # Compare
    print(outcome)
    >>> 
    {'image3.png', 'image2.png', 'image1.png'}
    
  • 2021-01-20 22:13

    Use a list comprehension with set:

    list1 = ["image1.png", "image2.png", "image3.png", "image3.png"]
    list2 = ["image1.pdf", "image2.eps", "image3.ps"]
    
    print([x for x in set(list1) for y in set(list2) if x.split('.')[0] == y.split('.')[0]])
    

    Output (order may vary, because sets are unordered):

    ['image1.png', 'image2.png', 'image3.png']
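
    The nested comprehension compares every name in list1 against every name in list2, and iterating over sets leaves the result order unspecified. As a rough sketch of one alternative (assuming Python 3.7+, where dict preserves insertion order), the same matches can be found with a single set lookup per name while keeping the original order of list1. The variable names stems, unique_in_order and matches are illustrative only.

```python
from os.path import splitext

list1 = ["image1.png", "image2.png", "image3.png", "image3.png"]
list2 = ["image1.pdf", "image2.eps", "image3.ps"]

# Base names from list2, stored in a set so each membership test is cheap.
stems = {splitext(name)[0] for name in list2}

# dict.fromkeys drops duplicates while keeping first-seen order.
unique_in_order = dict.fromkeys(list1)

matches = [name for name in unique_in_order if splitext(name)[0] in stems]
print(matches)  # ['image1.png', 'image2.png', 'image3.png']
```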
    