Python Script to detect broken images

深忆病人 2021-01-24 07:10

I wrote a Python script to detect broken images and count them. The problem is that it flags every image instead of only the broken ones. How can I fix this? I refe

4 answers
  • 2021-01-24 07:41

    You are building a bad path with

    img=Image.open('/Users/ajinkyabobade/Desktop/2'+filename)      
    

    Try the following instead (by adding / to the end of the directory path)

    img=Image.open('/Users/ajinkyabobade/Desktop/2/'+filename)      
    

    or

    img=Image.open(os.path.join('/Users/ajinkyabobade/Desktop/2', filename))
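
To see why the missing separator matters, here is a minimal sketch comparing the two ways of building the path. The file name `photo.jpg` is hypothetical and used only for illustration. The concatenated path points at a file that presumably does not exist, so opening it fails for every file, which would explain why every image is reported as broken.

```python
import os

# Directory from the question; the file name is a made-up example.
directory = "/Users/ajinkyabobade/Desktop/2"
filename = "photo.jpg"

concatenated = directory + filename         # no separator between directory and file
joined = os.path.join(directory, filename)  # separator is inserted for you

print(concatenated)  # /Users/ajinkyabobade/Desktop/2photo.jpg
print(joined)
```

Using `os.path.join` also avoids having to remember whether the directory string already ends with a slash.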
    
  • 2021-01-24 07:44

    I have added another SO answer here that extends the PIL solution to better detect broken images. I also implemented this solution in my Python script here on GitHub.

    I also verified that damaged files (jpg) are frequently not 'broken' images, i.e., a damaged picture file sometimes remains a legitimate picture file: the original image is lost or altered, but you can still load it.

    I quote the other answer for completeness:

    You can use the Python Pillow (PIL) module, with most image formats, to check whether a file is a valid and intact image file.

    If you also want to detect broken images, @Nadia Alramli correctly suggests the im.verify() method, but it does not detect every possible image defect; e.g., im.verify does not detect truncated images (which most viewers load with a greyed area).

    Pillow can detect these types of defects too, but you have to apply an image manipulation or an image decode/recode in order to trigger the check. Finally, I suggest using this code:

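    import PIL.Image
    from PIL import Image

    try:
      im = Image.open(filename)  # the Image module has no load() function; use open()
      im.verify()  # I also run verify; I don't know whether it catches other types of defects
      im.close()  # reopening is necessary in my case
      im = Image.open(filename)
      im.transpose(PIL.Image.FLIP_LEFT_RIGHT)  # forces a full decode of the pixel data
      im.close()
    except Exception:
      pass  # manage exceptions here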
    

    If the image is defective, this code raises an exception. Note that im.verify is about 100 times faster than performing the image manipulation (and I think flip is one of the cheaper transformations). With this code you can verify a set of images at about 10 MBytes/sec (modern 2.5 GHz x86_64 CPU).

    For other formats (psd, xcf, ...) you can use Wand, the ImageMagick wrapper. The code is as follows:

    import wand.image

    im = wand.image.Image(filename=filename)
    im.flip()  # flip is a method; it must be called to process the image
    im.close()
    

    However, in my experiments Wand does not detect truncated images; I think it loads the missing parts as a greyed area without warning.

    I read that ImageMagick has an external command, identify, that could do the job, but I have not found a way to invoke it programmatically and I have not tested this route.
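
One possible way to call an external command such as identify from Python is the standard `subprocess` module. The sketch below is untested against damaged images and rests on assumptions: that the executable is named `identify` and is on the PATH (newer ImageMagick installs may expose it as `magick identify` instead), and that a non-zero exit status signals a problem with the file. The helper name `identify_ok` is hypothetical.

```python
import shutil
import subprocess


def identify_ok(path, executable="identify"):
    """Return True/False from the command's exit status, or None if it is not installed."""
    if shutil.which(executable) is None:
        return None  # command not found on PATH, so nothing can be concluded
    result = subprocess.run(
        [executable, path],
        stdout=subprocess.DEVNULL,  # only the exit status is used here
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

Whether this catches truncated files would still need to be checked with real samples.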

    I suggest always performing a preliminary check that the file size is not zero (or very small); it is very cheap:

    import os

    statfile = os.stat(filename)
    filesize = statfile.st_size
    if filesize == 0:
      pass  # manage the 'faulty image' case here
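
The size check can be wrapped in a small helper so that the "very small" threshold is adjustable. This is only a sketch: the function name `is_suspiciously_small` and the `min_bytes` parameter are hypothetical, and a sensible threshold depends on your image formats.

```python
import os
import tempfile


def is_suspiciously_small(path, min_bytes=1):
    """Return True when the file holds fewer than min_bytes bytes."""
    return os.stat(path).st_size < min_bytes


# Demonstration on a throwaway empty file.
with tempfile.NamedTemporaryFile(suffix=".jpg", delete=False) as tmp:
    empty_path = tmp.name
try:
    print(is_suspiciously_small(empty_path))  # True: the file has 0 bytes
finally:
    os.remove(empty_path)
```

Running this check first lets you skip the much slower decode step for files that cannot possibly be valid.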
    
  • 2021-01-24 07:55

    I am getting an error that tells me that Image.load is not available. Image.open appears to work.

    I was also getting errors using:

    except (IOError, SyntaxError) as e:
    

    I just changed that to:

    except:
    

    and it worked fine.

  • 2021-01-24 07:56

    Try the code below; it worked fine for me. It identifies bad/corrupted images and removes them as well. If you prefer, you can just print the bad/corrupted file names and drop the final line that deletes the file.

    import os
    from os import listdir

    from PIL import Image

    for filename in listdir('/Users/ajinkyabobade/Desktop/2/'):
        if filename.endswith('.JPG'):  # case-sensitive: files ending in '.jpg' are skipped
            try:
                img = Image.open('/Users/ajinkyabobade/Desktop/2/'+filename)  # open the image file
                img.verify()  # verify that it is, in fact, an image
            except (IOError, SyntaxError) as e:
                print(filename)  # report the bad/corrupted file
                os.remove('/Users/ajinkyabobade/Desktop/2/'+filename)  # delete it
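
Since the question also asks for a count, here is a sketch of a non-destructive variant that returns the number of files checked and the names of the broken ones instead of deleting anything. The names `count_broken` and `pillow_verify_fails` are hypothetical. The default check uses Pillow's `verify()`, which, as noted in another answer, may miss truncated images; you can pass a stricter check through the `is_broken` parameter.

```python
import os


def pillow_verify_fails(path):
    """Return True if Pillow cannot open or verify the file (requires Pillow)."""
    from PIL import Image  # imported lazily so the counter itself needs only the stdlib
    try:
        with Image.open(path) as img:
            img.verify()
        return False
    except Exception:
        return True


def count_broken(directory, extensions=(".jpg", ".jpeg"), is_broken=pillow_verify_fails):
    """Return (total, broken_names) for files whose extension matches, ignoring case."""
    total = 0
    broken = []
    for name in sorted(os.listdir(directory)):
        if not name.lower().endswith(extensions):
            continue  # skip files that are not candidate images
        total += 1
        if is_broken(os.path.join(directory, name)):
            broken.append(name)
    return total, broken
```

A call such as `total, broken = count_broken('/Users/ajinkyabobade/Desktop/2')` then gives `len(broken)` as the count, and you can review the list before removing anything.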
    