Python Script to detect broken images

允我心安 submitted on 2020-11-29 10:23:09

Question


I wrote a Python script to detect broken images and count them. The problem is that my script flags all the images instead of only the broken ones. How can I fix this? For my code I referred to:

How to check if a file is a valid image file?

My code

import os
from os import listdir
from PIL import Image
count=0
for filename in os.listdir('/Users/ajinkyabobade/Desktop/2'):
    if filename.endswith('.JPG'):
        try:
            img=Image.open('/Users/ajinkyabobade/Desktop/2'+filename)
            img.verify()
        except(IOError,SyntaxError)as e:
            print('Bad file  :  '+filename)
            count=count+1
            print(count)

Answer 1:


I have added another SO answer here that extends the PIL solution to better detect broken images. I also implemented this solution in my Python script here on GitHub.

I also verified that damaged files (jpg) are frequently not 'broken' images, i.e., a damaged picture file sometimes remains a legitimate picture file: the original image is lost or altered, but you are still able to load it.

I quote the other answer for completeness:

You can use the Python Pillow (PIL) module, with most image formats, to check whether a file is a valid and intact image file.

If you also aim to detect broken images, @Nadia Alramli correctly suggests the im.verify() method, but this does not detect all possible image defects; e.g., im.verify does not detect truncated images (which most viewers load with a greyed area).

Pillow is able to detect these types of defects too, but you have to apply an image manipulation or an image decode/recode in order to trigger the check. In the end I suggest using this code:

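from PIL import Image

try:
    im = Image.open(filename)  # Pillow has no module-level Image.load; use Image.open
    im.verify()  # I also perform verify; I don't know whether it sees other types of defects
    im.close()  # reopening is necessary in my case
    im = Image.open(filename)
    im.transpose(Image.FLIP_LEFT_RIGHT)  # forces a decode of the image data
    im.close()
except Exception:
    # manage exceptions here
    pass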

If the image is defective, this code raises an exception. Note that im.verify is about 100 times faster than performing the image manipulation (and I think that flip is one of the cheaper transformations). With this code you can verify a set of images at about 10 MB/s (on a modern 2.5 GHz x86_64 CPU).
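
The verify-then-decode check described above could be wrapped in a reusable function along these lines. This is a minimal sketch, not the answer author's actual script: the names is_image_intact and count_broken_images and the default extension tuple are hypothetical, and it assumes Pillow is installed.

```python
import os
from PIL import Image


def is_image_intact(path):
    """Return True if Pillow can both verify and fully decode the file."""
    try:
        # Pass 1: verify() is the cheap check.
        with Image.open(path) as im:
            im.verify()
        # Pass 2: the image cannot be used after verify(), so reopen it
        # and force a decode, which is what exposes truncated data.
        with Image.open(path) as im:
            im.transpose(Image.FLIP_LEFT_RIGHT)
        return True
    except Exception:  # Pillow can raise several exception types here
        return False


def count_broken_images(directory, extensions=(".jpg", ".jpeg")):
    """Print and count the files in directory that fail the check."""
    broken = 0
    for name in sorted(os.listdir(directory)):
        # Compare in lower case so that .JPG and .jpg are both matched.
        if not name.lower().endswith(extensions):
            continue
        if not is_image_intact(os.path.join(directory, name)):
            print("Bad file:", name)
            broken += 1
    return broken
```

Calling count_broken_images('/Users/ajinkyabobade/Desktop/2') would then print each file that fails either pass and return the total.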

For other formats (psd, xcf, ...) you can use the ImageMagick wrapper Wand. The code is as follows:

import wand.image

im = wand.image.Image(filename=filename)
im.flip()  # flip is a method and must be called; it modifies im in place
im.close()

However, in my experiments Wand does not detect truncated images; I think it loads the missing parts as a greyed area without any warning.

I read that ImageMagick has an external command, identify, that could do the job, but I have not found a way to invoke it programmatically and I have not tested this route.
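
One general way to call an external command such as identify from Python is the standard subprocess module. The sketch below is as untested against real damaged images as the route it illustrates: the function name identify_accepts is hypothetical, it assumes the identify executable is on the PATH, and it assumes that a non-zero exit status means the file could not be read. Whether identify reports truncated images is not established here.

```python
import shutil
import subprocess


def identify_accepts(path, timeout=30):
    """Run the external `identify` command on path.

    Returns True on exit status 0, False on a non-zero status,
    and None when the executable cannot be found.
    """
    exe = shutil.which("identify")
    if exe is None:
        return None
    result = subprocess.run(
        [exe, path],
        stdout=subprocess.DEVNULL,  # only the exit status is used
        stderr=subprocess.DEVNULL,
        timeout=timeout,
    )
    return result.returncode == 0
```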

I suggest always performing a preliminary check: verifying that the file size is not zero (or very small) is very cheap:

import os

statfile = os.stat(filename)
filesize = statfile.st_size
if filesize == 0:
    # manage the 'faulty image' case here
    pass
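
The size pre-check above could be packaged as a small helper so that it runs before any decoding. This is only a sketch: the name passes_size_check and the min_bytes parameter are hypothetical, and a sensible threshold for "very small" depends on the image format.

```python
import os


def passes_size_check(path, min_bytes=1):
    """Cheap pre-filter: return False if the file is smaller than min_bytes."""
    return os.stat(path).st_size >= min_bytes
```

A file that fails this check can be counted as faulty straight away, without opening it with Pillow at all.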



Answer 2:


You are building a bad path with

img=Image.open('/Users/ajinkyabobade/Desktop/2'+filename)      

Try the following instead (by adding / to the end of the directory path)

img=Image.open('/Users/ajinkyabobade/Desktop/2/'+filename)      

or

img=Image.open(os.path.join('/Users/ajinkyabobade/Desktop/2', filename))
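
To see why the original path is bad, it may help to build both strings side by side. In the sketch below the file name photo.JPG is a made-up example, and the joined result shown in the comment assumes a POSIX system such as macOS or Linux.

```python
import os

directory = '/Users/ajinkyabobade/Desktop/2'
filename = 'photo.JPG'  # hypothetical file name, for illustration only

# Plain concatenation glues the file name onto the directory name.
bad_path = directory + filename
# os.path.join inserts the separator between the two parts.
good_path = os.path.join(directory, filename)

print(bad_path)   # /Users/ajinkyabobade/Desktop/2photo.JPG
print(good_path)  # /Users/ajinkyabobade/Desktop/2/photo.JPG
```

Because the concatenated path points to a file that does not exist, Image.open raises FileNotFoundError, which is a subclass of IOError (OSError) in Python 3. The except clause in the question catches it, so every file is reported as bad.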



Answer 3:


Try the code below; it worked fine for me. It identifies the bad/corrupted images and removes them as well. If you prefer, you can print only the bad/corrupted file names and remove the final line that deletes the file.

import os
from os import listdir
from PIL import Image

for filename in listdir('/Users/ajinkyabobade/Desktop/2/'):
    if filename.endswith('.JPG'):
        try:
            img = Image.open('/Users/ajinkyabobade/Desktop/2/'+filename)  # open the image file
            img.verify()  # verify that it is, in fact, an image
        except (IOError, SyntaxError) as e:
            print(filename)  # report the bad file
            os.remove('/Users/ajinkyabobade/Desktop/2/'+filename)  # delete the bad file



Answer 4:


I am getting an error that tells me that Image.load is not available. Image.open appears to work.

I was also getting errors using:

except (IOError, SyntaxError) as e:

I just changed that to:

except:

and it worked fine.



Source: https://stackoverflow.com/questions/46854496/python-script-to-detect-broken-images