Python Script to detect broken images

深忆病人 2021-01-24 07:10

I wrote a Python script to detect broken images and count them. The problem is that my script detects all the images and does not detect the broken ones. How can I fix this? I refe

4 answers
  •  借酒劲吻你
    2021-01-24 07:44

    I have added another SO answer here that extends the PIL solution to better detect broken images. I also implemented this solution in my Python script here on GitHub.

    I also verified that damaged files (jpg) are frequently not 'broken' images, i.e., a damaged picture file sometimes remains a legitimate picture file: the original image is lost or altered, but you are still able to load it.

    I quote the other answer for completeness:

    You can use the Python Pillow (PIL) module, which supports most image formats, to check whether a file is a valid and intact image file.

    If you also aim to detect broken images, @Nadia Alramli correctly suggests the im.verify() method, but it does not detect all possible image defects; e.g., im.verify() does not detect truncated images (which most viewers load with a greyed-out area).

    Pillow is able to detect these types of defects too, but you have to apply an image manipulation or an image decode/recode in order to trigger the check. Finally, I suggest using this code:

    from PIL import Image

    try:
      im = Image.open(filename)
      im.verify()  # I also perform verify; I don't know if it catches other types of defects
      im.close()   # reopening is necessary in my case
      im = Image.open(filename)
      im.transpose(Image.FLIP_LEFT_RIGHT)  # forces a full decode of the pixel data
      im.close()
    except Exception:
      # manage exceptions here
      pass
    

    In case of image defects this code will raise an exception. Please consider that im.verify() is about 100 times faster than performing the image manipulation (and I think that flip is one of the cheaper transformations). With this code you can verify a set of images at about 10 MBytes/sec (on a modern 2.5 GHz x86_64 CPU).
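
To connect this back to the original question (counting broken images), the check can be wrapped in a small helper and applied to a folder. This is only a sketch under assumptions: the names is_broken_image and count_broken_images, the extension list, and the choice to treat any exception as "broken" are illustrative and not taken from the script on GitHub.

```python
import os
from PIL import Image


def is_broken_image(path):
    """Return True if Pillow fails to verify or fully decode the file."""
    try:
        with Image.open(path) as im:
            im.verify()  # cheap structural check
        # The image cannot be used after verify(), so reopen it before decoding.
        with Image.open(path) as im:
            im.transpose(Image.FLIP_LEFT_RIGHT)  # forces a full decode
        return False
    except Exception:
        # Assumption: any failure to open, verify or decode counts as broken.
        return True


def count_broken_images(directory,
                        extensions=(".jpg", ".jpeg", ".png", ".gif", ".bmp")):
    """Walk a directory and return (number of image files, list of broken paths)."""
    total = 0
    broken = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            if not name.lower().endswith(extensions):
                continue  # hypothetical filter: only look at common image extensions
            total += 1
            path = os.path.join(root, name)
            # Zero-byte files are reported without invoking Pillow at all.
            if os.path.getsize(path) == 0 or is_broken_image(path):
                broken.append(path)
    return total, broken
```

Calling total, broken = count_broken_images("some_folder") then gives the number of files examined and the paths that failed; len(broken) is the count the question asks for.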

    For other formats (psd, xcf, ...) you can use the ImageMagick wrapper Wand; the code is as follows:

    import wand.image

    im = wand.image.Image(filename=filename)
    im.flip()  # apply a cheap transformation to the loaded image
    im.close()
    

    However, in my experiments Wand does not detect truncated images; I think it loads the missing parts as a greyed-out area without any warning.

    I read that ImageMagick has an external command, identify, that could do the job, but I have not found a way to invoke it programmatically and I have not tested this route.

    I suggest always performing a preliminary check: verifying that the file size is not zero (or very small) is very cheap:

    import os

    statfile = os.stat(filename)
    filesize = statfile.st_size
    if filesize == 0:
      # manage the 'faulty image' case here
      pass
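
The same preliminary check could be packaged as a reusable function that also covers the "very small" case. This is a minimal sketch: the name passes_size_check and the min_bytes parameter are hypothetical, and a sensible threshold depends on the image formats involved.

```python
import os


def passes_size_check(path, min_bytes=1):
    """Return True if the file exists and is at least min_bytes long."""
    try:
        return os.stat(path).st_size >= min_bytes
    except OSError:
        # Assumption: a missing or unreadable file is treated as failing the check.
        return False
```

Files that fail this check can be reported as faulty straight away, so the slower decode is only run on the remaining ones.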
    
