PIL cannot identify image file for io.BytesIO object

I am using the Pillow fork of PIL and keep receiving the error

OSError: cannot identify image file <_io.BytesIO object at 0x103a47468>

when trying to open an image. I am using virtualenv with python 3.4 and no installation of PIL.

I have tried to find a solution to this based on others encountering the same problem, however, those solutions did not work for me. Here is my code:

from PIL import Image
import io

# This portion is part of my test code
byteImg = Image.open("some/location/to/a/file/in/my/directories.png").tobytes()

# Non test code
dataBytesIO = io.BytesIO(byteImg)
Image.open(dataBytesIO) # <- Error here

The image exists in the initial opening of the file and it gets converted to bytes. This appears to work for almost everyone else but I can't figure out why it fails for me.

EDIT:

dataBytesIO.seek(0)

does not work as a solution (tried it) since I'm not saving the image via a stream, I'm just instantiating the BytesIO with data, therefore (if I'm thinking of this correctly) seek should already be at 0.

(This solution is from the author himself. I have just moved it here.)

SOLUTION:

# This portion is part of my test code
byteImgIO = io.BytesIO()
byteImg = Image.open("some/location/to/a/file/in/my/directories.png")
byteImg.save(byteImgIO, "PNG")
byteImgIO.seek(0)
byteImg = byteImgIO.read()


# Non test code
dataBytesIO = io.BytesIO(byteImg)
Image.open(dataBytesIO)

The problem was with the way that Image.tobytes()was returning the byte object. It appeared to be invalid data and the 'encoding' couldn't be anything other than raw which still appeared to output wrong data since almost every byte appeared in the format \xff\. However, saving the bytes via BytesIO and using the .read() function to read the entire image gave the correct bytes that when needed later could actually be used.

It happened to me when accidentally loading a PDF instead of PNG.

While reading Dicom files the problem might be caused due to Dicom compression. Make sure both gdcm and pydicom are installed.

GDCM is usually the one thats more difficult to install. The latest way to easily install the same is

conda install -U conda-forge gdcm

On some cases the same error happens when you are dealing with a Raw Image file such CR2. Example: http://www.rawsamples.ch/raws/canon/g10/RAW_CANON_G10.CR2

when you try to run:

byteImg = Image.open("RAW_CANON_G10.CR2")

You will get this error:

OSError: cannot identify image file 'RAW_CANON_G10.CR2'

So you need to convert the image using rawkit first, here is an example how to do it:

from io import BytesIO
from PIL import Image, ImageFile
import numpy
from rawkit import raw
def convert_cr2_to_jpg(raw_image):
    raw_image_process = raw.Raw(raw_image)
    buffered_image = numpy.array(raw_image_process.to_buffer())
    if raw_image_process.metadata.orientation == 0:
        jpg_image_height = raw_image_process.metadata.height
        jpg_image_width = raw_image_process.metadata.width
    else:
        jpg_image_height = raw_image_process.metadata.width
        jpg_image_width = raw_image_process.metadata.height
    jpg_image = Image.frombytes('RGB', (jpg_image_width, jpg_image_height), buffered_image)
    return jpg_image

byteImg = convert_cr2_to_jpg("RAW_CANON_G10.CR2")

Code credit if for mateusz-michalik on GitHub (https://github.com/mateusz-michalik/cr2-to-jpg/blob/master/cr2-to-jpg.py)

来源：https://stackoverflow.com/questions/31077366/pil-cannot-identify-image-file-for-io-bytesio-object

标签

python

pillow

bytesio