Question
Looking at the PNG specification, it appears that the PNG pixel data chunk starts with IDAT and ends with IEND (slightly clearer explanation here). In the middle are values that don't make sense to me.
How can I get usable RGB values from this without using any libraries (i.e., from the raw binary file)?
As an example, I made a 2x2 px image with 4 black rgb(0,0,0) pixels in Photoshop.
Here's the resulting data (the raw binary input, the hex values, and the human-readable ASCII):
BINARY HEX ASCII
01001001 49 'I'
01000100 44 'D'
01000001 41 'A'
01010100 54 'T'
01111000 78 'x'
11011010 DA '\xda'
01100010 62 'b'
01100000 60 '`'
01000000 40 '@'
00000110 06 '\x06'
00000000 00 '\x00'
00000000 00 '\x00'
00000000 00 '\x00'
00000000 00 '\x00'
11111111 FF '\xff'
11111111 FF '\xff'
00000011 03 '\x03'
00000000 00 '\x00'
00000000 00 '\x00'
00001110 0E '\x0e'
00000000 00 '\x00'
00000001 01 '\x01'
10000011 83 '\x83'
11010100 D4 '\xd4'
11101100 EC '\xec'
10001110 8E '\x8e'
00000000 00 '\x00'
00000000 00 '\x00'
00000000 00 '\x00'
00000000 00 '\x00'
01001001 49 'I'
01000101 45 'E'
01001110 4E 'N'
01000100 44 'D'
Answer 1:
You missed a rather crucial detail in both specifications:
The official one:
.. The IDAT chunk contains the actual image data which is the output stream of the compression algorithm.
[...]
Deflate-compressed datastreams within PNG are stored in the "zlib" format.
Wikipedia:
IDAT contains the image, which may be split among multiple IDAT chunks. Such splitting increases filesize slightly, but makes it possible to generate a PNG in a streaming manner. The IDAT chunk contains the actual image data, which is the output stream of the compression algorithm.
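Both state that the raw image data is compressed. Looking at your data, the first 2 bytes, 78 DA, contain the compression flags as specified in RFC 1950. The rest of the data is the compressed Deflate stream, followed by the 4-byte Adler-32 checksum (00 0E 00 01) that closes the zlib stream.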
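As a minimal sketch of how those two bytes break down under RFC 1950, the following Python snippet decodes them by hand. The function name `parse_zlib_header` and the dictionary keys are made up for illustration; only the bit layout comes from the RFC.

```python
def parse_zlib_header(cmf: int, flg: int) -> dict:
    """Decode the two-byte zlib header (CMF, FLG) laid out in RFC 1950."""
    # The two bytes, read as a big-endian 16-bit number, must be a multiple of 31.
    if (cmf * 256 + flg) % 31 != 0:
        raise ValueError("zlib header check failed")
    return {
        "method": cmf & 0x0F,                   # CM: 8 means Deflate
        "window_size": 1 << ((cmf >> 4) + 8),   # CINFO: LZ77 window size in bytes
        "preset_dict": bool(flg & 0x20),        # FDICT: preset dictionary present?
        "level": flg >> 6,                      # FLEVEL: 0 (fastest) .. 3 (maximum)
    }


header = parse_zlib_header(0x78, 0xDA)
print(header)
```

For 78 DA this reports Deflate with a 32 KiB window, no preset dictionary, and the highest compression level hint.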
Decompressing this with a general zlib-compatible routine shows 14 bytes of output:
00 00 00 00 00 00 00
00 00 00 00 00 00 00
where the first byte of each row is the PNG row filter type (0 for both rows), followed by 2 RGB triplets (0,0,0), giving one row of 7 bytes for each of the 2 lines of your image.
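To reproduce this, here is a short sketch that feeds the bytes from the question to Python's standard `zlib` module, standing in for the "general zlib-compatible routine". It sidesteps writing a Deflate decoder yourself, so it does not satisfy the "no libraries" constraint. The image dimensions and the 8-bit RGB layout are assumptions taken from the question's description, not read from an IHDR chunk.

```python
import zlib

# The IDAT payload from the question: everything between the "IDAT" type
# and the 4-byte chunk CRC (83 D4 EC 8E), which is not part of the stream.
idat_payload = bytes.fromhex(
    "78 DA 62 60 40 06 00 00 00 00 FF FF 03 00 00 0E 00 01"
)

raw = zlib.decompress(idat_payload)

# Assumed from the question: 2x2 image, 8-bit RGB, so 3 bytes per pixel.
width, height, bytes_per_pixel = 2, 2, 3
stride = 1 + width * bytes_per_pixel  # 1 filter byte + the pixel bytes

rows = []
for y in range(height):
    line = raw[y * stride:(y + 1) * stride]
    filter_type = line[0]
    pixels = [tuple(line[i:i + 3]) for i in range(1, stride, 3)]
    rows.append((filter_type, pixels))

print(len(raw))
for filter_type, pixels in rows:
    print(filter_type, pixels)
```

Because both rows use filter type 0 (None), the bytes after the filter byte are already the final RGB values; other filter types need the reconstruction step first.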
"Without using any libraries" you need 3 separate routines to:
- read and parse the PNG superstructure; this provides the IDAT compressed data, as well as essential information such as width, height, and color depth;
- decompress the zlib part(s) into raw binary data;
- parse the decompressed data, handling Adam-7 interlacing if required, and applying row filters.
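The first of these routines might look roughly like the sketch below. It walks the chunk list (4-byte big-endian length, 4-byte type, data, 4-byte CRC), picks out the IHDR fields, and concatenates all IDAT payloads. The names `read_chunks` and `parse_png` are hypothetical, and `zlib.crc32` is used only to keep the CRC check short; a library-free version would need its own CRC-32 routine.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_chunks(data: bytes):
    """Yield (type, body) for each chunk in a PNG byte string."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos < len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, body, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        (crc,) = struct.unpack(">I", data[pos + 8 + length:pos + 12 + length])
        # The CRC covers the type and the body, but not the length field.
        if zlib.crc32(ctype + body) != crc:
            raise ValueError("CRC mismatch in chunk %r" % ctype)
        yield ctype, body
        pos += 12 + length
        if ctype == b"IEND":
            break


def parse_png(data: bytes):
    """Return (ihdr_fields, concatenated_idat_bytes)."""
    ihdr = None
    idat = bytearray()
    for ctype, body in read_chunks(data):
        if ctype == b"IHDR":
            names = ("width", "height", "bit_depth", "color_type",
                     "compression", "filter", "interlace")
            ihdr = dict(zip(names, struct.unpack(">IIBBBBB", body)))
        elif ctype == b"IDAT":
            idat += body  # the zlib stream may span several IDAT chunks
    if ihdr is None:
        raise ValueError("missing IHDR chunk")
    return ihdr, bytes(idat)
```

Calling `parse_png` on the full contents of a file, read in binary mode, returns the header fields and a single zlib stream ready for the decompression step.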
Only after performing these three steps will you have access to the raw image data. Of these, you seem to have a good grasp of step (1). Step (2) is much harder to do yourself; personally, I cheated and used miniz in my own PNG handling programs. Step (3), again, is merely a question of determination. All the necessary bits of information can be found on the web, but it takes a while to put everything in the right order. (Just recently I found an error in my implementation of the Paeth row filter; it had gone unnoticed because that filter is fairly rare in 'real world' images.)
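For step (3), a minimal sketch of row filter reconstruction is shown below, covering the five filter types defined in the PNG specification (None, Sub, Up, Average, Paeth). It assumes a non-interlaced image whose pixels occupy a whole number of bytes (`bpp` bytes per pixel, for example 3 for 8-bit RGB); Adam-7 interlacing and bit depths below 8 are not handled. The function names are made up for illustration.

```python
def paeth(a: int, b: int, c: int) -> int:
    """Paeth predictor: a = left, b = above, c = upper left."""
    p = a + b - c
    pa, pb, pc = abs(p - a), abs(p - b), abs(p - c)
    # Ties are broken in the order a, b, c.
    if pa <= pb and pa <= pc:
        return a
    if pb <= pc:
        return b
    return c


def unfilter(raw: bytes, width: int, height: int, bpp: int) -> list:
    """Undo PNG row filters; returns one bytes object per scanline."""
    stride = width * bpp
    prev = bytearray(stride)  # the row above the first row counts as all zeros
    rows = []
    pos = 0
    for _ in range(height):
        ftype = raw[pos]
        cur = bytearray(raw[pos + 1:pos + 1 + stride])
        pos += 1 + stride
        for i in range(stride):
            a = cur[i - bpp] if i >= bpp else 0   # left (already reconstructed)
            b = prev[i]                           # above
            c = prev[i - bpp] if i >= bpp else 0  # upper left
            if ftype == 0:
                pred = 0
            elif ftype == 1:
                pred = a
            elif ftype == 2:
                pred = b
            elif ftype == 3:
                pred = (a + b) // 2
            elif ftype == 4:
                pred = paeth(a, b, c)
            else:
                raise ValueError("unknown filter type %d" % ftype)
            cur[i] = (cur[i] + pred) & 0xFF  # arithmetic is modulo 256
        rows.append(bytes(cur))
        prev = cur
    return rows
```

For the 14 zero bytes from the question, `unfilter(bytes(14), 2, 2, 3)` yields two rows of six zero bytes each, i.e. four black pixels.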
See "Building a fast PNG encoder issues" for a similar discussion and "Trying to understand zlib/deflate in PNG files" for an in-depth look into the Deflate scheme.
Source: https://stackoverflow.com/questions/26456447/interpret-png-pixel-data