Let's look at the problems at the algorithmic level.
Your get_image_data() does not handle the PPM format (Netpbm P6 format) correctly. Just like the other binary Netpbm formats -- PBM, PGM, PPM, PNM --, the P6 format can have comments before the maximum component value (which is followed by exactly one newline, \n, followed by the binary data).
(Although the Wikipedia Netpbm format article says that a comment is possible even after the maximum component value, that makes the binary formats ambiguous, as a parser cannot tell whether a # (binary \x23) is part of the image data or the start of a comment. So, a lot of utilities do not allow a comment after the last header value at all, to keep the formats unambiguous.)
To parse the binary Netpbm formats correctly in C, you need to read the first two characters of the file or stream first, to detect the format. The rest of the header values are all nonnegative integers, and can be scanned using a single function that also skips comment lines. If we use the C I/O facilities, then we can write that function easily using the one-character pushback facility; in pseudocode,
    Function pnm_value(stream):
        Read one character from stream into c
        Loop:
            If c == EOF:
                Premature end of input; fail.
            If c == '#':
                Loop:
                    Read one character from stream into c
                    If c is EOF or '\n', break loop
                End loop
                Continue at the start of the outer loop
            If c is a '\t', '\n', '\v', '\f', '\r', or ' ':
                Read one character from stream into c
                Continue at the start of the outer loop
            Otherwise break loop
        End loop
        If c is not a digit:
            Invalid input; fail
        Value = 0
        While c is a digit:
            OldValue = Value
            Value = 10*Value + (value of digit c)
            If (Value / 10 != OldValue):
                Value is too large; fail
            Read one character from stream into c
        End While
        If c is not EOF:
            Push (unget) c back to stream
        Return Value
    End function
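The pseudocode above could be written in C roughly as follows. This is only a sketch: the return type (long) and the convention of returning -1 on any failure are my own assumptions, and the overflow test is done before the multiplication, because signed overflow is undefined behaviour in C.

```c
#include <stdio.h>
#include <limits.h>

/* Read the next nonnegative integer from a Netpbm header,
   skipping whitespace and '#' comment lines.
   Returns the value, or -1 on error (premature end of input,
   non-digit, or overflow); the -1 convention is an assumption. */
long pnm_value(FILE *stream)
{
    int c = getc(stream);

    /* Skip whitespace and comments preceding the value. */
    for (;;) {
        if (c == EOF)
            return -1;              /* Premature end of input */
        if (c == '#') {
            /* Comment: discard everything up to the end of the line. */
            do {
                c = getc(stream);
            } while (c != EOF && c != '\n');
            continue;
        }
        if (c == '\t' || c == '\n' || c == '\v' ||
            c == '\f' || c == '\r' || c == ' ') {
            c = getc(stream);
            continue;
        }
        break;
    }

    if (c < '0' || c > '9')
        return -1;                  /* Invalid input: not a number */

    long value = 0;
    while (c >= '0' && c <= '9') {
        const int digit = c - '0';
        if (value > (LONG_MAX - digit) / 10)
            return -1;              /* Value is too large */
        value = 10 * value + digit;
        c = getc(stream);
    }

    /* The character following the number belongs to the caller. */
    if (c != EOF)
        ungetc(c, stream);

    return value;
}
```

For a P6 file you would call it three times after the magic number, to obtain the width, the height, and the maxval, in that order.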
After you have read the header fields using the above function, for binary formats you should read one more character from the stream or file, and it must be a newline \n for the format to be valid (and unambiguous).
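The two steps that bracket the header values, detecting the format and consuming the final newline, might look like this. The function names pnm_magic() and pnm_end_of_header() are made up for this illustration, and following the text above, only '\n' is accepted as the header terminator.

```c
#include <stdio.h>

/* Read the two-character magic number.
   Returns the format digit (1 to 6, so 6 for P6), or -1 if the
   stream does not start with 'P' followed by a digit 1..6. */
int pnm_magic(FILE *stream)
{
    int c = getc(stream);
    if (c != 'P')
        return -1;

    c = getc(stream);
    if (c < '1' || c > '6')
        return -1;

    return c - '0';
}

/* Consume the single newline that ends the header of a binary format.
   Returns 0 if the next character is '\n', -1 otherwise. */
int pnm_end_of_header(FILE *stream)
{
    return (getc(stream) == '\n') ? 0 : -1;
}
```

The header values themselves are read between these two calls.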
Binary data can be read in C using getc(stream); there is no need to use fread(). It is fast, too, because getc() is often implemented as a macro (one that may evaluate its argument, stream, more than once; that does not harm anything in this particular case).
For the P6 format, if the maxval field in the header (the third value, after width and height in pixels) is at most 255, there are width×height×3 chars of data; red component first, then the green, and finally blue.
If the maxval field is 256 to 65535, there are width×height×6 chars of data in the P6 format. In each set of six characters, the first two are red, the next two green, and the last two blue components; with the most significant byte first.
For High Dynamic Range images, including exploration of different color spaces, I recommend using a data structure with 64 bits per pixel, 20 bits per component. For example,
    #include <stddef.h>
    #include <stdint.h>

    typedef struct {
        size_t    width;
        size_t    height;
        size_t    stride;   /* Usually == width */
        uint64_t *pixel;    /* i = y*stride + x */
        void     *data;     /* Origin of allocated pixel data */
    } image;
A separate stride lets you allocate the pixel map with extra pixels, in case you wish to e.g. apply a filter kernel to the data; then you do not need to handle border pixels in any special way, just initialize them to appropriate colors (duplicating the image edge pixels, typically).
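To illustrate how the stride and the separate data pointer work together, here is one possible allocator. The function image_alloc() and its border parameter are hypothetical, invented for this sketch; it reserves border extra pixels on every side and points pixel at the upper left corner of the actual image.

```c
#include <stdlib.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    size_t    width;
    size_t    height;
    size_t    stride;   /* Here: width + 2*border */
    uint64_t *pixel;    /* i = y*stride + x */
    void     *data;     /* Origin of allocated pixel data */
} image;

/* Allocate a zero-filled image with 'border' extra pixels on each side.
   Returns 0 if successful, -1 on invalid size or allocation failure. */
int image_alloc(image *img, size_t width, size_t height, size_t border)
{
    if (width < 1 || height < 1)
        return -1;

    /* Refuse sizes that would overflow size_t. */
    if (border > (SIZE_MAX - width) / 2 || border > (SIZE_MAX - height) / 2)
        return -1;

    const size_t stride = width + 2 * border;
    const size_t rows   = height + 2 * border;

    if (rows > SIZE_MAX / sizeof (uint64_t) / stride)
        return -1;

    img->data = calloc(stride * rows, sizeof (uint64_t));
    if (!img->data)
        return -1;

    img->width  = width;
    img->height = height;
    img->stride = stride;

    /* Skip 'border' full rows, plus 'border' pixels on the first image row. */
    img->pixel  = (uint64_t *)img->data + border * stride + border;
    return 0;
}
```

Because data is kept separately, free(img.data) releases the whole allocation even though pixel points into the middle of it.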
When reading PNM files into the above data structure, instead of saving whatever value you read from the file, you compute
component = (1048575 * file_component) / maxvalue;
for each color component read from the file. This ensures that you always have component values between 0 and 1048575 for each component, regardless of the precision of the component saved in the file.
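One detail worth spelling out: with maxval 65535, the product 1048575 * 65535 is about 6.9e10, which does not fit in 32 bits, so the multiplication has to be done using a 64-bit type. A small helper (the name scale_component() is my own, and clamping out-of-range input is an assumption of this sketch) could be

```c
#include <stdint.h>

/* Scale a component read from the file (0..maxval) to 0..1048575.
   maxval must be at least 1. Values above maxval are invalid in
   the file format; this sketch clamps them to maxval. */
uint32_t scale_component(uint32_t file_component, uint32_t maxval)
{
    if (file_component > maxval)
        file_component = maxval;

    /* 64-bit multiplication: 1048575 * 65535 exceeds 32 bits. */
    return (uint32_t)((UINT64_C(1048575) * file_component) / maxval);
}
```

For example, component 128 with maxval 255 yields 134217600 / 255 = 526343 (rounded down), and the maximum value always maps to exactly 1048575.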
In practice, to read a pixel from a P6/PPM file into a 64-bit, 20 bits per component pixel value, you could use e.g.
    uint64_t  pixel;
    uint64_t  red, green, blue;

    if (maxval > 255) {
        /* Two bytes per component, most significant byte first */
        red    = (getc(stream) & 255) << 8;
        red   +=  getc(stream) & 255;
        green  = (getc(stream) & 255) << 8;
        green +=  getc(stream) & 255;
        blue   = (getc(stream) & 255) << 8;
        blue  +=  getc(stream) & 255;
    } else {
        /* One byte per component */
        red   = getc(stream) & 255;
        green = getc(stream) & 255;
        blue  = getc(stream) & 255;
    }

    /* Scale each component to 20 bits, and pack as red:green:blue */
    pixel = ((uint64_t)((1048575 * red) / maxval) << 40)
          | ((uint64_t)((1048575 * green) / maxval) << 20)
          |  (uint64_t)((1048575 * blue) / maxval);
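To get the components back out of such a packed pixel, you shift and mask. The accessor names below are made up for this example; they simply mirror the layout used above (red in bits 40..59, green in bits 20..39, blue in bits 0..19).

```c
#include <stdint.h>

#define COMPONENT_MAX  UINT64_C(1048575)   /* 2^20 - 1 */

/* Pack three 20-bit components into one 64-bit pixel. */
static inline uint64_t pixel_pack(uint64_t red, uint64_t green, uint64_t blue)
{
    return ((red   & COMPONENT_MAX) << 40)
         | ((green & COMPONENT_MAX) << 20)
         |  (blue  & COMPONENT_MAX);
}

/* Extract the individual 20-bit components. */
static inline uint32_t pixel_red(uint64_t pixel)
{
    return (uint32_t)((pixel >> 40) & COMPONENT_MAX);
}

static inline uint32_t pixel_green(uint64_t pixel)
{
    return (uint32_t)((pixel >> 20) & COMPONENT_MAX);
}

static inline uint32_t pixel_blue(uint64_t pixel)
{
    return (uint32_t)(pixel & COMPONENT_MAX);
}
```

The top four bits of each pixel stay unused in this layout.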
In your particular case, this is not really important, and indeed you could just read the entire data (3*width*height chars if maxval<=255, 6*width*height chars if maxval>=256) as is, without conversion.
There is no need to convert the image data to another color model explicitly: you can compute the histograms while you read the file, and adjust the colors when writing the output file.
Histogram equalization is an operation where each color component for each pixel is scaled separately, using a simple function that makes the histograms as flat as possible. You can find more practical examples and explanations (like this PDF) with your favourite search engine.
When you read the red, green, and blue components for a pixel, and scale them to the 0..1048575 range (inclusive), you can calculate the Y/Cb/Cr and H/S/I using the formulae shown on their respective Wikipedia articles, for example. You can do the calculations using integers or floats, but remember that you need to decide the size of your histograms (and therefore eventually convert each component to integer). To avoid quantization error in the color conversions, you should use more bits per component in these "temporary" colorspaces -- say, 24 bits sounds good.
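As an example of such a conversion, here is a sketch that turns 20-bit RGB into 24-bit Y/Cb/Cr. I am assuming the full-range BT.601 coefficients (the ones used by JPEG/JFIF); other variants exist, so pick whichever the formulae you follow specify. The function names are my own.

```c
#include <stdint.h>

#define RGB20_MAX  1048575.0    /* 2^20 - 1 */
#define YCC24_MAX  16777215.0   /* 2^24 - 1 */

/* Clamp v to 0..1, and scale to a 24-bit integer, rounding to nearest. */
static uint32_t to_24bit(double v)
{
    if (v <= 0.0)
        return 0;
    if (v >= 1.0)
        return 16777215;
    return (uint32_t)(v * YCC24_MAX + 0.5);
}

/* Convert 20-bit R, G, B to 24-bit Y, Cb, Cr
   (full-range BT.601; Cb and Cr are offset so that 0.5 means "no color"). */
void rgb20_to_ycbcr24(uint32_t red, uint32_t green, uint32_t blue,
                      uint32_t *y, uint32_t *cb, uint32_t *cr)
{
    const double r = red   / RGB20_MAX;
    const double g = green / RGB20_MAX;
    const double b = blue  / RGB20_MAX;

    *y  = to_24bit( 0.299    * r + 0.587    * g + 0.114    * b);
    *cb = to_24bit(-0.168736 * r - 0.331264 * g + 0.5      * b + 0.5);
    *cr = to_24bit( 0.5      * r - 0.418688 * g - 0.081312 * b + 0.5);
}
```

The 24-bit results can then be used directly as histogram indexes, or shifted right first if you want smaller histograms.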
Whichever colorspace you do use for histogram equalization, you most likely end up converting the histogram into a component mapping; that is, rather than element c[i] describing the number of pixels having this color component of value i, you transform it so that c[i] yields the equalized color component value for original color component value i.
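One common way to do that transformation is via the cumulative distribution of the histogram. The sketch below assumes that classic approach (it is not the only possible one), and the function name is made up; it converts the counts to the mapping in place.

```c
#include <stddef.h>
#include <stdint.h>

/* On entry, c[i] is the number of pixels with component value i.
   On return, c[i] is the equalized value (0..levels-1) for value i.
   Assumes (pixel count) * (levels - 1) fits in 64 bits. */
void histogram_to_mapping(uint64_t *c, size_t levels)
{
    uint64_t total = 0, cdf_min = 0, cdf = 0;
    size_t   i;

    for (i = 0; i < levels; i++)
        total += c[i];

    /* The count in the lowest used bin maps to output value 0. */
    for (i = 0; i < levels; i++)
        if (c[i] > 0) {
            cdf_min = c[i];
            break;
        }

    /* Empty or single-valued histogram: nothing to equalize. */
    if (total == cdf_min) {
        for (i = 0; i < levels; i++)
            c[i] = i;
        return;
    }

    for (i = 0; i < levels; i++) {
        cdf += c[i];
        if (cdf < cdf_min)
            c[i] = 0;   /* Below the lowest used value */
        else
            c[i] = ((cdf - cdf_min) * (levels - 1) + (total - cdf_min) / 2)
                 / (total - cdf_min);   /* Rounded to nearest */
    }
}
```

An already flat histogram yields the identity mapping, while a histogram bunched into a few neighboring values gets stretched to cover the full range.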
When you have the three color component mappings, you can save the output file.
For each pixel, you convert the red, green, and blue components to the colorspace you use for the histogram equalization. You map each of the components separately. Then, you convert the color components back to the RGB model, and finally save the pixel red, green, and blue components.
If the original file used a maxval of 255 or less, save the file using a maxval of 255 (and one char per color component). If the original file used a larger maxval, use a maxval of 65535 (and two chars per color component; most significant byte first). Or, better yet, let the user specify the resulting maxval at run time.
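Writing is the mirror image of reading. The sketch below assumes the 64-bit pixel layout with 20 bits per component described earlier; the function names are hypothetical. Note the rounding when scaling back down: with it, a component read from a file and written back using the same maxval comes out unchanged.

```c
#include <stdio.h>
#include <stdint.h>

/* Scale a 20-bit component (0..1048575) to 0..maxval, rounding to nearest. */
static unsigned int unscale_component(uint64_t component, unsigned int maxval)
{
    return (unsigned int)((component * maxval + 524287) / 1048575);
}

/* Write the P6 header. Returns 0 if successful, -1 on error. */
int write_p6_header(FILE *out, size_t width, size_t height, unsigned int maxval)
{
    return (fprintf(out, "P6\n%zu %zu\n%u\n", width, height, maxval) < 0) ? -1 : 0;
}

/* Write one pixel (red in bits 40..59, green in 20..39, blue in 0..19).
   Returns 0 if successful, -1 on error. */
int write_p6_pixel(FILE *out, uint64_t pixel, unsigned int maxval)
{
    int shift;

    /* Red, green, blue, in that order. */
    for (shift = 40; shift >= 0; shift -= 20) {
        const unsigned int value =
            unscale_component((pixel >> shift) & 1048575, maxval);

        /* Two bytes per component if maxval > 255; high byte first. */
        if (maxval > 255)
            putc((int)(value >> 8), out);
        putc((int)(value & 255), out);
    }

    return ferror(out) ? -1 : 0;
}
```

If the output goes to standard output, remember that on some systems it must be in binary mode for this to work.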
If the input is from a file, you don't even need to remember the pixel data for the image, as you can simply read it twice.
Note, however, that most utilities that process Netpbm files are written to allow easy piping. Indeed, that is the most common type of use that I show my fellow users who need to e.g. manipulate specific colors or gray levels in an image. Because of this, it is typically recommended to keep the pixel data in memory, and write all errors and information to standard error only.
I would estimate that counting in SLOC, your program will mostly consist of the code needed to parse command-line arguments, read the input file, and write the output file. The colorspace conversions are neither difficult nor long, and the histogram stuff is near trivial. (After all, you are just counting how many times a specific color component appears in the image.)
Even so, it is most important that you write your program one step at a time. For one, it limits the region of code you need to inspect when a bug occurs.
Rather than working on a single program, I like to use temporary test programs (some might call these unit tests) to implement each part separately, before combining them into the program proper. In your case, I would definitely write the read-PPM-P6-image and write-PPM-P6-image functions first, and test them, for example by rotating the image 180 degrees (so upper left corner will become the lower right corner), or something similar. When you get it working, and you can open your generated PPM/P6 images in Gimp, Netpbm tools, eog, or whatever applications and utilities you might use, only then progress to the rest of the problem.
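The 180 degree rotation makes a nice test because it is so simple: you just swap each pixel with its counterpart at the opposite end of the image. A sketch, assuming 64-bit pixels and a stride as described earlier (the function name is my own):

```c
#include <stddef.h>
#include <stdint.h>

/* Rotate the image 180 degrees in place, by swapping pixel (x, y)
   with pixel (width-1-x, height-1-y). Padding between rows
   (stride > width) is left untouched. */
void rotate180(uint64_t *pixel, size_t width, size_t height, size_t stride)
{
    if (width < 1 || height < 1)
        return;

    size_t x0 = 0, y0 = 0;                      /* Walks forward  */
    size_t x1 = width - 1, y1 = height - 1;     /* Walks backward */
    size_t n  = (width * height) / 2;           /* Number of swaps */

    while (n-- > 0) {
        const uint64_t temp = pixel[y0 * stride + x0];
        pixel[y0 * stride + x0] = pixel[y1 * stride + x1];
        pixel[y1 * stride + x1] = temp;

        /* Advance the forward cursor, wrapping to the next row. */
        if (++x0 >= width) {
            x0 = 0;
            y0++;
        }

        /* Retreat the backward cursor, wrapping to the previous row. */
        if (x1-- == 0) {
            x1 = width - 1;
            y1--;
        }
    }
}
```

If the rotated output opens correctly in your image viewer, both the reader and the writer handle the header, the component order, and the byte order correctly.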
Also, make your code easy to read. That means consistent indentation. And lots of comments: NOT describing what the code does, but describing what problem the code tries to solve; what task it tries to accomplish.
As it stands, the code shown in your post is a mishmash of stuff. You do not even have a clear question in your 'question'! If you progress step by step, implementing and testing each part separately, and don't let your code become an ugly mess, you'll never end up in that situation. Instead, you can ask intelligent questions like how to best merge your disparate parts, if you get lost. (Often that involves rewriting a part using a different view, different "paradigm", but that is a good thing, because then you learn why different views and different tools are useful in different situations, and how to determine the situation.)