So I\'m designing a few programs for editing photos in python
using PIL
and one of them was converting an image to greyscale (I\'m avoiding the use of
The most obvious example:
Original
Desaturated in Gimp (Lightness mode - this is what your algorithm does)
Desaturated in Gimp (Luminosity mode - this is what our eyes do)
So, don't average RGB. Averaging RGB is simply wrong!
(Okay, you're right, averaging might be valid in some obscure applications, even though it has no physical or physiological meaning when RGB values are treated as color. By the way, the "regular" way of doing weighted averaging is also incorrect in a more subtle way because of gamma. sRGB should be first linearized and then the final result converted back to sRGB (which would be equivalent of retrieving the L component in the Lab color space))
There are many different methods for converting to greyscale, and they do give different results though the differences might be easier to see with different input colour images.
As we don't really see in greyscale, the "best" method is somewhat dependent on the application and somewhat in the eye of the beholder.
The alternative formula you refer to is based on the human eye being more sensitive to variations in green tones and therefore giving them a bigger weighting - similarly to a Bayer array in a camera where there are 2 green pixels for each red and blue one. Wiki - Bayer array
The answers provided are enough, but I want to discuss a bit more on this topic in a different manner.
Since I learnt digital painting for interest, more often I use HSV.
It is much more controllable for using HSV during painting, but keep it short, the main point is the S: Saturation separating the concept of color from the light. And turning S to 0, is already the 'computer' grey scale of image.
from PIL import Image
import colorsys
def togrey(img):
if isinstance(img,Image.Image):
r,g,b = img.split()
R = []
G = []
B = []
for rd,gn,bl in zip(r.getdata(),g.getdata(),b.getdata()) :
h,s,v = colorsys.rgb_to_hsv(rd/255.,gn/255.,bl/255.)
s = 0
_r,_g,_b = colorsys.hsv_to_rgb(h,s,v)
R.append(int(_r*255.))
G.append(int(_g*255.))
B.append(int(_b*255.))
r.putdata(R)
g.putdata(G)
b.putdata(B)
return Image.merge('RGB',(r,g,b))
else:
return None
a = Image.open('../a.jpg')
b = togrey(a)
b.save('../b.jpg')
This method truly reserved the 'bright' of original color. However, without considering how human eye process the data.
You can use any conversion equation, scale, linearity. The one you found:
I = 0.299 R + 0.587 G + 0.114 B
is based on average human eye "average" primary color (R,G,B) perception sensitivity (at least for the time period and population/HW it was created on; bear in mind those standards were created before LED,TFT, etc. screens).
There are several problems you are fighting against:
our eyes are not the same
All humans do not perceive color the same way. There are major discrepancies between genders and smaller also between regions; even generation and age play a role. So even an average should be handled as "average".
We have different sensitivity to intensity of light across the visible spectrum. The most sensitive color is green (hence the highest weight on it). But the XYZ curve peaks can be at different wavelengths for different people (like me I got them shifted a bit causing difference in recognition of certain wavelengths like some shades of Aqua - some see them as green some as blue even if none of them have any color blindness disabilities or whatever).
monitors do not use the same wavelengths nor spectral dispersion
So if you take 2 different monitors, they might use slightly different wavelengths for R, G, B or even different widths of the spectral filter (just use a spectroscope and see). Yes they should be "normalized" by the HW but that is not the same as using normalized wavelengths. It is similar to problems using RGB vs. White Noise spectrum light sources.
monitor linearity
Humans do not see on a linear scale: we are usually logarithmic/exponential (depends how you look at it) so yes we can normalize that with HW (or even SW) but the problem is if we linearize for one human then means we damage it for another.
If you take all this together you can either use averages ... or special (and expensive) equipment to measure/normalize against some standard or against a calibrated person (depends on the industry).
But that is too much to handle in home conditions so leave all that for industry and use the weights for "average" like most of the world... Luckily our brain can handle it as you cannot see the difference unless you start comparing both images side by side or in an animation :). So I (would) do:
I = 0.299 R + 0.587 G + 0.114 B
R = I
G = I
B = I
The images look pretty similar, but your eye can tell the difference, specially if you put one in place of the other:
For example, you can note that the flowers in the background look brighter in the averaging conversion.
It is not that there is anything intrinsically "bad" about averaging the three channels. The reason for that formula is that we do not perceive red, green and blue equally, so their contributions to the intensities in a grayscale image shouldn't be the same; since we perceive green more intensely, green pixels should look brighter on grayscale. However, as commented by Mark there is no unique perfect conversion to grayscale, since we see in color, and in any case everyone's vision is slightly different, so any formula will just try to make an approximation so pixel intensities feel "right" for most people.
There are many formulas for the Luminance, depending on the R,G,B color primaries:
Rec.601/NTSC: Y = 0.299*R + 0.587*G + 0.114*B ,
Rec.709/EBU: Y = 0.213*R + 0.715*G + 0.072*B ,
Rec.2020/UHD: Y = 0.263*R + 0.678*G + 0.059*B .
This is all because our eyes are less sensitive to blue than to red than to green.
That being said, you are probably calculating Luma, not Luminance, so the formulas are all wrong anyway. For Constant-Luminance you must convert to linear-light
R = R' ^ 2.4 , G = G' ^ 2.4 , B = B' ^ 2.4 ,
apply the Luminance formula, and convert back to the gamma domain
Y' = Y ^ (1/2.4) .
Also, consider that converting a 3D color space to a 1D quantity loses 2/3 of the information, which can bite you in the next processing steps. Depending on the problem, sometimes a different formula is better, like V = MAX(R,G,B) (from HSV color space).
How do I know? I'm a follower and friend of Dr. Poynton.