I have a large number of JPEG thumbnail images ranging in size from 120x90 to 320x240, and I would like to classify them as either Real Life-like or Cartoon-like.
How mig
One way to discriminate between cartoon and natural scene images is to compare a given image to its "smoothed" self. The motivation is that smoothing a cartoon image statistically will not change it much, whereas smoothing a natural scene image will. In other words, take an image, cartoonify (i.e. smooth) it, and subtract the result from the original:
isNotACartoonIndex = mean( abs( originalImage - smooth(originalImage) ) )
The mean of this absolute difference gives the level of change caused by the smoothing (the absolute value matters because signed differences would largely cancel out). The index should be high for non-smooth original (natural scene) images and low for smooth original (cartoony) images.
An SO question already discusses how to cartoonify images.
I would suggest doing the smoothing/cartoonifying with bilateral filtering:
Bilateral filtering can be done with OpenCV using the cvSmooth function with the CV_BILATERAL parameter.
As for subtracting the cartoonified image from the original, I would do that with the Hue channel of the HSV images. This means you first need to convert both images from RGB to HSV.
As a side note, trying to achieve this with an ImageMagick workflow might be unnecessarily complicated.
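If you still want a quick experiment without leaving ImageMagick, the index above can be roughly approximated as sketched below. This is not the bilateral-filter/Hue approach described in this answer: it assumes a plain Gaussian blur (-blur) as a stand-in for the smoothing step and compares grayscale intensity rather than Hue. The function name smoothing_index and the default sigma of 2 are made up for illustration.

```shell
# smoothing_index IMAGE [SIGMA]
# Hypothetical helper: prints the mean absolute difference between an image
# and a blurred copy of itself, normalized to the range 0..1.
# Assumption: Gaussian blur stands in for bilateral filtering, and the
# comparison is done on grayscale intensity, not on the Hue channel.
smoothing_index() {
    convert "$1" \( +clone -blur "0x${2:-2}" \) \
        -compose Difference -composite \
        -colorspace Gray -format '%[fx:mean]\n' info:
}

# Usage (requires ImageMagick):
#   smoothing_index picture.jpg
#   smoothing_index picture.jpg 3
```

Higher values should indicate that smoothing changed the image more, i.e. a more natural-looking image; where to put the cutoff would have to be found by trying it on your own thumbnails.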
This is an image-classification problem, which AFAIK ImageMagick will NOT be able to solve.
OpenCV (which deals with computer vision) might be of more help, and can give you some idea of how an "image classifier" is trained with training data.
As a first pass I would try computing the entropy of the color histogram of the image. Cartoon-like images should have fewer shades of different colors, and thus a lower entropy.
This is similar to what NawaMan proposed, but this method goes one step further. The number of colors over the number of pixels may not be enough. There may be jpeg artifacts, for instance, that artificially increase the number of colors in the image, but only for a few pixels. In this case most pixels in the image would still have very few colors, which would correspond to low entropy.
Let's say you start with an RGB image. For each pixel the R, G, and B values range from 0 to 255.
You can divide this range into n bins per channel, where n can be 16, for example, which gives n^3 three-dimensional bins in total. Then you would count how many pixels fall into each one of these 3-dimensional bins. Then you would need to divide the values of the bins by the total number of pixels, so that your histogram sums to 1. Then compute the entropy, which is - sum_i p_i * log(p_i), where p_i is the value of the ith bin (empty bins contribute 0).
Try it with different values for n, and see if you can separate the real images from cartoons.
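The steps above can be sketched with the same ImageMagick histogram output used elsewhere in this thread, plus awk. This is only a sketch under assumptions: the function name entropy_from_histogram is made up, and using -posterize 16 to reduce each channel to 16 levels (so that each distinct color acts as one 3-dimensional bin) is my assumption about a convenient way to do the binning, not something taken from the answer.

```shell
# entropy_from_histogram
# Hypothetical helper: reads ImageMagick histogram lines on stdin, e.g.
#     1234: (  0,  0,  0) #000000 black
# and prints the entropy -sum_i p_i*log(p_i) (natural log) of the bin counts.
entropy_from_histogram() {
    awk '
        # "1234:" + 0 evaluates to 1234; blank lines evaluate to 0
        { count[NR] = $1 + 0; total += count[NR] }
        END {
            if (total == 0) { print "0.00000"; exit }
            h = 0
            for (i = 1; i <= NR; i++) {
                if (count[i] > 0) {          # empty bins contribute nothing
                    p = count[i] / total     # normalize so the bins sum to 1
                    h -= p * log(p)
                }
            }
            printf "%.5f\n", h
        }'
}

# Usage (requires ImageMagick; -posterize 16 is an assumed way to get n = 16):
#   convert picture.jpg -posterize 16 -format %c histogram:info:- | entropy_from_histogram
```

Two equally filled bins give log(2), about 0.69, and a single-color image gives 0, so lower values point towards cartoon-like images.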
I guess your best bet is the ratio between the number of colors in the histogram and the number of pixels. A cartoon-like image tends to have fewer colors than a real-life one.
You can use
COLORS=`convert picture.jpg -format %c histogram:info:- | wc -l`
to count how many colors the picture has. Then use commands like:
WIDTH=`jpeginfo picture.jpg | sed -r "s/.* ([0-9]+) x.*/\1/"`
and
HEIGHT=`jpeginfo picture.jpg | sed -r 's/.*x ([0-9]+) .*/\1/'`
to extract width and height.
Then use this command to find the ratio:
echo $WIDTH $HEIGHT $COLORS | awk '{ print $3/($1 * $2);}'
Then it is up to you to define which ratio qualifies as cartoon-like and which does not. For cartoon-like images, the ratio is mostly lower than for real-life ones.
Just a thought.
EDIT: I just saw your comment that you don't want to know how to do it, just an existing solution. So just ignore my answer then.
EDIT 2: I modified it a bit to make it easier to see.
NOTE 1: Notice that I inverted the ratio: the number of pixels is always much bigger than the number of colors, so the previous command results in a very small number, which is why you can hardly distinguish the images.
NOTE 2: I also changed from "jpeginfo" to "identify", as jpeginfo can only handle JPEG and is not a part of ImageMagick.
~/test/CheckCartoon.sh
#!/bin/sh
# Print the pixels-per-color ratio of one image, zero-padded so it sorts as text.
IMAGE=$1
# One histogram line per distinct color
COLORS=`convert "$IMAGE" -format %c histogram:info:- | wc -l`
WIDTH=`identify "$IMAGE" | sed -r "s/.* ([0-9]+)x[0-9]+ .*/\1/"`
HEIGHT=`identify "$IMAGE" | sed -r 's/.* [0-9]+x([0-9]+) .*/\1/'`
RATIO=`echo $WIDTH $HEIGHT $COLORS | awk '{ print ($1 * $2)/$3;}'`
echo $RATIO | awk '{ printf "%020.5f",$1 }'
~/test/CheckAll.sh
#!/bin/sh
# Run CheckCartoon.sh on every readable image in ./images
cd images
FILES=`ls`
for FILE in $FILES; do
    # Skip files that identify cannot decode
    IsIMAGE=`identify $FILE 2>&1 | grep " no decode delegate " | grep -o "no"`
    if [ "$IsIMAGE" = "no" ]; then continue; fi
    IsIMAGE=`identify $FILE 2>&1 | grep " Improper image header " | grep -o "Improper"`
    if [ "$IsIMAGE" = "Improper" ]; then continue; fi
    echo `../CheckCartoon.sh $FILE` $FILE
done
cd ..
Now, for testing, copy the image files here:
Pic 1: ~/test/images/Cartoon-01.jpg
Pic 2: ~/test/images/Cartoon-02.png
Pic 3: ~/test/images/Cartoon-03.gif
Pic 4: ~/test/images/Real-01.jpg
Pic 5: ~/test/images/Real-02.jpg
Pic 6: ~/test/images/Real-03.jpg
http://dl.getdropbox.com/u/1961549/StackOverflow/SO1518347/Images.png
Then I ran ./CheckAll.sh | sort (in the test folder). Here is what I got.
00000000000003.31362 Real-03.jpg
00000000000004.61574 Real-02.jpg
00000000000009.89920 Cartoon-01.jpg
00000000000013.05870 Real-01.jpg
00000000000020.55470 Cartoon-03.gif
00000000000032.21900 Cartoon-02.png
As you can see, the result is generally good. You can use a number like 15 as the separation point.
Cartoon-01.jpg is a drawing, but it is quite a realistic one, so it may easily be confused. Also, Real-01.jpg is a picture of my girlfriend standing in front of an ocean, so the number of colors is lower than usual. So it comes as no surprise that the confusion happens.
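The cutoff of 15 can be applied to the output of CheckAll.sh with a small filter. A minimal sketch, assuming the "ratio filename" line format shown above; the function name classify_ratios and the labels "cartoon" and "real" are made up for illustration, and file names containing spaces are not handled.

```shell
# classify_ratios [THRESHOLD]
# Hypothetical helper: reads "ratio filename" lines on stdin and prints
# "filename label", where the label is "cartoon" when the pixels-per-color
# ratio is at or above THRESHOLD (default 15) and "real" otherwise.
classify_ratios() {
    awk -v threshold="${1:-15}" '
        NF >= 2 {
            # $1 + 0 turns the zero-padded ratio into a plain number
            label = ($1 + 0 >= threshold) ? "cartoon" : "real"
            print $2, label
        }'
}

# Usage:
#   ./CheckAll.sh | classify_ratios 15
```

With the numbers above, this labels Cartoon-02.png and Cartoon-03.gif as cartoon and everything else as real, so Cartoon-01.jpg is the one misclassified file at this cutoff.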
What I show you here is still a rough theory. If you really want a conclusive indication, you may have to compute a number of metrics and compare them, for example the degree of local contrast.
Hope this helps.