I have two images (small and big). One of them contains the other. For example, one image is a photo and the other is a picture of the photo album page where this
Use ImageMagick.
This page will give you the answer: How can I detect / calculate if a small pictures is present inside a bigger picture?
It is quite easy to do on your own, without relying on external libs other than gd.
What you need to be aware of is that you most likely cannot do a simple pixel-by-pixel equality check, as filtering and compression might slightly modify the value of each pixel.
The code I am proposing here will most likely be slow. If performance is a concern, you could optimize it or take shortcuts. Hopefully, the code puts you on the right track!
First, let's iterate over our pictures:
$small = imagecreatefrompng("small.png");
$large = imagecreatefrompng("large.png");
$smallwidth = imagesx($small);
$smallheight = imagesy($small);
$largewidth = imagesx($large);
$largeheight = imagesy($large);
$foundX = -1;
$foundY = -1;
$keepThreshold = 20;
$potentialPositions = array();
for($x = 0; $x <= $largewidth - $smallwidth; ++$x)
{
    for($y = 0; $y <= $largeheight - $smallheight; ++$y)
    {
        // Scan the whole picture
        $error = GetImageErrorAt($large, $small, $x, $y);
        if($error["avg"] < $keepThreshold)
        {
            array_push($potentialPositions, array("x" => $x, "y" => $y, "error" => $error));
        }
    }
}
imagedestroy($small);
imagedestroy($large);
echo "Found " . count($potentialPositions) . " potential positions\n";
The goal here is to find how similar the pixels are and, if they are somewhat similar, keep the potential position. Here, I iterate over each and every pixel of the large picture; this could be a point of optimization.
Now, where does this error come from?
Getting the likeliness
What I did here is iterate over the small picture and a "window" in the large picture, checking how much difference there was on the red, green and blue channels:
function GetImageErrorAt($haystack, $needle, $startX, $startY)
{
    $error = array("red" => 0, "green" => 0, "blue" => 0, "avg" => 0);
    $needleWidth = imagesx($needle);
    $needleHeight = imagesy($needle);
    for($x = 0; $x < $needleWidth; ++$x)
    {
        for($y = 0; $y < $needleHeight; ++$y)
        {
            // For truecolor images, imagecolorat() packs red in bits 16-23,
            // green in bits 8-15 and blue in bits 0-7
            $nrgb = imagecolorat($needle, $x, $y);
            $hrgb = imagecolorat($haystack, $x + $startX, $y + $startY);
            $nr = ($nrgb >> 16) & 0xFF;
            $hr = ($hrgb >> 16) & 0xFF;
            $error["red"] += abs($hr - $nr);
            $ng = ($nrgb >> 8) & 0xFF;
            $hg = ($hrgb >> 8) & 0xFF;
            $error["green"] += abs($hg - $ng);
            $nb = $nrgb & 0xFF;
            $hb = $hrgb & 0xFF;
            $error["blue"] += abs($hb - $nb);
        }
    }
    // Summed channel difference, averaged per needle pixel
    $error["avg"] = ($error["red"] + $error["green"] + $error["blue"]) / ($needleWidth * $needleHeight);
    return $error;
}
So far, we've established a potential error value for every "window" in the large picture that could contain the small picture, and stored the ones that seem "good enough" in an array.
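One detail worth spelling out is how the channels are packed. For a truecolor image, imagecolorat() returns a single integer with red in bits 16-23, green in bits 8-15 and blue in bits 0-7. The following is only a minimal sketch of that unpacking; `unpackRgb` and `pixelError` are hypothetical helper names, not gd functions, and the sketch assumes truecolor values (palette images return a colour index instead).

```php
<?php
// Hypothetical helper: split a packed truecolor value (as returned by
// imagecolorat() on a truecolor image) into its three 8-bit channels.
function unpackRgb(int $rgb): array
{
    return [
        "red"   => ($rgb >> 16) & 0xFF,
        "green" => ($rgb >> 8) & 0xFF,
        "blue"  => $rgb & 0xFF,
    ];
}

// Hypothetical helper: sum of the absolute per-channel differences
// between two packed pixels.
function pixelError(int $a, int $b): int
{
    $ca = unpackRgb($a);
    $cb = unpackRgb($b);
    return abs($ca["red"] - $cb["red"])
         + abs($ca["green"] - $cb["green"])
         + abs($ca["blue"] - $cb["blue"]);
}
```

Since the three channel errors are simply added together, the average error does not depend on which channel is which, but the individual "red" and "blue" totals do.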
Sorting
Now, we simply need to sort our matches and keep the best one; it is most likely where our small picture is located:
function SortOnAvgError($a, $b)
{
    if($a["error"]["avg"] == $b["error"]["avg"])
    {
        return 0;
    }
    return ($a["error"]["avg"] < $b["error"]["avg"]) ? -1 : 1;
}
if(count($potentialPositions) > 0)
{
    usort($potentialPositions, "SortOnAvgError");
    $mostLikely = $potentialPositions[0];
    echo "Most likely at " . $mostLikely["x"] . "," . $mostLikely["y"];
}
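On PHP 7.4 or later, the same ordering can be written more compactly with the spaceship operator and an arrow function. This is just a sketch of the sorting step; the sample entries below are made up and only mimic the shape of $potentialPositions.

```php
<?php
// Made-up sample data in the same shape as $potentialPositions.
$positions = [
    ["x" => 289, "y" => 235, "error" => ["avg" => 12.5]],
    ["x" => 288, "y" => 235, "error" => ["avg" => 3.2]],
    ["x" => 288, "y" => 236, "error" => ["avg" => 11.9]],
];

// Sort ascending on the average error; <=> returns -1, 0 or 1.
usort($positions, fn($a, $b) => $a["error"]["avg"] <=> $b["error"]["avg"]);

// The entry with the lowest error is now first.
$best = $positions[0];
echo "Most likely at " . $best["x"] . "," . $best["y"] . "\n";
```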
Example
Given the two pictures (small.png and large.png in the code above), you should have the following result:
Found 5 potential positions
Most likely at 288,235
Which corresponds exactly with the position of our duck. The 4 other positions are 1 pixel up, down, left and right.
I am going to edit this entry after I'm done working on some optimizations for you, as this code is way too slow for big images (PHP performed even worse than I expected).
Edit
First, before doing anything to "optimize" the code, we need numbers, so I added
function microtime_float()
{
    list($usec, $sec) = explode(" ", microtime());
    return ((float)$usec + (float)$sec);
}
$time_start = microtime_float();
and
$time_end = microtime_float();
echo "in " . ($time_end - $time_start) . " seconds\n";
at the end to have a specific idea of how much time is taken during the algorithm. This way, I can know if my changes improve or make the code worse. Given that the current code with these pictures takes ~45 minutes to execute, we should be able to improve this time quite a lot.
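As a side note, microtime() accepts a boolean argument and returns a float directly when it is true, so the helper function is not strictly needed. A minimal sketch of the same measurement follows; the loop is only a stand-in workload, and the real script would run the search at that spot.

```php
<?php
// microtime(true) returns the current Unix timestamp in seconds as a float.
$timeStart = microtime(true);

// Stand-in workload (hypothetical); the image search would go here.
$sum = 0;
for ($i = 0; $i < 100000; ++$i) {
    $sum += $i;
}

$elapsed = microtime(true) - $timeStart;
echo "in " . $elapsed . " seconds\n";
```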
One attempt that was not successful was to cache the RGB values from the $needle to try to accelerate the GetImageErrorAt function, but it worsened the time.
Given that the amount of work is the number of window positions multiplied by the number of needle pixels, the more pixels we explore, the longer it will take... so a solution is to skip many pixels to locate our picture as fast as possible, and then zero in on its position more accurately.
I modified the error function to take as a parameter how much to increment x and y:
function GetImageErrorAt($haystack, $needle, $startX, $startY, $increment)
{
    $needleWidth = imagesx($needle);
    $needleHeight = imagesy($needle);
    $error = array("red" => 0, "green" => 0, "blue" => 0, "avg" => 0, "complete" => true);
    for($x = 0; $x < $needleWidth; $x = $x + $increment)
    {
        for($y = 0; $y < $needleHeight; $y = $y + $increment)
        {
            // For truecolor images, imagecolorat() packs red in bits 16-23,
            // green in bits 8-15 and blue in bits 0-7
            $hrgb = imagecolorat($haystack, $x + $startX, $y + $startY);
            $nrgb = imagecolorat($needle, $x, $y);
            $nr = ($nrgb >> 16) & 0xFF;
            $hr = ($hrgb >> 16) & 0xFF;
            $ng = ($nrgb >> 8) & 0xFF;
            $hg = ($hrgb >> 8) & 0xFF;
            $nb = $nrgb & 0xFF;
            $hb = $hrgb & 0xFF;
            $error["red"] += abs($hr - $nr);
            $error["green"] += abs($hg - $ng);
            $error["blue"] += abs($hb - $nb);
        }
    }
    // Note: still divided by the full needle area, so with $increment > 1
    // the average is scaled down compared to checking every pixel
    $error["avg"] = ($error["red"] + $error["green"] + $error["blue"]) / ($needleWidth * $needleHeight);
    return $error;
}
For example, passing 2 will make the function return roughly 4 times faster, as we only look at every other x and every other y value.
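In the function above, the total is still divided by the full needle area, so with an increment above 1 the average shrinks by roughly the square of the increment and the threshold no longer means the same thing. One possible variation, not what the code above does, is to divide by the number of pixels actually compared. The sketch below assumes plain arrays indexed `[y][x]` holding packed 0xRRGGBB integers as a stand-in for gd images; `sampledAverageError` is a hypothetical name.

```php
<?php
// Sketch only: images are plain arrays indexed [y][x] holding packed
// 0xRRGGBB integers, standing in for gd image handles.
function sampledAverageError(array $haystack, array $needle, int $startX, int $startY, int $increment): float
{
    $needleHeight = count($needle);
    $needleWidth = count($needle[0]);
    $total = 0;
    $samples = 0;
    for ($y = 0; $y < $needleHeight; $y += $increment) {
        for ($x = 0; $x < $needleWidth; $x += $increment) {
            $n = $needle[$y][$x];
            $h = $haystack[$y + $startY][$x + $startX];
            $total += abs((($h >> 16) & 0xFF) - (($n >> 16) & 0xFF)); // red
            $total += abs((($h >> 8) & 0xFF) - (($n >> 8) & 0xFF));   // green
            $total += abs(($h & 0xFF) - ($n & 0xFF));                 // blue
            ++$samples;
        }
    }
    // Divide by the pixels actually compared, not by the full needle area.
    return $total / $samples;
}
```

With this normalisation, the same $keepThreshold can be used whatever the increment is, at the cost of one extra counter.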
I also added a stepSize for the main loop:
$stepSize = 10;
for($x = 0; $x <= $largewidth - $smallwidth; $x = $x + $stepSize)
{
    for($y = 0; $y <= $largeheight - $smallheight; $y = $y + $stepSize)
    {
        // Scan the whole picture
        $error = GetImageErrorAt($large, $small, $x, $y, 2);
        if($error["complete"] == true && $error["avg"] < $keepThreshold)
        {
            array_push($potentialPositions, array("x" => $x, "y" => $y, "error" => $error));
        }
    }
}
Doing this, I was able to reduce the execution time from 2657 seconds to 7 seconds, at the price of precision. I increased the keepThreshold to have more "potential results".
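The size of that speed-up is about what the loop counts predict: a stepSize of 10 visits roughly 1/100 of the window positions, and an increment of 2 compares roughly 1/4 of the needle pixels, for an ideal factor of about 400, in the same ballpark as going from 2657 seconds to 7. The sketch below simply counts pixel comparisons; `countComparisons` is a hypothetical helper and the image sizes are made up, since the sizes of the test pictures are not given here.

```php
<?php
// Hypothetical helper: number of pixel comparisons the nested loops perform.
function countComparisons(int $largeW, int $largeH, int $smallW, int $smallH, int $stepSize = 1, int $increment = 1): int
{
    // Window positions visited by the outer loops.
    $positionsX = intdiv($largeW - $smallW, $stepSize) + 1;
    $positionsY = intdiv($largeH - $smallH, $stepSize) + 1;
    // Needle pixels sampled per window (ceiling division).
    $samplesX = intdiv($smallW + $increment - 1, $increment);
    $samplesY = intdiv($smallH + $increment - 1, $increment);
    return $positionsX * $positionsY * $samplesX * $samplesY;
}

// Made-up sizes: a 100x80 needle in a 1000x800 haystack.
$full = countComparisons(1000, 800, 100, 80);
$coarse = countComparisons(1000, 800, 100, 80, 10, 2);
echo $full . " vs " . $coarse . " comparisons, ratio " . round($full / $coarse) . "\n";
```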
Now that I wasn't checking each pixel, my best answer is:
Found 8 potential positions
Most likely at 290,240
As you can see, we're near our desired position, but it's not quite right.
What I'm going to do next is define a rectangle around this "pretty close" position to explore every pixel that the stepSize we added made us skip.
I'm now changing the lower part of the script to:
if(count($potentialPositions) > 0)
{
    usort($potentialPositions, "SortOnAvgError");
    $mostLikely = $potentialPositions[0];
    echo "Most probably around " . $mostLikely["x"] . "," . $mostLikely["y"] . "\n";
    // $small and $large are still needed here, so call imagedestroy() only after this block
    $startX = $mostLikely["x"] - $stepSize + 1; // - $stepSize was already explored
    $startY = $mostLikely["y"] - $stepSize + 1; // - $stepSize was already explored
    $endX = $mostLikely["x"] + $stepSize - 1;
    $endY = $mostLikely["y"] + $stepSize - 1;
    echo "Checking between X " . $startX . " and " . $endX . "\n";
    echo "Checking between Y " . $startY . " and " . $endY . "\n";
    $refinedPositions = array();
    for($x = $startX; $x <= $endX; ++$x)
    {
        for($y = $startY; $y <= $endY; ++$y)
        {
            // Scan only the neighbourhood of the coarse match
            $error = GetImageErrorAt($large, $small, $x, $y, 1); // now check every pixel!
            if($error["avg"] < $keepThreshold) // make the threshold smaller
            {
                array_push($refinedPositions, array("x" => $x, "y" => $y, "error" => $error));
            }
        }
    }
    echo "Found " . count($refinedPositions) . " refined positions\n";
    if(count($refinedPositions))
    {
        usort($refinedPositions, "SortOnAvgError");
        $mostLikely = $refinedPositions[0];
        echo "Most likely at " . $mostLikely["x"] . "," . $mostLikely["y"] . "\n";
    }
}
Which now gives me an output like:
Found 8 potential positions
Most probably around 290,240
Checking between X 281 and 299
Checking between Y 231 and 249
Found 23 refined positions
Most likely at 288,235
in 13.960182189941 seconds
Which is indeed the right answer, roughly 200 times faster than the initial script.
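One thing to watch for: when the coarse match sits near a border, the refinement rectangle can extend past the edges of the large picture, and imagecolorat() would then be asked for pixels that do not exist. A minimal sketch of clamping the rectangle is shown below; `refinementBounds` is a hypothetical helper, and the limits passed in are made up for the example.

```php
<?php
// Hypothetical helper: refinement rectangle around a coarse match, clamped so
// that every window position stays inside the large picture.
function refinementBounds(int $x, int $y, int $stepSize, int $maxX, int $maxY): array
{
    return [
        "startX" => max(0, $x - $stepSize + 1),
        "startY" => max(0, $y - $stepSize + 1),
        "endX"   => min($maxX, $x + $stepSize - 1),
        "endY"   => min($maxY, $y + $stepSize - 1),
    ];
}

// $maxX and $maxY would be $largewidth - $smallwidth and
// $largeheight - $smallheight; the values here are made up.
$bounds = refinementBounds(290, 240, 10, 700, 500);
echo "Checking between X " . $bounds["startX"] . " and " . $bounds["endX"] . "\n";
echo "Checking between Y " . $bounds["startY"] . " and " . $bounds["endY"] . "\n";
```

For a match well inside the picture the clamping changes nothing, so the rectangle is the same as the one computed by the script.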
Edit 2
Now, my test case was a bit too simple... I changed it to a Google image search:
Looking for this picture (it's located at 718,432)
Considering the bigger picture sizes, we can expect a longer processing time, but the algorithm did find the picture at the right position:
Found 123 potential positions
Most probably around 720,430
Found 17 refined positions
Most likely at 718,432
in 43.224536895752 seconds
Edit 3
I decided to try the option I mentioned in the comments, scaling down the pictures before executing the find, and I had great results with it.
I added this code before the first loop:
// intdiv() keeps the sizes as integers even when a dimension is odd
$smallresizedwidth = intdiv($smallwidth, 2);
$smallresizedheight = intdiv($smallheight, 2);
$largeresizedwidth = intdiv($largewidth, 2);
$largeresizedheight = intdiv($largeheight, 2);
$smallresized = imagecreatetruecolor($smallresizedwidth, $smallresizedheight);
$largeresized = imagecreatetruecolor($largeresizedwidth, $largeresizedheight);
imagecopyresized($smallresized, $small, 0, 0, 0, 0, $smallresizedwidth, $smallresizedheight, $smallwidth, $smallheight);
imagecopyresized($largeresized, $large, 0, 0, 0, 0, $largeresizedwidth, $largeresizedheight, $largewidth, $largeheight);
And for the main loop, I iterated on the resized assets with the resized width and height. Then, when adding to the array, I double the x and y, giving the following:
array_push($potentialPositions, array("x" => $x * 2, "y" => $y * 2, "error" => $error));
The rest of the code remains the same, as we want to do the precise location on the real size pictures. All you have to do is add at the end:
imagedestroy($smallresized);
imagedestroy($largeresized);
Using this version of the code, with the google image result, I had:
Found 18 potential positions
Most probably around 720,440
Found 17 refined positions
Most likely at 718,432
in 11.499078989029 seconds
A 4 times performance increase!
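The resizing step can be wrapped in a small function so that other scale factors are easy to try. This is only a sketch under the same approach as above (imagecopyresized into a new truecolor image); `downscale` is a hypothetical helper name and it requires the gd extension when called.

```php
<?php
// Hypothetical helper: return a copy of $src shrunk by an integer $factor.
// The caller is responsible for destroying the returned image.
function downscale($src, int $factor)
{
    $srcW = imagesx($src);
    $srcH = imagesy($src);
    // Integer division keeps the sizes valid for odd dimensions;
    // max() avoids a zero-sized image for very small inputs.
    $dstW = max(1, intdiv($srcW, $factor));
    $dstH = max(1, intdiv($srcH, $factor));
    $dst = imagecreatetruecolor($dstW, $dstH);
    imagecopyresized($dst, $src, 0, 0, 0, 0, $dstW, $dstH, $srcW, $srcH);
    return $dst;
}
```

A position found on images shrunk by `$factor` would then be multiplied by `$factor` before the refinement pass, and the refinement rectangle should be at least `$factor` pixels wider on each side to cover the lost precision.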
Hope this helps