Determine position of one image in another with PHP

孤独总比滥情好 2021-01-06 15:07

I have two images (small and big). One of them contains the other. Something like one image is a photo and another one is a picture of the page of the photoalbum where this

2 answers
  • 2021-01-06 15:41

    Use ImageMagick.

    This page will give you the answer: "How can I detect / calculate if a small picture is present inside a bigger picture?"

  • 2021-01-06 15:52

    It is quite easy to do on your own, without relying on external libraries other than GD.

    Be aware that you most likely cannot do a simple pixel-by-pixel equality check, as filtering and compression might slightly modify the value of each pixel.
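
    To illustrate why an exact comparison is too strict, here is a minimal sketch of a tolerance-based check on two packed colors, in the format imagecolorat() returns for truecolor images. The function name colorsRoughlyEqual and the tolerance of 8 are assumptions made for this illustration; they are not part of the code in this answer.

```php
<?php
// Hypothetical helper: true when every channel of two packed 0xRRGGBB
// colors differs by at most $tolerance.
function colorsRoughlyEqual(int $a, int $b, int $tolerance): bool
{
    foreach ([16, 8, 0] as $shift) { // red, green, blue
        $channelA = ($a >> $shift) & 0xFF;
        $channelB = ($b >> $shift) & 0xFF;
        if (abs($channelA - $channelB) > $tolerance) {
            return false;
        }
    }
    return true;
}

var_dump(colorsRoughlyEqual(0xC86432, 0xC66633, 8)); // channels differ by 2, 2 and 1
var_dump(colorsRoughlyEqual(0xC86432, 0xC86400, 8)); // blue differs by 50
```

    The first pair would fail a strict equality test even though the two colors are visually almost identical.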

    The code I am proposing here will most likely be slow. If performance is a concern, you could optimize it or take shortcuts. Hopefully, it puts you on the right track!

    First, let's iterate over our pictures:

    $small = imagecreatefrompng("small.png");
    $large = imagecreatefrompng("large.png");
    
    $smallwidth = imagesx($small);
    $smallheight = imagesy($small);
    
    $largewidth = imagesx($large);
    $largeheight = imagesy($large);
    
    $foundX = -1;
    $foundY = -1;
    
    $keepThreshold = 20;
    
    $potentialPositions = array();
    
    for($x = 0; $x <= $largewidth - $smallwidth; ++$x)
    {
        for($y = 0; $y <= $largeheight - $smallheight; ++$y)
        {
            // Scan the whole picture
            $error = GetImageErrorAt($large, $small, $x, $y);
            if($error["avg"] < $keepThreshold)
            {
                array_push($potentialPositions, array("x" => $x, "y" => $y, "error" => $error));
            }
        }
    }
    
    imagedestroy($small);
    imagedestroy($large);
    
    echo "Found " . count($potentialPositions) . " potential positions\n";
    

    The goal here is to measure how similar the pixels are and, if they are similar enough (average error below $keepThreshold), to keep the position as a candidate. Here, I visit each and every candidate position in the large picture; this could be a point of optimization.

    Now, where does this error come from?

    Measuring the similarity

    What I did here is iterate over the small picture and a "window" of the same size in the large picture, summing the absolute differences on the red, green and blue channels:

    // Assumes truecolor images, where imagecolorat() returns a packed 0xRRGGBB value.
    function GetImageErrorAt($haystack, $needle, $startX, $startY)
    {
        $error = array("red" => 0, "green" => 0, "blue" => 0, "avg" => 0);
        $needleWidth = imagesx($needle);
        $needleHeight = imagesy($needle);
    
        for($x = 0; $x < $needleWidth; ++$x)
        {
            for($y = 0; $y < $needleHeight; ++$y)
            {
                $nrgb = imagecolorat($needle, $x, $y);
                $hrgb = imagecolorat($haystack, $x + $startX, $y + $startY);
    
                // Red is stored in bits 16-23
                $nr = ($nrgb >> 16) & 0xFF;
                $hr = ($hrgb >> 16) & 0xFF;
    
                $error["red"] += abs($hr - $nr);
    
                // Green is stored in bits 8-15
                $ng = ($nrgb >> 8) & 0xFF;
                $hg = ($hrgb >> 8) & 0xFF;
    
                $error["green"] += abs($hg - $ng);
    
                // Blue is stored in bits 0-7
                $nb = $nrgb & 0xFF;
                $hb = $hrgb & 0xFF;
    
                $error["blue"] += abs($hb - $nb);
            }
        }
        // Sum of the three channel errors, averaged per pixel
        $error["avg"] = ($error["red"] + $error["green"] + $error["blue"]) / ($needleWidth * $needleHeight);
        return $error;
    }
    

    So far, we've computed an error value for every "window" in the large picture that could contain the small picture, and stored the positions in an array if they seem "good enough".
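
    The same idea can be tried without any image files. The sketch below is an assumption-laden miniature of the approach: the "images" are plain 2D arrays of grey levels indexed as [y][x], the helper name windowError is hypothetical, and all values are invented. The needle is a copy of the patch at (1,1) with a little noise added.

```php
<?php
// Hypothetical helper: average absolute difference between $needle and the
// window of $haystack whose top-left corner is ($startX, $startY).
function windowError(array $haystack, array $needle, int $startX, int $startY): float
{
    $height = count($needle);
    $width = count($needle[0]);
    $sum = 0;
    for ($y = 0; $y < $height; ++$y) {
        for ($x = 0; $x < $width; ++$x) {
            $sum += abs($haystack[$y + $startY][$x + $startX] - $needle[$y][$x]);
        }
    }
    return $sum / ($width * $height);
}

$haystack = [
    [10, 10, 10, 10],
    [10, 200, 90, 10],
    [10, 60, 250, 10],
    [10, 10, 10, 10],
];
// Same patch as at (1,1), nudged by 1 or 2 levels to mimic compression noise.
$needle = [
    [202, 88],
    [58, 251],
];

// Try every window position and remember the one with the lowest error.
$best = null;
for ($y = 0; $y <= count($haystack) - count($needle); ++$y) {
    for ($x = 0; $x <= count($haystack[0]) - count($needle[0]); ++$x) {
        $err = windowError($haystack, $needle, $x, $y);
        if ($best === null || $err < $best["error"]) {
            $best = ["x" => $x, "y" => $y, "error" => $err];
        }
    }
}
echo "Best match at {$best['x']},{$best['y']} with average error {$best['error']}\n";
```

    Here the noisy needle is still located at 1,1, because its average error (1.75) is far below that of every other window.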

    Sorting

    Now, we simply need to sort our matches by error and keep the best one; it is most likely where our small picture is located:

    function SortOnAvgError($a, $b)
    {
        if($a["error"]["avg"] == $b["error"]["avg"])
        {
            return 0;
        }
        return ($a["error"]["avg"] < $b["error"]["avg"]) ? -1 : 1;
    }
    
    if(count($potentialPositions) > 0)
    {
        usort($potentialPositions, "SortOnAvgError");
        $mostLikely = $potentialPositions[0];
        echo "Most likely at " . $mostLikely["x"] . "," . $mostLikely["y"];
    }
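
    On PHP 7 or later, the same ordering can also be written with the spaceship operator and a closure. This is only a sketch of that alternative; the candidate data below is invented for illustration.

```php
<?php
// Invented sample candidates, shaped like the entries of $potentialPositions.
$candidates = [
    ["x" => 289, "y" => 235, "error" => ["avg" => 12.4]],
    ["x" => 288, "y" => 235, "error" => ["avg" => 0.8]],
    ["x" => 288, "y" => 236, "error" => ["avg" => 11.9]],
];

// Ascending by average error; <=> returns -1, 0 or 1, like SortOnAvgError.
usort($candidates, function (array $a, array $b): int {
    return $a["error"]["avg"] <=> $b["error"]["avg"];
});

$mostLikely = $candidates[0];
echo "Most likely at " . $mostLikely["x"] . "," . $mostLikely["y"] . "\n";
```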
    

    Example

    Given the two following pictures:

    [Image: Large] and [Image: Small]

    You should have the following result:

    Found 5 potential positions
    Most likely at 288,235
    

    Which corresponds exactly with the position of our duck. The 4 other positions are 1 pixel up, down, left and right.

    I am going to edit this entry after I'm done working on some optimizations for you, as this code is way too slow for big images (PHP performed even worse than I expected).

    Edit

    First, before doing anything to "optimize" the code, we need numbers, so I added this at the start of the script:

    function microtime_float()
    {
        // microtime(true) returns the current Unix timestamp in seconds, as a float
        return microtime(true);
    }
    
    $time_start = microtime_float();
    

    and

    $time_end = microtime_float();
    echo "in " . ($time_end - $time_start) . " seconds\n";
    

    at the end, to get a concrete idea of how much time the algorithm takes. This way, I know whether my changes improve the code or make it worse. Given that the current code takes ~45 minutes to execute with these pictures, we should be able to improve this time quite a lot.
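
    If you are on PHP 7.3 or later, hrtime() offers a monotonic clock that is not affected by system clock adjustments. A minimal sketch of a reusable timing wrapper follows; the name timeIt and the dummy workload are assumptions for illustration.

```php
<?php
// Hypothetical wrapper: runs $work and returns [result, elapsed seconds].
function timeIt(callable $work): array
{
    $start = hrtime(true);                     // nanoseconds, monotonic
    $result = $work();
    $elapsed = (hrtime(true) - $start) / 1e9;  // convert to seconds
    return [$result, $elapsed];
}

// Dummy workload standing in for the image search.
[$sum, $seconds] = timeIt(function (): int {
    $total = 0;
    for ($i = 0; $i < 1000; ++$i) {
        $total += $i;
    }
    return $total;
});
echo "Summed to $sum in $seconds seconds\n";
```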

    One attempt that was not successful was caching the RGB values of $needle to speed up the GetImageErrorAt function; it actually worsened the time.

    The running time grows with the number of window positions multiplied by the number of pixels compared per window, so the more pixels we explore, the longer it takes. One solution is to skip many pixels to locate the picture roughly as fast as possible, and then zero in on the exact position.
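
    To get a feel for how much work skipping saves, the amount of work can be estimated up front. The sketch below is only an estimate under stated assumptions: the function name countComparisons is hypothetical, $step is the stride between window positions, $increment is the stride between sampled pixels inside a window, and the image sizes are invented.

```php
<?php
// Hypothetical estimator: number of pixel comparisons for a brute-force scan.
function countComparisons(int $largeW, int $largeH, int $smallW, int $smallH, int $step = 1, int $increment = 1): int
{
    // Window positions visited along each axis
    $positionsX = intdiv($largeW - $smallW, $step) + 1;
    $positionsY = intdiv($largeH - $smallH, $step) + 1;
    // Pixels sampled inside one window along each axis (ceiling division)
    $samplesX = intdiv($smallW + $increment - 1, $increment);
    $samplesY = intdiv($smallH + $increment - 1, $increment);
    return $positionsX * $positionsY * $samplesX * $samplesY;
}

// Invented sizes: an 800x600 haystack and a 100x80 needle (64-bit PHP assumed).
$full = countComparisons(800, 600, 100, 80);
$coarse = countComparisons(800, 600, 100, 80, 10, 2);
echo "Every position, every pixel: $full comparisons\n";
echo "Step 10, increment 2: $coarse comparisons\n";
```

    For these invented sizes the exhaustive scan needs about 2.9 billion comparisons, against about 7.5 million for the coarse one.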

    I modified the error function to take, as a parameter, the increment to apply to x and y:

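    // Assumes truecolor images, where imagecolorat() returns a packed 0xRRGGBB value.
    function GetImageErrorAt($haystack, $needle, $startX, $startY, $increment)
    {
        $needleWidth = imagesx($needle);
        $needleHeight = imagesy($needle);
    
        // "complete" is always true in this version; nothing sets it to false
        $error = array("red" => 0, "green" => 0, "blue" => 0, "avg" => 0, "complete" => true);
    
        // Sample only every $increment-th pixel along each axis
        for($x = 0; $x < $needleWidth; $x = $x + $increment)
        {
            for($y = 0; $y < $needleHeight; $y = $y + $increment)
            {
                $hrgb = imagecolorat($haystack, $x + $startX, $y + $startY);
                $nrgb = imagecolorat($needle, $x, $y);
    
                // Red is stored in bits 16-23
                $nr = ($nrgb >> 16) & 0xFF;
                $hr = ($hrgb >> 16) & 0xFF;
    
                // Green is stored in bits 8-15
                $ng = ($nrgb >> 8) & 0xFF;
                $hg = ($hrgb >> 8) & 0xFF;
    
                // Blue is stored in bits 0-7
                $nb = $nrgb & 0xFF;
                $hb = $hrgb & 0xFF;
    
                $error["red"] += abs($hr - $nr);
                $error["green"] += abs($hg - $ng);
                $error["blue"] += abs($hb - $nb);
            }
        }
    
        // Note: the divisor is the full pixel count, so with $increment > 1
        // the average shrinks by roughly $increment squared.
        $error["avg"] = ($error["red"] + $error["green"] + $error["blue"]) / ($needleWidth * $needleHeight);
    
        return $error;
    }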
    

    For example, passing 2 makes the function roughly 4 times faster, as it samples only every second pixel along both x and y.
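
    One thing to watch when skipping pixels: the function above still divides by the full pixel count, so the average drops as the increment grows and the same threshold becomes more lenient. A possible alternative, sketched here on plain 2D arrays of grey levels instead of GD images, is to divide by the number of pixels actually sampled. The function name sampledError and the data are assumptions for illustration.

```php
<?php
// Hypothetical variant on 2D arrays [y][x]: the average is taken over the
// sampled pixels only, so it stays comparable across increments.
function sampledError(array $haystack, array $needle, int $increment): array
{
    $height = count($needle);
    $width = count($needle[0]);
    $sum = 0;
    $samples = 0;
    for ($y = 0; $y < $height; $y += $increment) {
        for ($x = 0; $x < $width; $x += $increment) {
            $sum += abs($haystack[$y][$x] - $needle[$y][$x]);
            ++$samples;
        }
    }
    return ["samples" => $samples, "avg" => $sum / $samples];
}

// Two invented 4x4 grids that differ by exactly 4 at every pixel.
$needle = array_fill(0, 4, array_fill(0, 4, 100));
$haystack = array_fill(0, 4, array_fill(0, 4, 104));

$every = sampledError($haystack, $needle, 1);
$skipping = sampledError($haystack, $needle, 2);
echo "increment 1: {$every['samples']} samples, avg {$every['avg']}\n";
echo "increment 2: {$skipping['samples']} samples, avg {$skipping['avg']}\n";
```

    Both calls report the same average of 4, with a quarter of the samples in the second case.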

    I also added a stepSize for the main loop:

    $stepSize = 10;
    
    for($x = 0; $x <= $largewidth - $smallwidth; $x = $x + $stepSize)
    {
        for($y = 0; $y <= $largeheight - $smallheight; $y = $y + $stepSize)
        {
            // Scan the whole picture
            $error = GetImageErrorAt($large, $small, $x, $y, 2);
            if($error["complete"] == true && $error["avg"] < $keepThreshold)
            {
                array_push($potentialPositions, array("x" => $x, "y" => $y, "error" => $error));
            }
        }
    }
    

    Doing this, I was able to reduce the execution time from 2657 seconds to 7 seconds, at the price of precision. I increased the keepThreshold to get more "potential results".
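
    Another shortcut worth trying, which I did not measure here, is to give up on a window as soon as its running error proves it cannot pass the threshold. That is one way the "complete" flag could be put to use. The sketch below works on plain 2D arrays of grey levels; the function name errorWithEarlyExit and the data are assumptions for illustration.

```php
<?php
// Hypothetical early-exit variant on 2D arrays [y][x] of grey levels.
// Stops as soon as the running sum shows the average cannot stay below
// $threshold, and reports that through the "complete" flag.
function errorWithEarlyExit(array $haystack, array $needle, int $startX, int $startY, float $threshold): array
{
    $height = count($needle);
    $width = count($needle[0]);
    $budget = $threshold * $width * $height; // sums at or above this fail
    $sum = 0;
    for ($y = 0; $y < $height; ++$y) {
        for ($x = 0; $x < $width; ++$x) {
            $sum += abs($haystack[$y + $startY][$x + $startX] - $needle[$y][$x]);
            if ($sum >= $budget) {
                return ["avg" => null, "complete" => false];
            }
        }
    }
    return ["avg" => $sum / ($width * $height), "complete" => true];
}

$haystack = [
    [0, 0, 0],
    [0, 90, 95],
    [0, 99, 91],
];
$needle = [
    [90, 95],
    [99, 91],
];
var_dump(errorWithEarlyExit($haystack, $needle, 1, 1, 20.0)["complete"]); // exact match
var_dump(errorWithEarlyExit($haystack, $needle, 0, 0, 20.0)["complete"]); // poor match, aborted early
```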

    Now that I am no longer checking every pixel, my best answer is:

    Found 8 potential positions
    Most likely at 290,240
    

    As you can see, we're near our desired position, but it's not quite right.

    What I'm going to do next is define a rectangle around this "pretty close" position and explore every position that the stepSize made us skip.

    I changed the lower part of the script to:

    if(count($potentialPositions) > 0)
    {
        usort($potentialPositions, "SortOnAvgError");
        $mostLikely = $potentialPositions[0];
        echo "Most probably around " . $mostLikely["x"] . "," . $mostLikely["y"] . "\n";
    
        // $small and $large must still be alive here: call imagedestroy() after this block.
        // The neighbours at +/- $stepSize were already explored by the coarse pass.
        // Clamp the window so it never leaves the large picture.
        $startX = max(0, $mostLikely["x"] - $stepSize + 1);
        $startY = max(0, $mostLikely["y"] - $stepSize + 1);
    
        $endX = min($largewidth - $smallwidth, $mostLikely["x"] + $stepSize - 1);
        $endY = min($largeheight - $smallheight, $mostLikely["y"] + $stepSize - 1);
    
        echo "Checking between X " . $startX . " and " . $endX . "\n";
        echo "Checking between Y " . $startY . " and " . $endY . "\n";
    
        $refinedPositions = array();
    
        for($x = $startX; $x <= $endX; ++$x)
        {
            for($y = $startY; $y <= $endY; ++$y)
            {
                // Scan only the neighbourhood of the best coarse match
                $error = GetImageErrorAt($large, $small, $x, $y, 1); // now check every pixel!
                if($error["avg"] < $keepThreshold) // same threshold; lower it to be stricter
                {
                    array_push($refinedPositions, array("x" => $x, "y" => $y, "error" => $error));
                }
            }
        }
    
        echo "Found " . count($refinedPositions) . " refined positions\n";
        if(count($refinedPositions))
        {
            usort($refinedPositions, "SortOnAvgError");
            $mostLikely = $refinedPositions[0];
            echo "Most likely at " . $mostLikely["x"] . "," . $mostLikely["y"] . "\n";
        }
    }
    

    Which now gives me an output like:

    Found 8 potential positions
    Most probably around 290,240
    Checking between X 281 and 299
    Checking between Y 231 and 249
    Found 23 refined positions
    Most likely at 288,235
    in 13.960182189941 seconds
    

    Which is indeed the right answer, roughly 190 times faster than the initial script (2657 seconds down to about 14).

    Edit 2

    Now, my test case was a bit too simple... I changed it to a Google image search:

    [Image: Google Image Search]

    Looking for this picture (it's located at 718,432):

    [Image: The Duck]

    Considering the bigger picture sizes, we can expect a longer processing time, but the algorithm did find the picture at the right position:

    Found 123 potential positions
    Most probably around 720,430
    Found 17 refined positions
    Most likely at 718,432
    in 43.224536895752 seconds
    

    Edit 3

    I decided to try the option I mentioned in the comments, scaling down the pictures before running the search, and I had great results with it.

    I added this code before the first loop:

    // intdiv() keeps the sizes as integers, which imagecreatetruecolor() expects
    $smallresizedwidth = intdiv($smallwidth, 2);
    $smallresizedheight = intdiv($smallheight, 2);
    
    $largeresizedwidth = intdiv($largewidth, 2);
    $largeresizedheight = intdiv($largeheight, 2);
    
    $smallresized = imagecreatetruecolor($smallresizedwidth, $smallresizedheight);
    $largeresized = imagecreatetruecolor($largeresizedwidth, $largeresizedheight);
    
    // Shrink both pictures to half size
    imagecopyresized($smallresized, $small, 0, 0, 0, 0, $smallresizedwidth, $smallresizedheight, $smallwidth, $smallheight);
    imagecopyresized($largeresized, $large, 0, 0, 0, 0, $largeresizedwidth, $largeresizedheight, $largewidth, $largeheight);
    

    For the main loop, I iterated over the resized pictures with the resized width and height. Then, when adding to the array, I doubled the x and y to get back to full-size coordinates, giving the following:

    array_push($potentialPositions, array("x" => $x * 2, "y" => $y * 2, "error" => $error));
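
    The doubling generalizes to any shrink factor. Here is a minimal sketch; the helper name toFullSize and the candidate values are invented for illustration. Keep in mind that a position found on a shrunk picture is only approximate, which is why the refinement pass on the real-size pictures is still needed.

```php
<?php
// Hypothetical helper: map a candidate found on a picture shrunk by $factor
// back to full-size coordinates.
function toFullSize(array $candidate, int $factor): array
{
    $candidate["x"] *= $factor;
    $candidate["y"] *= $factor;
    return $candidate;
}

// Invented candidate found on the half-size pictures.
$coarse = ["x" => 360, "y" => 220, "error" => ["avg" => 3.5]];
$full = toFullSize($coarse, 2);
echo "Refine around " . $full["x"] . "," . $full["y"] . "\n";
```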
    

    The rest of the code remains the same, as we want to do the precise location on the real size pictures. All you have to do is add at the end:

    imagedestroy($smallresized);
    imagedestroy($largeresized);
    

    Using this version of the code, with the Google image search result, I had:

    Found 18 potential positions
    Most probably around 720,440
    Found 17 refined positions
    Most likely at 718,432
    in 11.499078989029 seconds
    

    Nearly a 4 times performance increase (43.2 seconds down to 11.5)!

    Hope this helps
