PHP array_intersect() efficiency

旧时难觅i 2020-12-03 22:38

Consider the script below: two arrays with only three values each. When I compare these two arrays using array_intersect(), the result is fast.

    

        
5 Answers
  • 2020-12-03 23:08

    The fastest solution I found:

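    // Intersect by value; values must be integers or strings for array_flip.
    function arrayIntersect($arrayOne, $arrayTwo) {
        // Flip both arrays so the values become keys.
        $index = array_flip($arrayOne);
        $second = array_flip($arrayTwo);

        // Intersect on the keys (the original values).
        $x = array_intersect_key($index, $second);

        // Flip back to get key => value pairs from $arrayOne again.
        return array_flip($x);
    }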
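
A small usage sketch may help here. It repeats the function so it runs on its own, and the sample arrays are made up for illustration. It assumes unique integer or string values, since array_flip only accepts those and collapses duplicates.

```php
<?php
// Same function as above, repeated so the example is self-contained.
function arrayIntersect($arrayOne, $arrayTwo) {
    $index = array_flip($arrayOne);
    $second = array_flip($arrayTwo);

    $x = array_intersect_key($index, $second);

    return array_flip($x);
}

// Hypothetical sample data with unique string values.
$a = ['apple', 'banana', 'cherry', 'date'];
$b = ['date', 'banana', 'fig'];

$fast = arrayIntersect($a, $b);
$builtin = array_intersect($a, $b);

// Both keep the keys of the first array: [1 => 'banana', 3 => 'date'].
print_r($fast);
var_dump($fast === $builtin); // bool(true)
```

With duplicate values the two diverge: for `['x', 'x']` and `['x']`, array_intersect keeps both elements, while this function returns only `[1 => 'x']`. That is why the test below runs array_unique first.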
    

    The tests I ran look like this:

    function intersect($arrayOne, $arrayTwo)
    {
        $index = array_flip($arrayOne);
        foreach ($arrayTwo as $value) {
            if (isset($index[$value])) unset($index[$value]);
        }
        foreach ($index as $value => $key) {
            unset($arrayOne[$key]);
        }
    
        return $arrayOne;
    }
    
    function intersect2($arrayOne, $arrayTwo)
    {
        $index = array_flip($arrayOne);
        $second = array_flip($arrayTwo);
    
        $x = array_intersect_key($index, $second);
    
        return array_flip($x);
    
    }
    
    // Build two large arrays of random integers.
    $one = [];
    $two = [];
    for ($i = 0; $i < 1000000; $i++) {
        $one[] = rand(0, 1000000);
        $two[] = rand(0, 100000);
        $two[] = rand(0, 10000);
    }
    
    $one = array_unique($one);
    $two = array_unique($two);
    
    $time_start = microtime(true);
    $res = intersect($one, $two);
    $time = microtime(true) - $time_start;
    
    echo "Sort time $time seconds 'intersect' \n";
    
    
    $time_start = microtime(true);
    $res2 = array_intersect($one, $two);
    $time = microtime(true) - $time_start;
    
    echo "Sort time $time seconds 'array_intersect' \n";
    
    
    $time_start = microtime(true);
    $res3 = intersect2($one, $two);
    $time = microtime(true) - $time_start;
    
    echo "Sort time $time seconds 'intersect2' \n";
    

    Results from PHP 5.6:

    Sort time 0.77021193504333 seconds 'intersect' 
    Sort time 6.9765028953552 seconds 'array_intersect' 
    Sort time 0.4631941318512 seconds 'intersect2'
    
  • 2020-12-03 23:15

    From what you state above, I would recommend implementing a caching mechanism. That way you would offload the DB and speed up your application. I would also recommend profiling array_intersect with increasing amounts of data to see how its performance scales. You could do this by simply wrapping the call in calls that read the system time and calculating the difference, but I would recommend using a real profiler to get good data.
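
The simple timing approach could look roughly like the sketch below. The helper name timeIntersect, the random test data, and the input sizes are all made up for illustration; real measurements should use data shaped like your own.

```php
<?php
// Hypothetical helper: time one array_intersect call on two random arrays of $n values.
function timeIntersect($n) {
    $one = [];
    $two = [];
    for ($i = 0; $i < $n; $i++) {
        $one[] = mt_rand(0, $n);
        $two[] = mt_rand(0, $n);
    }

    // Only the array_intersect call itself is inside the timed region.
    $start = microtime(true);
    array_intersect($one, $two);

    return microtime(true) - $start;
}

// Arbitrary input sizes; watch how the time grows as $n grows.
foreach ([1000, 10000, 100000] as $n) {
    printf("n = %6d: %.4f seconds\n", $n, timeIntersect($n));
}
```

A single run per size is noisy, so repeating each measurement a few times and taking the minimum or median gives a steadier picture.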

  • 2020-12-03 23:17

    Intersection can be implemented by constructing a set of the searched values in the second array, and looking up in a set can be made so fast that it takes essentially constant time on average. Therefore, the runtime of the whole algorithm can be in O(n).

    Alternatively, one can sort the second array (in O(n log n)). Since looking up in a sorted array has a runtime in O(log n), the whole algorithm should then have a runtime in O(n log n).

    According to a (short, unscientific) test I just ran, this seems to be the case for PHP's array_intersect:

    [Figure: Performance of array_intersect]

    Here's the code I used to test it. As you can see, for an input size as small as 1000, you don't need to worry.
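
Separately from that test code, the two strategies described above can be sketched as follows. This is only an illustration, not what array_intersect does internally. The function names intersectWithSet and intersectWithSort and the sample arrays are made up, and integer values are assumed.

```php
<?php
// Strategy 1: build a set from the second array, then do average O(1) lookups.
function intersectWithSet(array $first, array $second) {
    $set = array_flip($second); // values become keys
    $result = [];
    foreach ($first as $value) {
        if (isset($set[$value])) {
            $result[] = $value;
        }
    }
    return $result;
}

// Strategy 2: sort the second array, then binary-search it for each value.
function intersectWithSort(array $first, array $second) {
    sort($second);
    $result = [];
    foreach ($first as $value) {
        $low = 0;
        $high = count($second) - 1;
        while ($low <= $high) {
            $mid = (int) (($low + $high) / 2);
            if ($second[$mid] === $value) {
                $result[] = $value;
                break;
            }
            if ($second[$mid] < $value) {
                $low = $mid + 1;
            } else {
                $high = $mid - 1;
            }
        }
    }
    return $result;
}

// Made-up sample data.
$a = [5, 3, 9, 1];
$b = [9, 4, 5, 7];
print_r(intersectWithSet($a, $b));  // values 5 and 9
print_r(intersectWithSort($a, $b)); // values 5 and 9
```

Unlike array_intersect, both sketches return a re-indexed list rather than preserving the keys of the first array.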

  • 2020-12-03 23:28

    array_intersect sorts the arrays before comparing their values in parallel (see the use of zend_qsort in the source file array.c). This alone takes O(n·log n) for each array. The actual intersection then takes only linear time.

    Depending on the values in your arrays, you could implement this intersection in linear time without the sorting, for example:

    $index = array_flip($arrayOne);
    foreach ($arrayTwo as $value) {
        if (isset($index[$value])) unset($index[$value]);
    }
    foreach ($index as $value => $key) {
        unset($arrayOne[$key]);
    }
    var_dump($arrayOne);
    
  • 2020-12-03 23:34

    I wrote this simple code to compare array_intersect and array_intersect_key:

    $array = array();
    for( $i=0; $i<130000; $i++)
        $array[$i] = $i;
    for( $i=200000; $i<230000; $i++)
        $array[$i] = $i;
    for( $i=300000; $i<340000; $i++)
        $array[$i] = $i;

    $array2 = array();
    for( $i=100000; $i<110000; $i++)
        $array2[$i] = $i;
    for( $i=90000; $i<100000; $i++)
        $array2[$i] = $i;
    for( $i=110000; $i<290000; $i++)
        $array2[$i] = $i;

    echo 'Intersect to arrays -> array1[' . count($array) . '] : array2[' . count($array2) . '] ' . '<br>';
    echo date('Y-m-d H:i:s') . '<br>';
    $time = time();
    $array_r2 = array_intersect_key($array,$array2);
    echo 'Intercept key: ' . (time()-$time) . ' segs<br>';
    $time = time();
    $array_r = array_intersect($array,$array2);
    echo 'Intercept: ' . (time()-$time) . ' segs<br>';
    

    The result:

    Intersect to arrays -> array1[200000] : array2[200000] 
    2014-10-30 08:52:52
    Intercept key: 0 segs
    Intercept: 4 segs
    

    This comparison of the efficiency of array_intersect and array_intersect_key shows that intersecting by keys is much faster.
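
One caveat when reading this comparison: array_intersect_key matches on keys while array_intersect matches on values, so the two are interchangeable only when every key equals its value, as in the test above. A small sketch with made-up sample data shows the difference:

```php
<?php
// Hypothetical sample data where keys and values differ.
$first  = ['a' => 1, 'b' => 2, 'c' => 3];
$second = ['x' => 2, 'a' => 99, 'c' => 3];

// Compares values: 2 and 3 occur in both arrays.
$byValue = array_intersect($first, $second);

// Compares keys: 'a' and 'c' occur in both arrays; values come from $first.
$byKey = array_intersect_key($first, $second);

print_r($byValue); // ['b' => 2, 'c' => 3]
print_r($byKey);   // ['a' => 1, 'c' => 3]
```

When the values are unique integers or strings, array_flip can turn them into keys first so that the key-based function can still be used for a value intersection.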
