PHP creating a multidimensional array of message threads from a multidimensional array (IMAP)

刺人心 2021-01-19 03:46

My question is the following:

If you look below, you'll see a data structure with message IDs, followed by the final data structure containing the message details.

4 Answers
  •  孤街浪徒
    2021-01-19 04:02

    Remember that in PHP, whatever function you use will ultimately be executed as some kind of loop. There are, however, steps you can take to make it more efficient, and they differ between PHP 5.5 and PHP 5.3/5.4.

    PHP 5.3/5.4 way

    The most efficient way of doing this is to split the work into two separate steps. In the first step, you generate a map from each message number to its key in the list of emails.

    $keys = array();
    foreach($emails as $k => $email)
    {
        $keys[$email->msgno] = $k;
    }
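To make the first step concrete, here is a minimal sketch of what the map looks like. The sample `$emails` data is hypothetical (objects with `msgno` and `subject` properties, loosely shaped like IMAP overview entries); only the loop itself comes from the answer.

```php
<?php
// Hypothetical sample data, not taken from the original question.
$emails = array(
    (object) array('msgno' => 12, 'subject' => 'Hello'),
    (object) array('msgno' => 7,  'subject' => 'Re: Hello'),
    (object) array('msgno' => 30, 'subject' => 'Invoice'),
);

// Build the lookup map once: msgno => position in $emails.
$keys = array();
foreach ($emails as $k => $email) {
    $keys[$email->msgno] = $k;
}

// Finding an email by msgno is now a direct array access
// rather than a scan over the whole $emails list.
$subject = $emails[$keys[7]]->subject;
```

With this data, `$keys` maps 12 to 0, 7 to 1 and 30 to 2, so the lookup for message number 7 lands on the second email.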
    

    In the second step, you iterate over all values in the multidimensional $threads array and replace each one with the email details:

    // Iterate threads
    $threads = array_map(function($thread) use($emails, $keys)
    {
        // Iterate emails in these threads
        return array_map(function($msgno) use($emails, $keys)
        {
            // Swap the msgno with the email details
            return $emails[$keys[$msgno]];
    
        }, $thread);
    
    }, $threads);
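Both steps can be run end to end as follows. This is only a sketch under assumptions: the sample emails and threads are invented for illustration, and `$subjects` is an extra variable added here just to make the result easy to inspect.

```php
<?php
// Hypothetical sample data.
$emails = array(
    (object) array('msgno' => 1, 'subject' => 'Lunch?'),
    (object) array('msgno' => 2, 'subject' => 'Re: Lunch?'),
    (object) array('msgno' => 3, 'subject' => 'Report'),
);
$threads = array(array(1, 2), array(3));

// Step 1: map msgno => key in $emails.
$keys = array();
foreach ($emails as $k => $email) {
    $keys[$email->msgno] = $k;
}

// Step 2: swap every msgno in every thread for the email object.
$threads = array_map(function ($thread) use ($emails, $keys) {
    return array_map(function ($msgno) use ($emails, $keys) {
        return $emails[$keys[$msgno]];
    }, $thread);
}, $threads);

// For inspection only: reduce each email to its subject line.
$subjects = array_map(function ($thread) {
    return array_map(function ($email) {
        return $email->subject;
    }, $thread);
}, $threads);
```

The outer closure has to import `$emails` and `$keys` as well, because the inner closure can only import variables from the scope it is defined in.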
    

    Proof of concept: http://pastebin.com/rp5QFN4J

    Explanation of the use keyword in anonymous functions:

    To use variables defined in the parent scope, a closure can import them into its own scope with the use keyword. Variables are imported by value unless they are prefixed with &. The construct was introduced in PHP 5.3, and the original RFC describes it here: https://wiki.php.net/rfc/closures#userland_perspective
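A small sketch of how importing variables into a closure behaves. The variable names are made up for illustration; the point is the difference between importing by value and importing by reference.

```php
<?php
$prefix = 'Re: ';

// $prefix is copied into the closure at the moment the closure is defined.
$addPrefix = function ($subject) use ($prefix) {
    return $prefix . $subject;
};

$prefix = 'Fwd: ';               // Changing it afterwards does not affect $addPrefix.
$byValue = $addPrefix('Hello');

// Importing by reference (&) lets the closure modify the outer variable.
$count = 0;
$counter = function () use (&$count) {
    $count++;
};
$counter();
$counter();
```

Here `$byValue` still carries the original prefix, while `$count` reflects both calls because it was imported by reference.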

    PHP 5.5

    One of the new features in this version is generators, which have a significantly smaller memory footprint and can therefore be more efficient.

    Explanation of the yield keyword in generators:

    The heart of a generator function is the yield keyword. In its simplest form, a yield statement looks much like a return statement, except that instead of stopping execution of the function and returning, yield instead provides a value to the code looping over the generator and pauses execution of the generator function.
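The pause-and-resume behaviour is easiest to see with a log. The sketch below uses a hypothetical `numbers()` generator (not part of the answer) that records when it produces a value, while the calling loop records when it consumes one. It requires PHP 5.5 or later.

```php
<?php
// Hypothetical generator: logs each value just before yielding it.
function numbers($limit, array &$log)
{
    for ($i = 1; $i <= $limit; $i++) {
        $log[] = "produce $i";
        yield $i;               // Hands $i to the caller and pauses here.
    }
}

$log = array();
foreach (numbers(2, $log) as $n) {
    $log[] = "consume $n";      // Runs while the generator is paused.
}
```

The log ends up interleaved (produce 1, consume 1, produce 2, consume 2), which shows that values are created one at a time on demand rather than all up front.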

    1st step:

    function generateKeyMap($emails)
    {
        foreach($emails as $k => $email)
        {
            // Yielding key => value pair to result set
            yield $email->msgno => $k;
        }
    }
    $keys = iterator_to_array(generateKeyMap($emails));
    

    2nd step:

    function updateThreads($emails, $threads, $keys)
    {
        foreach($threads as $thread)
        {
            $array = array();
    
            // Create a set of detailed emails
            foreach($thread as $msgno)
            {
                $array[] = $emails[$keys[$msgno]];
            }
    
            // Yielding array to result set
            yield $array;
        }
    }
    $threads = iterator_to_array(updateThreads($emails, $threads, $keys));
    

    A few words about the values returned by generators:

    A generator function returns a Generator object, which implements the Iterator interface, so iterator_to_array() is needed to convert it into exactly the array structure your code expects. You could skip the conversion, but that would mean updating the code that consumes the result so that it iterates over the generator instead, which could be even more efficient.
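As a rough illustration of skipping the conversion, the consuming code can loop over the generator directly. `updateThreads()` is the function from the second step; the sample data and the `$sizes` variable are assumptions added for this sketch. It requires PHP 5.5 or later.

```php
<?php
function updateThreads($emails, $threads, $keys)
{
    foreach ($threads as $thread) {
        $array = array();
        foreach ($thread as $msgno) {
            $array[] = $emails[$keys[$msgno]];
        }
        yield $array;
    }
}

// Hypothetical sample data.
$emails = array(
    (object) array('msgno' => 1, 'subject' => 'Lunch?'),
    (object) array('msgno' => 2, 'subject' => 'Re: Lunch?'),
    (object) array('msgno' => 3, 'subject' => 'Report'),
);
$threads = array(array(1, 2), array(3));
$keys = array(1 => 0, 2 => 1, 3 => 2);

// Each detailed thread is built only when the loop asks for it,
// so the full result array never has to exist in memory at once.
$sizes = array();
foreach (updateThreads($emails, $threads, $keys) as $i => $thread) {
    $sizes[$i] = count($thread);
}
```

One trade-off to keep in mind: a generator can only be traversed once, so code that needs to loop over the threads several times is better served by the iterator_to_array() version.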

    Proof of concept: http://pastebin.com/9Z4pftBH

    Testing Performance:

    I generated a list of 7000 threads with 5 messages each and tested the performance of each method (average of 5 runs):

                       Takes:       Memory used:
                       ----------------------------
    3x foreach():      2.8s              5.2 MB
    PHP 5.3/5.4 way    0.061s            2.7 MB
    PHP 5.5 way        0.036s            2.7 MB
    

    Results on your machine or server may differ, but the overview shows that the two-step method is roughly 45 to 77 times faster than using 3 foreach loops.
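If you want to take similar measurements yourself, a timing harness could look roughly like this. It is not the author's test script (that is linked below); `measure()` is a hypothetical helper, and the generated data only mimics the 7000 threads of 5 messages described above. Absolute numbers will vary by machine and PHP version.

```php
<?php
// Hypothetical helper: runs $fn once and reports elapsed time and peak memory.
function measure($fn)
{
    $start = microtime(true);
    $result = $fn();
    return array(
        'seconds'    => microtime(true) - $start,
        'peak_bytes' => memory_get_peak_usage(),
        'result'     => $result,
    );
}

// Generated test data: 35000 emails grouped into 7000 threads of 5.
$emails = array();
for ($msgno = 1; $msgno <= 35000; $msgno++) {
    $emails[] = (object) array('msgno' => $msgno, 'subject' => "Message $msgno");
}
$threads = array_chunk(range(1, 35000), 5);

// Time the two-step (key map + array_map) method.
$stats = measure(function () use ($emails, $threads) {
    $keys = array();
    foreach ($emails as $k => $email) {
        $keys[$email->msgno] = $k;
    }
    return array_map(function ($thread) use ($emails, $keys) {
        return array_map(function ($msgno) use ($emails, $keys) {
            return $emails[$keys[$msgno]];
        }, $thread);
    }, $threads);
});

printf("%.4fs, peak %.1f MB\n", $stats['seconds'], $stats['peak_bytes'] / 1048576);
```

Running each method several times and averaging, as the answer does, smooths out noise from a single run.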

    Test script: http://pastebin.com/M40hf0x7
