What exactly are the benefits of using a PHP 5 DirectoryIterator over PHP 4 “opendir/readdir/closedir”?

迷失自我 2021-02-04 09:03

What exactly are the benefits of using a PHP 5 DirectoryIterator

$dir = new DirectoryIterator(dirname(__FILE__));
foreach ($dir as $fileinfo) 
{
    // handle wh         


        
4 Answers
  • 2021-02-04 09:55

    A DirectoryIterator provides you with items that make sense on their own. For example, DirectoryIterator::getPathname() returns the entry's path including the directory you iterated over, which is all you need to access the file's contents.

    The information that readdir() provides only makes sense locally, namely in combination with the parameter that you passed to opendir(): it returns just the entry name, without the directory.
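    The difference can be sketched side by side. This is only an illustration: it assumes the script's own directory as the target, and the variable names are made up for the example.

```php
<?php
$base = __DIR__; // assumed target directory, for illustration only

// readdir() yields bare entry names; each one has to be joined with $base
// before it can be handed to any file function.
$viaReaddir = array();
$dh = opendir($base);
while (false !== ($name = readdir($dh))) {
    if ($name !== '.' && $name !== '..') {
        $viaReaddir[] = $base . DIRECTORY_SEPARATOR . $name;
    }
}
closedir($dh);

// DirectoryIterator items already carry the directory part.
$viaIterator = array();
foreach (new DirectoryIterator($base) as $fileinfo) {
    if (!$fileinfo->isDot()) {
        $viaIterator[] = $fileinfo->getPathname();
    }
}

// Sort both lists so the comparison does not depend on listing order.
sort($viaReaddir);
sort($viaIterator);
var_dump($viaReaddir === $viaIterator);
```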

    The DirectoryIterator is implemented in terms of wrappers around the php_stream_* functions, so no fundamentally different performance characteristics are to be expected. In particular, items from the directory are read only when they are requested. Details can be found in the file

    ext/spl/spl_directory.c

    of the PHP source code.
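    From user code, on-demand reading looks roughly like the sketch below. It drives the iterator by hand through the standard Iterator methods and stops after three entries; the limit of three and the target directory are arbitrary choices for the example. Going by the description above, entries past the stopping point are never requested.

```php
<?php
// Walk the iterator manually and stop after the first three non-dot entries.
$it = new DirectoryIterator(__DIR__); // assumed target directory
$firstNames = array();
for ($it->rewind(); $it->valid() && count($firstNames) < 3; $it->next()) {
    if (!$it->isDot()) {
        $firstNames[] = $it->getFilename();
    }
}
print_r($firstNames);
```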

  • 2021-02-04 09:57

    Benefit 1: You can hide away all the boring details.

    When using iterators you generally define them somewhere else, so real-life code would look something more like:

    // ImageFinder is an abstraction over an Iterator
    $images = new ImageFinder($base_directory);
    foreach ($images as $image) {
        // application logic goes here.
    }
    

    The specifics of iterating through directories, sub-directories and filtering out unwanted items are all hidden from the application. That's probably not the interesting part of your application anyway, so it's nice to be able to hide those bits away somewhere else.
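    ImageFinder is not a built-in class, so here is one possible sketch of it. Everything about it is an assumption made for illustration: the class body, the default extension list, and the choice to recurse into sub-directories. It is built from SPL classes that do exist (RecursiveDirectoryIterator, RecursiveIteratorIterator, CallbackFilterIterator) and, because of the return type, needs PHP 7 or later.

```php
<?php
// Hypothetical ImageFinder: one way to hide directory traversal and
// filtering behind a plain foreach.
class ImageFinder implements IteratorAggregate
{
    private $baseDirectory;
    private $extensions;

    public function __construct($baseDirectory, array $extensions = array('jpg', 'jpeg', 'png', 'gif'))
    {
        $this->baseDirectory = $baseDirectory;
        $this->extensions = $extensions;
    }

    public function getIterator(): Traversable
    {
        // Visit every file below the base directory, skipping "." and "..".
        $files = new RecursiveIteratorIterator(
            new RecursiveDirectoryIterator($this->baseDirectory, FilesystemIterator::SKIP_DOTS)
        );
        $extensions = $this->extensions;

        // Keep only regular files whose extension is on the list.
        return new CallbackFilterIterator($files, function (SplFileInfo $file) use ($extensions) {
            return $file->isFile()
                && in_array(strtolower($file->getExtension()), $extensions, true);
        });
    }
}
```

    With this in place, the foreach above receives SplFileInfo objects for image files only, and the calling code never sees the traversal or the filter.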

    Benefit 2: What you do with the result is separated from obtaining the result.

    In the above example, you could swap out that specific iterator for another iterator and you don't have to change what you do with the result at all. This makes the code a bit easier to maintain and add new features to later on.
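    A small sketch of that separation, with a made-up consumer function (total_size is hypothetical, not part of PHP). The consumer only assumes it is given something it can loop over that yields SplFileInfo objects, so the source can be swapped freely.

```php
<?php
// Hypothetical consumer: sums the sizes of whatever files it is handed.
// It knows nothing about where the files come from.
function total_size($files)
{
    $bytes = 0;
    foreach ($files as $file) {
        if ($file->isFile()) {
            $bytes += $file->getSize();
        }
    }
    return $bytes;
}

// Source 1: the immediate children of one directory.
echo total_size(new DirectoryIterator(__DIR__)), "\n";

// Source 2: an explicit, hand-picked list of files.
echo total_size(new ArrayIterator(array(new SplFileInfo(__FILE__)))), "\n";
```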

  • 2021-02-04 10:02

    It's shorter, cleaner and easier to type and read.

    Try rereading your examples. The first one is just “for each in $dir”.

    You write exactly what you want to happen.

  • 2021-02-04 10:03

    To understand the difference between the two, let's write two functions that read the contents of a directory into an array, one using the procedural method and the other the object-oriented one:

    Procedural, using opendir/readdir/closedir

    function list_directory_p($dirpath) {
        if (!is_dir($dirpath) || !is_readable($dirpath)) {
            error_log(__FUNCTION__ . ": Argument should be a path to valid, readable directory (" . var_export($dirpath, true) . " provided)");
            return null;
        }
        $paths = array();
        $dir = realpath($dirpath);
        $dh = opendir($dir);
        while (false !== ($f = readdir($dh))) {
            if ($f !== '.' && $f !== '..') {
                $paths[] = $dir . DIRECTORY_SEPARATOR . $f;
            }
        }
        closedir($dh);
        return $paths;
    }
    

    Object Oriented, using DirectoryIterator

    function list_directory_oo($dirpath) {
        if (!is_dir($dirpath) || !is_readable($dirpath)) {
            error_log(__FUNCTION__ . ": Argument should be a path to valid, readable directory (" . var_export($dirpath, true) . " provided)");
            return null;
        }
        $paths = array();
        $dir = realpath($dirpath);
        $di = new DirectoryIterator($dir);
        foreach ($di as $fileinfo) {
            if (!$fileinfo->isDot()) {
                $paths[] = $fileinfo->getRealPath();
            }
        }
        return $paths;
    }
    

    Performance

    Let's assess their performance first:

    $num_iterations = 1000; // assumed value; the original does not state the iteration count

    $start_t = microtime(true);
    for ($i = 0; $i < $num_iterations; $i++) {
        $paths = list_directory_oo(".");
    }
    $end_t = microtime(true);
    $time_diff_micro = (($end_t - $start_t) * 1000000) / $num_iterations;
    echo "Time taken per call (list_directory_oo) = " . round($time_diff_micro / 1000, 2) . "ms (" . count($paths) . " files)\n";
    
    $start_t = microtime(true);
    for ($i = 0; $i < $num_iterations; $i++) {
        $paths = list_directory_p(".");
    }
    $end_t = microtime(true);
    $time_diff_micro = (($end_t - $start_t) * 1000000) / $num_iterations;
    echo "Time taken per call (list_directory_p) = " . round($time_diff_micro / 1000, 2) . "ms (" . count($paths) . " files)\n";
    

    On my laptop (Win 7 / NTFS), the procedural method seems to be the clear winner:

    C:\code>"C:\Program Files (x86)\PHP\php.exe" list_directory.php
    Time taken per call (list_directory_oo) = 4.46ms (161 files)
    Time taken per call (list_directory_p) = 0.34ms (161 files)
    

    On an entry-level AWS machine (CentOS):

    [~]$ php list_directory.php
    Time taken per call (list_directory_oo) = 0.84ms (203 files)
    Time taken per call (list_directory_p) = 0.36ms (203 files)
    

    The results above are from PHP 5.4. You'll see similar results using PHP 5.3 and 5.2. Results are also similar when PHP is running under Apache or NGINX.

    Code Readability

    Although slower, the code using DirectoryIterator is more readable.

    File reading order

    The order in which directory contents are read is exactly the same with either method. That is, if list_directory_oo returns array('h', 'a', 'g'), list_directory_p also returns array('h', 'a', 'g').

    Extensibility

    The two functions above demonstrate performance and readability. Note that if your code needs to do further operations on each entry, the code using DirectoryIterator is more extensible.

    For example, in the function list_directory_oo above, the $fileinfo object provides a number of methods such as getMTime(), getOwner() and isReadable() (the return values of most of which are cached and do not require system calls).
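    As a sketch of what such an extension might look like, the hypothetical function below (its name and the choice of fields are assumptions for the example) collects a few metadata values per file using only SplFileInfo methods.

```php
<?php
// Hypothetical helper: returns one row of metadata per regular file.
function list_directory_metadata($dirpath)
{
    $rows = array();
    foreach (new DirectoryIterator($dirpath) as $fileinfo) {
        if ($fileinfo->isDot() || !$fileinfo->isFile()) {
            continue;
        }
        // Copy scalar values out: the loop variable is the iterator itself,
        // so storing $fileinfo would leave every row pointing at one object.
        $rows[] = array(
            'name'     => $fileinfo->getFilename(),
            'size'     => $fileinfo->getSize(),
            'mtime'    => $fileinfo->getMTime(),
            'readable' => $fileinfo->isReadable(),
        );
    }
    return $rows;
}

print_r(list_directory_metadata(__DIR__));
```

    The procedural version would need a separate filesize(), filemtime() and is_readable() call on a hand-built path for each entry.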

    Therefore, depending on your use case (that is, what you intend to do with each child element of the input directory), it's possible that code using DirectoryIterator performs as well as, or sometimes better than, code using opendir.

    You can modify the code of list_directory_oo and test it yourself.

    Summary

    The decision of which to use depends entirely on the use case.

    If I were to write a cron job in PHP that recursively scans a directory (and its subdirectories) containing thousands of files and performs certain operations on them, I would choose the procedural method.

    But if my requirement is to write a sort of web-interface to display uploaded files (say in a CMS) and their metadata, I would choose DirectoryIterator.

    You can choose based on your needs.
