Traversing directories without using recursion?

北城余情 提交于 2020-11-29 04:44:59

问题


The Problem I need to write a simple software that, giving certain constraints, appends to a list a series of files. The user could choose between two "types" of directory: one with a * wildcard meaning it should also explore subdirectories and the classic one without wildcards that just get files present in that directory.

What I'm doing

Right now I'm doing the stupidest thing:

import java.io.File;

public class Eseguibile {

    private static void displayIt(File node){

        System.out.println(node.getAbsoluteFile());

        if(node.isDirectory()){
            String[] subNote = node.list();
            for(String filename : subNote){
                displayIt(new File(node, filename));
            }
        }
    }

    public static void main(String[] args){
        System.out.println("ciao");

        displayIt( new File("/home/dierre/") );

    }

}

I do not need to build a tree because I just need the files list so I was thinking maybe there's a more efficient way to do it.

I was reading about the TreeModel but, as I understand it, it's just an interface to implement a Jtree.


回答1:


Right now I'm doing the stupidest thing ...

Recursion is neither "stupid" or necessarily inefficient. Indeed in this particular case, a recursive solution is likely to be more efficient than a non-recursive one. And of course, the recursive solution is easier to code and to understand than the alternatives.

The only potential problem with recursion is that you could overflow the stack if the directory tree is pathologically deep.

If you really want to avoid recursion, then the natural way to do it is to use a "stack of list of File" data structure. Each place where you would have recursed, you push the list containing the current directory's (remaining) File objects onto the stack, read the new directory and start working on them. Then when you are finished, pop the stack and continue with the parent directory. This will give you a depth-first traversal. If you want a breadth-first traversal, use a "queue of File" data structure instead of a stack.




回答2:


My iterative solution:

  ArrayDeque<File> stack = new ArrayDeque<File>();

    stack.push(new File("<path>"));

    int n = 0;

    while(!stack.isEmpty()){

        n++;
        File file = stack.pop();

        System.err.println(file);

        File[] files = file.listFiles();

        for(File f: files){

            if(f.isHidden()) continue;

            if(f.isDirectory()){
                stack.push(f);
                continue;
            }

            n++;
            System.out.println(f);


        }

    }

    System.out.println(n);



回答3:


@Stephen C: Below as per your request, my benchmark code that I talked about in the comments (C# - not Java).
Note that it should use stopwatch instead of datetime for better accuracy, but otherwise it's fine.
I didn't test if the iteration delivers the same number files as the recursion, but it should.

Actually, if you pay attention to the median, you'll notice that this already starts showing with only very few files. (my Desktop folder contains 2210 files, 415 folders, 3.2 GB total, most of it large files in the download folder, AppData, and a higher number of files due to one larger C# project [mail-server] on my desktop).

To get the numbers I talked about in the comment, install cygwin (with everything [that's about 100GB, I think] ), and index the cygwin folder.

Speed comparison

As mentioned in the comments, it is not entirely correct to say it doesn't matter.

While for a small directory tree, recursion is negligibly more efficient than iteration (in the order of several 10s of milliseconds), for a very large tree, recursion is minutes (and therefore noticeably) slower than iteration. It isn't all to difficult to grasp why either. If you have to allocate and return a new set of stack variables, each time, call a function, and store all previous results until you return, you're of course slower than when you initiate a a stack-structure on the heap once, and use that for each iteration.

The tree doesn't need to be pathologically deep for this effect to be noticed (while slow speed isn't a stack overflow, its very negative consequences are not much different from a StackOverflow-Bug). Also I wouldn't call having a lot of files "pathological", because if you do an index on your main drive, you'll naturally have a lot of files. Have some HTML documentation, and the number of files explodes. You'll find that on a whole lot of files, an iteration completes in less than 30 seconds, while a recursion needs appx. 3 minutes.

using System;
using System.Data;
using System.Linq;


namespace IterativeDirectoryCSharp
{


    public class SearchStrategy
    {


        //Iterative File and Folder Listing in VB.NET
        public static bool IterativeSearch2(string strPath)
        {
            System.IO.DirectoryInfo dirInfo = new System.IO.DirectoryInfo(strPath);
            System.IO.FileSystemInfo[] arrfsiEntities = null;
            arrfsiEntities = dirInfo.GetFileSystemInfos();


            // Creates and initializes a new Stack.
            System.Collections.Stack strLastPathStack = new System.Collections.Stack();
            System.Collections.Stack iIndexStack = new System.Collections.Stack();

            int iIndex = 0;
            int iMaxEntities = arrfsiEntities.Length;
            do
            {
                while (iIndex < iMaxEntities)
                {

                    if (arrfsiEntities[iIndex].Attributes == System.IO.FileAttributes.Directory)
                    {
                        //Console.WriteLine("Searching directory " + arrfsiEntities[iIndex].FullName);

                        strLastPathStack.Push(System.IO.Directory.GetParent(arrfsiEntities[iIndex].FullName).FullName);
                        strLastPathStack.Push(arrfsiEntities[iIndex].FullName);
                        iIndexStack.Push(iIndex);

                        dirInfo = null;
                        Array.Clear(arrfsiEntities, 0, arrfsiEntities.Length);
                        dirInfo = new System.IO.DirectoryInfo(strLastPathStack.Pop().ToString());
                        arrfsiEntities = dirInfo.GetFileSystemInfos();

                        iIndex = 0;
                        iMaxEntities = arrfsiEntities.Length;
                        continue;
                    }
                    else
                    {
                        //Console.WriteLine(arrfsiEntities[iIndex].FullName);
                    }

                    iIndex += 1;
                } // Whend


                dirInfo = null;
                Array.Clear(arrfsiEntities, 0, arrfsiEntities.Length);
                // Dont try to do move the next line in the loop while/until statement, null reference when pop on an empty stack...
                if (strLastPathStack.Count == 0) 
                    break;

                dirInfo = new System.IO.DirectoryInfo(strLastPathStack.Pop().ToString());
                arrfsiEntities = dirInfo.GetFileSystemInfos();

                iIndex = (int)iIndexStack.Pop() + 1;
                iMaxEntities = arrfsiEntities.Length;
            } // End do
            while (true);

            return true;
        } // End Function IterativeSearch2


        public static bool IterativeSearch1(string path)
        {

            System.IO.DirectoryInfo dirInfo = new System.IO.DirectoryInfo(path);
            System.IO.FileSystemInfo[] arrfsiEntities = null;
            arrfsiEntities = dirInfo.GetFileSystemInfos();


            // Creates and initializes a new Stack.
            System.Collections.Stack myStack = new System.Collections.Stack();
            //Console.WriteLine("Stack is empty when \t stack.count={0}", myStack.Count);


            int iIndex = 0;
            int iMaxEntities = arrfsiEntities.Length - 1;

            do
            {
                for (iIndex = 0; iIndex <= iMaxEntities; iIndex += 1)
                {
                    if (arrfsiEntities[iIndex].Attributes == System.IO.FileAttributes.Directory)
                    {
                        //Console.WriteLine("Searching directory " + arrfsiEntities[iIndex].FullName);
                        myStack.Push(arrfsiEntities[iIndex].FullName);
                    }
                    else
                    {
                        //Console.WriteLine("{0}", arrfsiEntities[iIndex].FullName);
                    }
                } // Next iIndex

                dirInfo = null;
                Array.Clear(arrfsiEntities, 0, arrfsiEntities.Length);
                // Dont try to do move the next line in the loop while/until statement, null reference when pop on an empty stack...
                if (myStack.Count == 0) 
                    break;

                dirInfo = new System.IO.DirectoryInfo(myStack.Pop().ToString());
                arrfsiEntities = dirInfo.GetFileSystemInfos();

                iIndex = 0;
                iMaxEntities = arrfsiEntities.Length - 1;
            }
            while (true);

            return true;
        } // End Function IterativeSearch1


        //Recursive File and Folder Listing VB.NET
        public static bool RecursiveSearch(string path)
        {
            System.IO.DirectoryInfo dirInfo = new System.IO.DirectoryInfo(path);
            //System.IO.FileSystemInfo fileObject = default(System.IO.FileSystemInfo);
            foreach (System.IO.FileSystemInfo fsiThisEntityInfo in dirInfo.GetFileSystemInfos())
            {

                if (fsiThisEntityInfo.Attributes == System.IO.FileAttributes.Directory)
                {
                    //Console.WriteLine("Searching directory " + fsiThisEntityInfo.FullName);
                    RecursiveSearch(fsiThisEntityInfo.FullName);
                }
                else
                {
                    //Console.WriteLine(fsiThisEntityInfo.FullName);
                }

            } // Next fiThisFileInfo

            return true;
        } // End Function RecursiveSearch


        // http://forums.asp.net/p/1414929/3116196.aspx
        public class TimeFrame
        {
            public DateTime dtStartTime;
            public DateTime dtEndTime;

            public TimeFrame(DateTime dtStart, DateTime dtEnd)
            {
                this.dtStartTime = dtStart;
                this.dtEndTime = dtEnd;
            } // End Constructor

        } // End Class TimeFrame



        // Small amount of files 
        //          Iter1       Iter2       Recurs.
        // Median   1206.231    3910.367    1232.4079
        // Average  1216.431647 3940.147975 1239.092354
        // Minimum  1172.5827   3832.0984   1201.2308
        // Maximum  1393.4091   4400.4237   1440.3386

        public static System.Data.DataTable TestStrategies(string strDirectoryToSearch)
        {
            System.Data.DataTable dt = new System.Data.DataTable();

            System.Collections.Generic.Dictionary<int, string> dictResults = new System.Collections.Generic.Dictionary<int, string>();


            dt.Columns.Add("TestRun", typeof(string));
            dt.Columns.Add("IterativeSearch1", typeof(double));
            dt.Columns.Add("IterativeSearch2", typeof(double));
            dt.Columns.Add("RecursiveSearch", typeof(double));


            System.Data.DataRow dr = null;
            System.Collections.Generic.Dictionary<string, TimeFrame> dictPerformance = null;
            for (int i = 0; i < 100; ++i)
            {
                dr = dt.NewRow();
                dr["TestRun"] = i + 1;

                dictPerformance = new System.Collections.Generic.Dictionary<string, TimeFrame>();
                DateTime startTime;
                DateTime endTime;
                Console.WriteLine("*********************************************************");

                startTime = DateTime.Now;
                IterativeSearch1(strDirectoryToSearch);
                endTime = DateTime.Now;
                dictPerformance.Add("IterativeSearch1", new TimeFrame(startTime, endTime));

                Console.WriteLine("*********************************************************");

                startTime = DateTime.Now;
                IterativeSearch2(strDirectoryToSearch);
                endTime = DateTime.Now;
                dictPerformance.Add("IterativeSearch2", new TimeFrame(startTime, endTime));

                Console.WriteLine("*********************************************************");

                startTime = DateTime.Now;
                RecursiveSearch(strDirectoryToSearch);
                endTime = DateTime.Now;
                dictPerformance.Add("RecursiveSearch", new TimeFrame(startTime, endTime));

                Console.WriteLine("*********************************************************");

                string strResult = "";
                foreach (string strKey in dictPerformance.Keys)
                {
                    TimeSpan elapsedTime = dictPerformance[strKey].dtEndTime - dictPerformance[strKey].dtStartTime;

                    dr[strKey] = elapsedTime.TotalMilliseconds;
                    strResult += strKey + ": " + elapsedTime.TotalMilliseconds.ToString() + Environment.NewLine;
                } // Next

                //Console.WriteLine(strResult);    
                dictResults.Add(i, strResult);
                dt.Rows.Add(dr);
            } // Next i


            foreach(int iMeasurement in dictResults.Keys)
            {
                Console.WriteLine("Measurement " + iMeasurement.ToString());
                Console.WriteLine(dictResults[iMeasurement]);
                Console.WriteLine(Environment.NewLine);
            } // Next iMeasurement


            double[] adblIterSearch1 = dt
                 .AsEnumerable()
                 .Select(row => row.Field<double>("IterativeSearch1"))
                 .ToArray();

            double[] adblIterSearch2 = dt
                 .AsEnumerable()
                 .Select(row => row.Field<double>("IterativeSearch2"))
                 .ToArray();

            double[] adblRecursiveSearch = dt
                 .AsEnumerable()
                 .Select(row => row.Field<double>("RecursiveSearch"))
                 .ToArray(); 



            dr = dt.NewRow();
            dr["TestRun"] = "Median";
            dr["IterativeSearch1"] = Median<double>(adblIterSearch1);
            dr["IterativeSearch2"] = Median<double>(adblIterSearch2);
            dr["RecursiveSearch"] = Median<double>(adblRecursiveSearch);
            dt.Rows.Add(dr);


            dr = dt.NewRow();
            dr["TestRun"] = "Average";
            dr["IterativeSearch1"] = dt.Compute("Avg(IterativeSearch1)", string.Empty);
            dr["IterativeSearch2"] = dt.Compute("Avg(IterativeSearch2)", string.Empty);
            dr["RecursiveSearch"] = dt.Compute("Avg(RecursiveSearch)", string.Empty);
            dt.Rows.Add(dr);


            dr = dt.NewRow();
            dr["TestRun"] = "Minimum ";
            dr["IterativeSearch1"] = dt.Compute("Min(IterativeSearch1)", string.Empty);
            dr["IterativeSearch2"] = dt.Compute("Min(IterativeSearch2)", string.Empty);
            dr["RecursiveSearch"] = dt.Compute("Min(RecursiveSearch)", string.Empty);
            dt.Rows.Add(dr);

            dr = dt.NewRow();
            dr["TestRun"] = "Maximum ";
            dr["IterativeSearch1"] = dt.Compute("Max(IterativeSearch1)", string.Empty);
            dr["IterativeSearch2"] = dt.Compute("Max(IterativeSearch2)", string.Empty);
            dr["RecursiveSearch"] = dt.Compute("Max(RecursiveSearch)", string.Empty);
            dt.Rows.Add(dr);

            return dt;
        } // End Sub TestMain


        public static double Median<T>(T[] numbers)
        {
            int numberCount = numbers.Count();

            if (numberCount == 0)
                return 0.0;

            int halfIndex = numbers.Count() / 2;
            var sortedNumbers = numbers.OrderBy(n => n);
            double median;

            if ((numberCount % 2) == 0)
            {
                median = 
                    (
                        (
                            System.Convert.ToDouble(sortedNumbers.ElementAt<T>(halfIndex)) +
                            System.Convert.ToDouble(sortedNumbers.ElementAt<T>((halfIndex - 1))
                        )
                    ) / 2);
            }
            else
            {
                median = System.Convert.ToDouble(sortedNumbers.ElementAt<T>(halfIndex));
            }

            return median;
        } // End Function GetMedian


        // http://msmvps.com/blogs/deborahk/archive/2010/05/07/linq-mean-median-and-mode.aspx
        public static double CalcMedian(int[] numbers)
        {
            int numberCount = numbers.Count();
            int halfIndex = numbers.Count() / 2;
            var sortedNumbers = numbers.OrderBy(n => n);
            double median;

            if ((numberCount % 2) == 0)
            {
                median = ((sortedNumbers.ElementAt(halfIndex) +
                    sortedNumbers.ElementAt((halfIndex - 1))) / 2);
            }
            else
            {
                median = sortedNumbers.ElementAt(halfIndex);
            }

            return median;
        } // End Function CalcMedian


    } // End Class SearchStrategy


} // End Namespace IterativeDirectoryCSharp



回答4:


Recursion can always be transformed into a loop.
A quick and dirty possible solution (not tested) follows :

private static void displayDirectory(File node){
    ArraList directories = new ArrayList();
    if (node.isDirectory())
       directories.append (node);    
    // Iterate over the directory list
    Iterator it = directories.iterator();
    while(it.hasNext()){
       File dir  = (File)it.next();           
       // get childs
       String[] subNote = dir.list();
       for(String filename : subNote){
          subNode = new File(node, filename);
          // display current child name
          System.out.println(subNode.getAbsoluteFile());
          // if directory : add current child to the list of dir to process
          if (subnode.isDirectory()){
             directories.append(subNode);
          }
       }
    }
}

please note that the source node should be a directory for anything to be displayed.
Also, this is a breadth-first display. if you want a depth first, you should change the "append" to put the file it just after the current node in the array list.

i'm not sure about the memory consomation, however.
Regards
Guillaume




回答5:


I am a real novice, but after working for a week on this problem... I have a clean solution... thanks for all the help from PATRY and etbal.

public class Recur {
    // Process only directories under dir

File dir;
static DirectoryComponent allDirectory;

    public Recur(File dir, DirectoryComponent allDirectory) {
    // TODO Auto-generated constructor stub
        this.dir = dir;
}

    public static DirectoryComponent Recur(File dir, DirectoryComponent allDirectory) {
        String file;
        String path;

         File firstDir = new File(dir.getPath());
         file = firstDir.getName();
         path = firstDir.getPath();

        if (dir.isDirectory()) {

            file = firstDir.getName();
            path = firstDir.getPath();
            DirectoryComponent directory = new Directory(file, path);
            allDirectory.add(directory);

            String [] subNote = dir.list();
            for(String filename : subNote){
                File subNode = new File(firstDir, filename);

                // store current child name

                    file = subNode.getName();
                    path = subNode.getPath();
                    directory.add(new FileItem(file,path));         
            }

            String[] children = dir.list();
            for (int i=0; i<children.length; i++) {
                Recur(new File(dir, children[i]), allDirectory);
            }
        }

        return allDirectory;
    }
}



回答6:


If you choose to use recursion, I found an example that may be close to the one you are currently using as to eliminate any ambiguity .

// Process only directories under dir
public static void visitAllDirs(File dir) {
    if (dir.isDirectory()) {
        process(dir);

        String[] children = dir.list();
        for (int i=0; i<children.length; i++) {
            visitAllDirs(new File(dir, children[i]));
        }
    }
}

This is a very simple example, the process() can be where you do your handling or operations on the directory.




回答7:


PARTY, thank you for advice. I transformed a little bit your code, that's what I've got

private ArrayList<File> displayDirectory(File node){
ArrayList<File> FileList = new ArrayList();
ArrayList <File>directories = new <File>ArrayList();
if (node.isDirectory())
   directories.add(node);
// Iterate over the directory list
Iterator it = directories.iterator();
for (int i = 0 ; i < directories.size();i++){
   File dir  =  directories.get(i);
   // get childs
   String[] subNode = dir.list();
   for(int j = 0 ; j < subNode.length;j++){
      File F = new File( directories.get(i).getAbsolutePath(), subNode[j]);
      // display current child name
    //  System.out.println(F.getAbsoluteFile());
      // if directory : add current child to the list of dir to process
      if (F.isDirectory()) directories.add(F);           
        else   FileList.add(F);
      }
}   
return FileList;
}



回答8:


Based on PATRY Guillaume solution

public static List<File> getFolderTree(File node)
  {
    List<File> directories = new ArrayList();
    if (node.isDirectory())
    {
      directories.add(node);
    }

    for(int i=0; i<directories.size(); i++)
    {
      File dir = directories.get(i);
      String[] subNote = dir.list();
      for (String filename : subNote)
      {
        File subNode = new File(dir, filename);

        if (subNode.isDirectory())
        {
          directories.add(subNode);
        }
      }
    }
    return directories;
  }


来源:https://stackoverflow.com/questions/3932382/traversing-directories-without-using-recursion

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!