Checking if folder has files

清歌不尽 2021-02-10 09:58

I have a program that writes to a database which folders are full and which are empty. Right now I'm using

bool hasFiles=false;
(Directory.GetFiles(path).Length >0) ? hasFiles=         


        
6 answers
  •  心在旅途
    2021-02-10 10:25

    The key to speeding up such a cross-network search is to cut down the number of requests across the network. Rather than getting all the directories and then checking each one for files, try to get everything in one call.

    In .NET 3.5 there is no single method to recursively get all files and folders, so you have to build it yourself (see below). In .NET 4 new overloads exist to do this in one step.

    Using DirectoryInfo one also gets information on whether the returned name is a file or directory, which cuts down calls as well.
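
    As a small illustration of that point, the sketch below classifies the entries of one directory from a single GetFileSystemInfos call by testing the Directory attribute flag. EntryKinds and Count are hypothetical names used only for this example.

```csharp
using System;
using System.IO;

static class EntryKinds {
  // Hypothetical helper: counts the files and subdirectories directly under
  // `path` using one GetFileSystemInfos call (no recursion).
  public static void Count(string path, out int files, out int dirs) {
    files = 0;
    dirs = 0;
    foreach (FileSystemInfo entry in new DirectoryInfo(path).GetFileSystemInfos()) {
      // The Directory flag is set for directories and clear for files.
      if ((entry.Attributes & FileAttributes.Directory) != 0) {
        dirs++;
      } else {
        files++;
      }
    }
  }
}
```

    The bitwise test has to be compared against 0 explicitly, because C# does not convert an enum value to bool.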

    This means splitting a list of all the directories and files becomes something like this:

    struct AllDirectories {
      public List<string> DirectoriesWithoutFiles { get; set; }
      public List<string> DirectoriesWithFiles { get; set; }
    }
    
    static class FileSystemScanner {
      public static AllDirectories DivideDirectories(string startingPath) {
        var startingDir = new DirectoryInfo(startingPath);
    
        // allContent is an IList<FileSystemInfo>
        var allContent = GetAllFileSystemObjects(startingDir);
        var allFiles = allContent.Where(f => (f.Attributes & FileAttributes.Directory) == 0)
                                 .Cast<FileInfo>();
        var dirs = allContent.Where(f => (f.Attributes & FileAttributes.Directory) != 0)
                             .Cast<DirectoryInfo>();
        // Keyed by directory (compared by full path); the value is the path to report.
        var allDirs = new SortedList<FileSystemInfo, string>(new FileSystemInfoComparer());
        foreach (var d in dirs) {
          allDirs.Add(d, d.FullName);
        }
    
        var res = new AllDirectories {
          DirectoriesWithFiles = new List<string>()
        };
        foreach (var file in allFiles) {
          if (allDirs.Remove(file.Directory)) {
            // Was removed, so this is the first time this directory has been seen.
            res.DirectoriesWithFiles.Add(file.DirectoryName);
          }
        }
        // allDirs now just contains directories without files
        res.DirectoriesWithoutFiles = new List<string>(allDirs.Values);
        return res;
      }
    
      class FileSystemInfoComparer : IComparer<FileSystemInfo> {
        public int Compare(FileSystemInfo l, FileSystemInfo r) {
          // Compare full paths: Name alone would conflate same-named subdirectories.
          return String.Compare(l.FullName, r.FullName, StringComparison.OrdinalIgnoreCase);
        }
      }
    }
    

    Implementing GetAllFileSystemObjects depends on the .NET version. On .NET 4 it is very easy:

    static IList<FileSystemInfo> GetAllFileSystemObjects(DirectoryInfo root) {
      // The returned array implements IList<FileSystemInfo>.
      return root.GetFileSystemInfos("*", SearchOption.AllDirectories);
    }
    

    On earlier versions a little more work is needed:

    static IList<FileSystemInfo> GetAllFileSystemObjects(DirectoryInfo root) {
      var res = new List<FileSystemInfo>();
      // Breadth-first walk: directories still waiting to be listed.
      var pending = new Queue<DirectoryInfo>(new [] { root });
    
      while (pending.Count > 0) {
        var dir = pending.Dequeue();
        var content = dir.GetFileSystemInfos();
        res.AddRange(content);
        foreach (var subDir in content.Where(f => (f.Attributes & FileAttributes.Directory) != 0)
                                      .Cast<DirectoryInfo>()) {
          pending.Enqueue(subDir);
        }
      }
    
      return res;
    }
    

    This approach calls into the filesystem as few times as possible, just once on .NET 4 or once per directory on earlier versions, allowing the network client and server to minimise the number of underlying filesystem calls and network round trips.

    Getting FileSystemInfo instances has the disadvantage of needing multiple file system operations (I believe this is somewhat OS dependent), but any solution needs to know, for each name, whether it is a file or a directory, so this is unavoidable at some level (without resorting to P/Invoke of FindFirstFile/FindNextFile/FindClose).
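
    For the single-folder check in the question, a different technique from the bulk scan above may also help, assuming .NET 4 or later is available: Directory.EnumerateFiles is lazy, so asking only whether it yields anything avoids building the full array that GetFiles returns. FolderCheck and HasFiles are hypothetical names for this sketch.

```csharp
using System.IO;
using System.Linq;

static class FolderCheck {
  // Hypothetical helper: true if `path` directly contains at least one file.
  // Any() stops after the first entry instead of listing the whole folder.
  public static bool HasFiles(string path) {
    return Directory.EnumerateFiles(path).Any();
  }
}
```

    This still costs one request per folder, so it does not replace the advice above about reducing round trips when many folders are checked over a network.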


    As an aside, the above would be easier with a partition extension method:

    Tuple<IEnumerable<T>,IEnumerable<T>> Extensions.Partition<T>(
                                                     this IEnumerable<T> input,
                                                     Func<T,bool> partition);
    

    Writing that to be lazy would be an interesting exercise (only consuming input when something iterates over one of the outputs, while buffering the other).
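
    One possible way to approach that exercise is sketched below, not as a definitive implementation. It assumes .NET 4 (for Tuple), and PartitionState is a hypothetical helper class. Both outputs share one enumerator over the input; an item that belongs to the other output is queued until that output is iterated. The sketch is not thread-safe, and each output can only be iterated once.

```csharp
using System;
using System.Collections.Generic;

static class Extensions {
  // Item1: items for which `partition` is true. Item2: all other items.
  // Nothing is read from `input` until one of the outputs is iterated.
  public static Tuple<IEnumerable<T>, IEnumerable<T>> Partition<T>(
      this IEnumerable<T> input, Func<T, bool> partition) {
    var state = new PartitionState<T>(input, partition);
    return Tuple.Create(state.Read(true), state.Read(false));
  }

  // Hypothetical helper holding the shared enumerator and the two buffers.
  sealed class PartitionState<T> {
    readonly IEnumerable<T> input;
    readonly Func<T, bool> partition;
    readonly Queue<T> matching = new Queue<T>();
    readonly Queue<T> rest = new Queue<T>();
    IEnumerator<T> source;  // created on first use
    bool done;

    public PartitionState(IEnumerable<T> input, Func<T, bool> partition) {
      this.input = input;
      this.partition = partition;
    }

    public IEnumerable<T> Read(bool wantMatching) {
      var mine = wantMatching ? matching : rest;
      var other = wantMatching ? rest : matching;
      while (true) {
        if (mine.Count > 0) {
          yield return mine.Dequeue();  // buffered earlier by the other output
        } else if (done) {
          yield break;
        } else {
          if (source == null) source = input.GetEnumerator();
          if (!source.MoveNext()) {
            done = true;
            source.Dispose();
          } else if (partition(source.Current) == wantMatching) {
            yield return source.Current;
          } else {
            other.Enqueue(source.Current);  // keep it for the other output
          }
        }
      }
    }
  }
}
```

    The buffering is the cost of laziness here: if only one output is ever consumed, every item belonging to the other one stays in memory.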
