Faster file move method other than File.Move

北恋 2021-02-09 02:39

I have a console application that is going to take about 625 days to complete, unless there is a way to make it faster.

First off I am working in a directory that has a

3 Answers
  • 2021-02-09 03:23

    You can move files in parallel. Directory.EnumerateFiles also gives you a lazily evaluated sequence of files, so the full list is never held in memory (of course, I have not tested this with 4,000,000 files):

    var numberOfConcurrentMoves = 2;
    var moves = new List<Task>();
    var sourceDirectory = "source-directory";
    var destinationDirectory = "destination-directory";
    
    foreach (var filePath in Directory.EnumerateFiles(sourceDirectory))
    {
        var move = new Task(() =>
        {
            File.Move(filePath, Path.Combine(destinationDirectory, Path.GetFileName(filePath)));
    
            //UPDATE DB
        }, TaskCreationOptions.PreferFairness);
        move.Start();
    
        moves.Add(move);
    
        if (moves.Count >= numberOfConcurrentMoves)
        {
            Task.WaitAll(moves.ToArray());
            moves.Clear();
        }
    }
    
    Task.WaitAll(moves.ToArray());
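
    A variation on the same idea, as an untested sketch: Parallel.ForEach with MaxDegreeOfParallelism keeps a fixed number of moves in flight instead of waiting for each batch to finish before starting the next. ParallelMover and MoveAll are names made up for this example, and creating the destination directory up front is an assumption about what you want.

    ```csharp
    using System.IO;
    using System.Threading.Tasks;

    public static class ParallelMover
    {
        // Moves every file in sourceDirectory into destinationDirectory,
        // running at most maxConcurrentMoves moves at the same time.
        public static void MoveAll(string sourceDirectory, string destinationDirectory, int maxConcurrentMoves)
        {
            // Assumption: the destination may not exist yet (no-op if it does).
            Directory.CreateDirectory(destinationDirectory);

            var options = new ParallelOptions { MaxDegreeOfParallelism = maxConcurrentMoves };

            // EnumerateFiles is lazy, so file names are pulled as workers become free.
            Parallel.ForEach(Directory.EnumerateFiles(sourceDirectory), options, filePath =>
            {
                File.Move(filePath, Path.Combine(destinationDirectory, Path.GetFileName(filePath)));

                // Update the database here.
            });
        }
    }
    ```

    Called as ParallelMover.MoveAll("source-directory", "destination-directory", 2), this should behave like the loop above. Whether more than a couple of concurrent moves helps will depend on the disk.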
    
  • 2021-02-09 03:26

    It turns out switching from File.Move to setting up a FileInfo and using .MoveTo increased the speed significantly.

    It will run in about 35 days now as opposed to 625 days.

    FileInfo fileinfo = new FileInfo(Path.Combine(location, fileName));
    fileinfo.MoveTo(Path.Combine(rootDir, fileYear, fileMonth, fileName));
    
  • 2021-02-09 03:36

    18 seconds isn't really unusual. NTFS does not perform well when you have a lot of files in a single directory. When you ask for a file, it has to search the directory's data structure, and that search gets slower as the directory grows. With 1,000 files, that doesn't take too long. With 10,000 files, you notice it. With 4 million files... yeah, it takes a while.

    You can probably do this even faster if you pre-load all of the directory entries into memory. Then rather than calling the FileInfo constructor for each file, you just look it up in your dictionary.

    Something like:

    var dirInfo = new DirectoryInfo(path);
    // get list of all files
    var files = dirInfo.GetFileSystemInfos();
    var cache = new Dictionary<string, FileSystemInfo>();
    foreach (var f in files)
    {
        cache.Add(f.FullName, f);
    }
    

    Now when you get a name from the database, you can just look it up in the dictionary. That might very well be faster than trying to get it from the disk each time.
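
    The lookup step might then look something like the sketch below. It is untested against a directory of this size. FileCache, BuildCache, and TryMove are made-up names; keying by file name rather than full path, and the case-insensitive comparer, are assumptions that fit names coming from a database on Windows.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.IO;

    public static class FileCache
    {
        // Reads every file entry in the directory once and indexes it by file name.
        // Assumption: names compare case-insensitively, as they normally do on NTFS.
        public static Dictionary<string, FileInfo> BuildCache(string path)
        {
            var cache = new Dictionary<string, FileInfo>(StringComparer.OrdinalIgnoreCase);
            foreach (var file in new DirectoryInfo(path).EnumerateFiles())
            {
                cache[file.Name] = file;
            }
            return cache;
        }

        // Moves the named file if it is in the cache.
        // Returns false when the name is unknown, so the caller can log it.
        public static bool TryMove(Dictionary<string, FileInfo> cache, string fileName, string destinationDirectory)
        {
            if (!cache.TryGetValue(fileName, out var file))
            {
                return false;
            }

            Directory.CreateDirectory(destinationDirectory); // no-op if it already exists
            file.MoveTo(Path.Combine(destinationDirectory, fileName));
            cache.Remove(fileName); // the entry no longer belongs to the source directory
            return true;
        }
    }
    ```

    You would call BuildCache once at startup, then TryMove for each name that comes back from the database.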
