How do I download a large file (via HTTP) in .NET?

余生分开走 2021-02-01 18:03

I need to download a large file (2 GB) over HTTP in a C# console application. The problem is that after about 1.2 GB the application runs out of memory.

Here

6 Answers
  • 2021-02-01 18:15

    You need to get the response stream and then read in blocks, writing each block to a file to allow memory to be reused.

    As you have written it, the whole 2 GB response has to be held in memory. Even on a 64-bit system, that will hit the 2 GB size limit for a single .NET object.


    Update: there is an easier option. Let WebClient do the work for you: its DownloadFile method writes the data directly to a file.
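
    A minimal sketch of the DownloadFile approach. The class name DownloadFileSketch and its Download wrapper are hypothetical names chosen for illustration; only WebClient.DownloadFile comes from the answer. Note that WebClient is marked obsolete in .NET 6 and later, although it still works there.

    ```csharp
    using System;
    using System.Net;

    public static class DownloadFileSketch
    {
        // Hypothetical wrapper: WebClient writes the response body to
        // destinationPath as it arrives instead of returning it as one array.
        public static void Download(Uri source, string destinationPath)
        {
            using (var client = new WebClient())
            {
                client.DownloadFile(source, destinationPath);
            }
        }
    }
    ```

    It would be called as DownloadFileSketch.Download(new Uri("http://example.com/big.bin"), "big.bin"), where the URL and file name are placeholders.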

  • 2021-02-01 18:23

    The connection can be interrupted, so it is better to download the file in small chunks.

    Akka Streams can help download a file in small chunks from a System.IO.Stream using multithreading: https://getakka.net/articles/intro/what-is-akka.html

    The Download method below requests the bytes starting at offset fileStart and appends them to the file. If the file does not exist yet, fileStart must be 0.

    using System;
    using System.IO;
    using System.Net;
    using System.Threading.Tasks;
    using Akka.Actor;
    using Akka.IO;
    using Akka.Streams;
    using Akka.Streams.Dsl;
    using Akka.Streams.IO;
    
    // Sink that appends every incoming chunk to the target file.
    private static Sink<ByteString, Task<IOResult>> FileSink(string filename)
    {
        return Flow.Create<ByteString>()
            .ToMaterialized(FileIO.ToFile(new FileInfo(filename), FileMode.Append), Keep.Right);
    }
    
    private async Task Download(string path, Uri uri, long fileStart)
    {
        using (var system = ActorSystem.Create("system"))
        using (var materializer = system.Materializer())
        {
            var request = (HttpWebRequest)WebRequest.Create(uri);
            // Ask the server for the bytes from fileStart onward (HTTP Range header).
            request.AddRange(fileStart);
    
            using (WebResponse response = await request.GetResponseAsync())
            using (Stream stream = response.GetResponseStream())
            {
                // Read the response in 1 KB chunks and append each one to the file.
                await StreamConverters.FromInputStream(() => stream, chunkSize: 1024)
                    .RunWith(FileSink(path), materializer);
            }
        }
    }
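
    The fileStart rule described in this answer (0 for a new file, otherwise continue where the file ends) can be sketched as a small helper. ResumeOffset and For are hypothetical names, not part of the answer or of Akka, and the sketch assumes the partial file holds exactly the bytes received so far.

    ```csharp
    using System.IO;

    public static class ResumeOffset
    {
        // Returns the offset to resume from: the current length of the
        // partial file, or 0 when the file does not exist yet.
        public static long For(string path)
        {
            var info = new FileInfo(path);
            return info.Exists ? info.Length : 0L;
        }
    }
    ```

    A call would then look like await Download(path, uri, ResumeOffset.For(path)). Resuming only works if the server honors the Range header and answers 206 Partial Content; a server that ignores it sends the whole file again, and appending that would corrupt the output.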
    
  • 2021-02-01 18:26

    The WebClient class is the one for simplified scenarios. Once you get past simple scenarios (and you have), you'll have to fall back a bit and use WebRequest.

    With WebRequest, you'll have access to the response stream, and you'll be able to loop over it, reading a bit and writing a bit, until you're done.


    Example:

    using System;
    using System.IO;
    using System.Net;

    public void MyDownloadFile(Uri url, string outputFilePath)
    {
        const int BUFFER_SIZE = 16 * 1024;
        using (var outputFileStream = File.Create(outputFilePath, BUFFER_SIZE))
        {
            var req = WebRequest.Create(url);
            using (var response = req.GetResponse())
            using (var responseStream = response.GetResponseStream())
            {
                // Reuse one fixed-size buffer so memory use stays constant.
                var buffer = new byte[BUFFER_SIZE];
                int bytesRead;
                // Read returns 0 at the end of the stream.
                while ((bytesRead = responseStream.Read(buffer, 0, BUFFER_SIZE)) > 0)
                {
                    outputFileStream.Write(buffer, 0, bytesRead);
                }
            }
        }
    }
    

    Note that if WebClient.DownloadFile works, then I'd call it the best solution. I wrote the above before the "DownloadFile" answer was posted. I also wrote it way too early in the morning, so a grain of salt (and testing) may be required.

  • 2021-02-01 18:30

    WebClient.OpenRead returns a Stream. Loop over its contents with Read and write each block to a file, so the data is never buffered in memory as a whole.
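
    A rough sketch of that loop. OpenReadSketch, CopyInBlocks, Download and the 80 KB default block size are hypothetical choices for illustration; only WebClient.OpenRead and Stream.Read come from the answer.

    ```csharp
    using System;
    using System.IO;
    using System.Net;

    public static class OpenReadSketch
    {
        // Copies input to output one block at a time and returns the number
        // of bytes copied. Only one buffer of blockSize bytes is allocated.
        public static long CopyInBlocks(Stream input, Stream output, int blockSize = 81920)
        {
            var buffer = new byte[blockSize];
            long total = 0;
            int read;
            // Read returns 0 once the end of the stream is reached.
            while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
            {
                output.Write(buffer, 0, read);
                total += read;
            }
            return total;
        }

        // Opens the response as a stream and writes it block by block to a file.
        public static long Download(Uri source, string destinationPath)
        {
            using (var client = new WebClient())
            using (Stream response = client.OpenRead(source))
            using (FileStream file = File.Create(destinationPath))
            {
                return CopyInBlocks(response, file);
            }
        }
    }
    ```

    Keeping the copy loop separate from the network call makes it possible to exercise it with in-memory streams before pointing it at a real URL.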

  • 2021-02-01 18:34

    I would use something like this

  • 2021-02-01 18:36

    If you use WebClient.DownloadFile, you can save the download directly to a file.
