How to download a file in parallel using HttpWebRequest

北海茫月 2021-02-08 14:38

I'm trying to make a program like IDM that can download parts of a file simultaneously.
The tool I'm using to achieve this is the TPL in C# on .NET 4.5.
But I'm having a

2 answers
  •  无人及你
    2021-02-08 15:27

    OK, here's how I would do what you're attempting. This is basically the same idea, just implemented differently.

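    //Namespaces used by the methods below (which are assumed to sit inside a class)
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Linq;
    using System.Net;
    using System.Threading.Tasks;

    public static void DownloadFileInPiecesAndSave()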
    {
        //test
        var uri = new Uri("http://www.w3.org/");
    
        var bytes = DownloadInPieces(uri, 4);
        File.WriteAllBytes(@"c:\temp\RangeDownloadSample.html", bytes);
    }
    
    /// <summary>
    /// Download a file via HTTP in multiple pieces using Range requests.
    /// </summary>
    public static byte[] DownloadInPieces(Uri uri, uint numberOfPieces)
    {
        //I'm just fudging this for expository purposes. In reality you would probably want to do a HEAD request to get total file size.
        ulong totalFileSize = 1003; 
    
        var pieceSize = totalFileSize / numberOfPieces;
    
        List<Task<byte[]>> tasks = new List<Task<byte[]>>();
        for (uint i = 0; i < numberOfPieces; i++)
        {
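            var start = i * pieceSize;
            //HTTP byte ranges are inclusive at both ends, so a piece ends at start + pieceSize - 1; the last piece also takes the remainder.
            var end = start + pieceSize - 1 + (i == numberOfPieces - 1 ? totalFileSize % numberOfPieces : 0UL);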
            tasks.Add(DownloadFilePiece(uri, start, end));
        }
    
        Task.WaitAll(tasks.ToArray());
    
        //This is probably not the single most efficient way to combine byte arrays, but it is succinct...
        return tasks.SelectMany(t => t.Result).ToArray();
    }
    
    private static async Task<byte[]> DownloadFilePiece(Uri uri, ulong rangeStart, ulong rangeEnd)
    {
        try
        {
            var request = (HttpWebRequest)WebRequest.Create(uri);
            request.AddRange((long)rangeStart, (long)rangeEnd);
            request.Proxy = WebRequest.GetSystemWebProxy(); //WebProxy.GetDefaultProxy() is marked obsolete
    
            using (var response = await request.GetResponseAsync())
            using (var responseStream = response.GetResponseStream())
            using (var memoryStream = new MemoryStream((int)(rangeEnd - rangeStart + 1))) //an inclusive range holds end - start + 1 bytes
            {
                await responseStream.CopyToAsync(memoryStream);
                return memoryStream.ToArray();
            }
        }
        catch (WebException wex)
        {
            //Do lots of error handling here, lots of things can go wrong
            //In particular watch for 416 Requested Range Not Satisfiable
            //Note: returning null means the caller has to check for missing pieces before combining them
            return null;
        }
        catch (Exception ex)
        {
            //handle the unexpected here...
            return null;
        }
    }
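
    The piece arithmetic is easy to get off by one, because HTTP byte ranges are inclusive at both ends ("bytes=0-249" is 250 bytes, not 249). As a minimal sketch, the ranges can be computed and checked separately from any networking. The RangeSplitter class and its Split method are names made up here for illustration; they are not part of the code above.

    ```csharp
    using System;
    using System.Collections.Generic;

    public static class RangeSplitter
    {
        // Hypothetical helper (not from the original answer).
        // Splits a file of totalSize bytes into `pieces` inclusive (start, end)
        // byte ranges; the last piece absorbs the remainder.
        public static List<Tuple<long, long>> Split(long totalSize, int pieces)
        {
            if (totalSize <= 0) throw new ArgumentOutOfRangeException("totalSize");
            if (pieces <= 0 || pieces > totalSize) throw new ArgumentOutOfRangeException("pieces");

            long pieceSize = totalSize / pieces;
            var ranges = new List<Tuple<long, long>>(pieces);
            for (int i = 0; i < pieces; i++)
            {
                long start = i * pieceSize;
                // Inclusive end: the last byte of this piece, not the first byte of the next one.
                long end = (i == pieces - 1) ? totalSize - 1 : start + pieceSize - 1;
                ranges.Add(Tuple.Create(start, end));
            }
            return ranges;
        }
    }
    ```

    For the 1003-byte example with 4 pieces, RangeSplitter.Split(1003, 4) yields 0-249, 250-499, 500-749 and 750-1002, which cover every byte exactly once.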
    

    Note that I glossed over a lot of stuff here, such as:

    • Detecting if the server supports range requests. If it doesn't, the server will typically ignore the Range header and return the entire content (200 OK rather than 206 Partial Content) for each request, and we'll get several copies of it.
    • Handling any sort of HTTP errors. What if the third request fails?
    • Retry logic
    • Timeouts
    • Figuring out how big the file actually is
    • Checking whether the file is big enough to warrant multiple requests, and if so how many? It's probably not worth doing this in parallel for files under 1 or 2 MB, but you'd have to test
    • Most likely a bunch of other stuff.
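
    For the items about file size and range support, one possible starting point is a HEAD request that reads Content-Length and Accept-Ranges before deciding how many pieces to use. This is only a sketch under assumptions: RangeProbe, RangeProbeResult, ProbeAsync and ChoosePieceCount are hypothetical names, and the piece-count policy is an arbitrary example rather than a tested recommendation. Some servers honour Range without advertising Accept-Ranges, so the only definitive check is whether a ranged GET comes back as 206.

    ```csharp
    using System;
    using System.Net;
    using System.Threading.Tasks;

    public sealed class RangeProbeResult
    {
        public long ContentLength;      // -1 when the server did not send Content-Length
        public bool AcceptsByteRanges;  // true when the server sent "Accept-Ranges: bytes"
    }

    public static class RangeProbe
    {
        // Hypothetical helper: issues a HEAD request and reports what the server advertises.
        public static async Task<RangeProbeResult> ProbeAsync(Uri uri)
        {
            var request = (HttpWebRequest)WebRequest.Create(uri);
            request.Method = "HEAD";

            using (var response = (HttpWebResponse)await request.GetResponseAsync())
            {
                string acceptRanges = response.Headers["Accept-Ranges"];
                return new RangeProbeResult
                {
                    ContentLength = response.ContentLength,
                    AcceptsByteRanges = string.Equals(acceptRanges, "bytes", StringComparison.OrdinalIgnoreCase)
                };
            }
        }

        // Hypothetical policy: one request unless ranges are advertised and the file is
        // large enough to give every piece at least minBytesPerPiece bytes.
        public static int ChoosePieceCount(long contentLength, bool acceptsByteRanges, int maxPieces, long minBytesPerPiece)
        {
            if (!acceptsByteRanges || contentLength <= 0 || minBytesPerPiece <= 0) return 1;
            long byLength = contentLength / minBytesPerPiece;
            return (int)Math.Max(1, Math.Min(maxPieces, byLength));
        }
    }
    ```

    With a 1 MB minimum per piece, ChoosePieceCount falls back to a single request for the 1003-byte example and caps a 10 MB file at maxPieces. The threshold is a parameter because the right value has to be measured.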

    So you've got a long way to go before I would use this in production. But it should give you an idea of where to start.
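
    For the retry and timeout items in the list above, a generic wrapper is one way to begin. This is a minimal sketch, not a production policy: Retry and WithTimeoutAsync are hypothetical names, and the doubling delay is just one common choice. As far as I know, HttpWebRequest.Timeout does not apply to the asynchronous calls, which is why the sketch races each attempt against Task.Delay. A timed-out attempt is only abandoned here; real code would also call Abort() on the request.

    ```csharp
    using System;
    using System.Threading.Tasks;

    public static class Retry
    {
        // Hypothetical helper: runs `attempt` up to maxAttempts times. An attempt that
        // throws or exceeds `timeout` counts as failed; the wait between attempts doubles.
        public static async Task<T> WithTimeoutAsync<T>(
            Func<Task<T>> attempt, int maxAttempts, TimeSpan timeout, TimeSpan initialDelay)
        {
            if (maxAttempts < 1) throw new ArgumentOutOfRangeException("maxAttempts");

            Exception last = null;
            TimeSpan delay = initialDelay;
            for (int i = 1; i <= maxAttempts; i++)
            {
                try
                {
                    Task<T> work = attempt();
                    Task finished = await Task.WhenAny(work, Task.Delay(timeout));
                    if (finished == work) return await work; // rethrows if the attempt faulted
                    last = new TimeoutException("Attempt " + i + " timed out.");
                }
                catch (Exception ex)
                {
                    last = ex;
                }

                if (i < maxAttempts)
                {
                    await Task.Delay(delay);
                    delay = TimeSpan.FromTicks(delay.Ticks * 2);
                }
            }
            throw last; // every attempt failed; surface the most recent error
        }
    }
    ```

    A call might look like Retry.WithTimeoutAsync(() => DownloadFilePiece(uri, start, end), 3, TimeSpan.FromSeconds(30), TimeSpan.FromSeconds(1)). For that to retry anything, DownloadFilePiece would have to let its exceptions propagate instead of returning null.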
