Question
I am using the Renci SSH.NET package (the 2016 version) to download files from an external server. I can usually download about one file every 6 seconds, which is bad when you have thousands of files. I recently tried changing the foreach loop to Parallel.ForEach. That brought the time per file down to about 1.5 seconds, but when I checked the files they were all 0 KB, so nothing was actually downloaded. Is there anything wrong with the parallel loop? I am new to C# and am trying to improve download times.
Parallel.ForEach(summary.RemoteFiles, (f, loopstate) =>
{
    //Are we still connected? If not, reestablish a connection for up to a max of "MaxReconnectAttempts"
    if (!sftp.IsConnected)
    {
        int maxAttempts = Convert.ToInt32(ConfigurationManager.AppSettings["MaxReconnectAttempts"]);
        StatusUpdate(this, new Types.StatusUpdateEventArgs() { message = "SFTP Service has been disconnected from remote system, attempting to reconnect (" + sftpConnInfo.Host + ":" + sftpConnInfo.Port.ToString() + remotePath + " - Attempt 1 of " + maxAttempts.ToString() + ")", Location = locationName });
        for (int attempts = 1; attempts <= maxAttempts; attempts++)
        {
            sftp.Connect();
            if (sftp.IsConnected)
            {
                StatusUpdate(this, new Types.StatusUpdateEventArgs() { message = "SFTP Service - Connection reestablished (" + remotePath + ")", Location = locationName });
                break;
            }
            else
            {
                if ((attempts + 1) <= maxAttempts)
                {
                    StatusUpdate(this, new Types.StatusUpdateEventArgs() { message = "SFTP Service still disconnected from remote system, preparing another reconnect attempt (" + sftpConnInfo.Host + ":" + sftpConnInfo.Port.ToString() + remotePath + " - Attempt " + (attempts + 1).ToString() + " of " + maxAttempts.ToString() + ")", Location = locationName });
                    System.Threading.Thread.Sleep(2000);
                }
                else
                {
                    //Max reconnect attempts reached - end the session and ensure the appropriate "failure" workflow is triggered
                    connectionLost = true;
                }
            }
        }
    }

    if (connectionLost)
        loopstate.Break();
    // break;

    totalFileCount++;
    try
    {
        if (!System.IO.File.Exists(localSaveLocation + f.FileName))
        {
            System.Diagnostics.Debug.WriteLine("\tDownloading file " + totalFileCount.ToString() + "(" + f.FileName + ")");
            System.IO.Stream localFile = System.IO.File.OpenWrite(localSaveLocation + f.FileName);
            //Log remote file name, local file name, date/time start
            start = DateTime.Now;
            sftp.DownloadFile(f.FullName, localFile);
            end = DateTime.Now;
            //Log remote file name, local file name, date/time complete (increment the "successful" downloads by 1)
            timeElapsed = end.Subtract(start);
            runningSeconds += timeElapsed.TotalSeconds;
            runningAvg = runningSeconds / Convert.ToDouble(totalFileCount);
            estimatedSecondsRemaining = (summary.RemoteFiles.Count - totalFileCount) * runningAvg;
            elapsedTimeString = timeElapsed.TotalSeconds.ToString("#.####") + " seconds";
            System.Diagnostics.Debug.WriteLine("\tCompleted downloading file in " + elapsedTimeString + " " + "(" + f.FileName + ")");
            downloadedFileCount++;
            ProcessFileComplete(this, new Types.ProcessFileCompleteEventArgs() { downloadSuccessful = true, elapsedTime = timeElapsed.TotalSeconds, fileName = f.FileName, fullLocalPath = localSaveLocation + f.FileName, Location = locationName, FilesDownloaded = totalFileCount, FilesRemaining = (summary.RemoteFiles.Count - totalFileCount), AvgSecondsPerDownload = runningAvg, TotalSecondsElapsed = runningSeconds, EstimatedTimeRemaining = TimeSpan.FromSeconds(estimatedSecondsRemaining) });
            f.FileDownloaded = true;
            if (deleteAfterDownload)
                sftp.DeleteFile(f.FullName);
        }
        else
        {
            System.Diagnostics.Debug.WriteLine("\tFile " + totalFileCount.ToString() + "(" + f.FileName + ") already exists locally");
            downloadedFileCount++;
            ProcessFileComplete(this, new Types.ProcessFileCompleteEventArgs() { downloadSuccessful = true, elapsedTime = 0, fileName = f.FileName + " (File already exists locally)", fullLocalPath = localSaveLocation + f.FileName, Location = locationName, FilesDownloaded = totalFileCount, FilesRemaining = (summary.RemoteFiles.Count - totalFileCount), AvgSecondsPerDownload = runningAvg, TotalSecondsElapsed = runningSeconds, EstimatedTimeRemaining = TimeSpan.FromSeconds(estimatedSecondsRemaining) });
            f.FileDownloaded = true;
            if (deleteAfterDownload)
                sftp.DeleteFile(f.FullName);
        }
    }
    catch (System.Exception ex)
    {
        // We log stuff here
    }
});
Answer 1:
I cannot tell for sure why you get empty files, but I suspect it is because you never close the localFile stream.
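For illustration only, a minimal change along those lines in your loop body (a sketch reusing your own variable names) would be to wrap the local stream in a using block, so it is flushed and closed even if the download throws:

// Disposing the stream flushes the downloaded bytes to disk,
// even when DownloadFile throws.
using (var localFile = System.IO.File.OpenWrite(localSaveLocation + f.FileName))
{
    sftp.DownloadFile(f.FullName, localFile);
}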
Even if your code worked, though, you would hardly gain any performance by running the downloads over a single connection, as SFTP transfers tend to be limited by network latency or CPU. You have to use multiple connections to overcome that.
See my answer on Server Fault about factors that affect SFTP transfer speed.
Implement a connection pool and pick a free connection each time.
Simple example:
var clients = new ConcurrentBag<SftpClient>();

Parallel.ForEach(files, (f, loopstate) =>
{
    if (!clients.TryTake(out var client))
    {
        client = new SftpClient(hostName, userName, password);
        client.Connect();
    }

    string localPath = Path.Combine(destPath, f.Name);
    Console.WriteLine(
        "Thread {0}, Connection {1}, File {2} => {3}",
        Thread.CurrentThread.ManagedThreadId, client.GetHashCode(),
        f.FullName, localPath);

    using (var stream = File.Create(localPath))
    {
        client.DownloadFile(f.FullName, stream);
    }

    clients.Add(client);
});

Console.WriteLine("Closing {0} connections", clients.Count);
foreach (var client in clients)
{
    client.Dispose();
}
You should limit the number of connections somehow, though.
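For example (just a sketch, reusing the clients pool from the example above; the cap of 4 is arbitrary), limiting the loop's degree of parallelism also limits how many connections the pool can ever grow to, because at most that many iterations run at once:

// At most 4 iterations run concurrently, so at most 4 SftpClient
// instances are ever created and added to the pool.
var options = new ParallelOptions { MaxDegreeOfParallelism = 4 };

Parallel.ForEach(files, options, (f, loopstate) =>
{
    // ... same pool-based body as in the example above ...
});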
Another approach is to start a fixed number of threads, each with its own connection, and have them pick files from a queue.
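A rough sketch of that pattern with SSH.NET (the worker count of 4 and the hostName, userName, password and destPath values are placeholders, as in the example above; System.Collections.Concurrent, System.Linq and System.Threading.Tasks namespaces assumed):

// A fixed number of workers, each holding its own SftpClient,
// pull file entries from a shared thread-safe queue until it is empty.
var queue = new ConcurrentQueue<SftpFile>(files);
const int workerCount = 4;

var workers = Enumerable.Range(0, workerCount).Select(_ => Task.Run(() =>
{
    using (var client = new SftpClient(hostName, userName, password))
    {
        client.Connect();
        while (queue.TryDequeue(out var f))
        {
            string localPath = Path.Combine(destPath, f.Name);
            using (var stream = File.Create(localPath))
            {
                client.DownloadFile(f.FullName, stream);
            }
        }
    }
})).ToArray();

Task.WaitAll(workers);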
For an example of a full implementation, see my article for the WinSCP .NET assembly:
Automating transfers in parallel connections over SFTP/FTP protocol
Source: https://stackoverflow.com/questions/48833005/processing-sftp-files-using-c-sharp-parallel-foreach-loop-not-processing-downloa