Question
I have a very large file, almost 2GB in size. I am trying to write a process that reads the file in and writes it back out without the first row. So far I have only been able to read and write one line at a time, which takes forever. I can open the file, remove the first row, and save it faster in TextPad, though that is still very slow.
I use this code to get the number of records in the file:
private long getNumRows(string strFileName)
{
    long lngNumRows = 0;
    string strMsg;

    try
    {
        // The using block disposes the reader; no explicit Close/Dispose is needed.
        using (var strReader = File.OpenText(@strFileName))
        {
            while (strReader.ReadLine() != null)
            {
                lngNumRows++;
            }
        }
    }
    catch (Exception excExcept)
    {
        strMsg = "The File could not be read: ";
        strMsg += excExcept.Message;
        System.Windows.MessageBox.Show(strMsg);
    }
    return lngNumRows;
}
This only takes seconds to run. When I add the following code, it takes forever to run. Am I doing something wrong? Why does the write add so much time, and how can I make this faster?
private void ProcessTextFiles(string strFileName)
{
    string strDataLine;
    // The original post never assigns this variable; a placeholder output
    // path is used here so the method compiles.
    string strFullOutputFileName = strFileName + ".out";
    long lngCurrNumRows = 0;
    string strMsg;

    try
    {
        using (StreamReader srStreamRdr = new StreamReader(strFileName))
        {
            while ((strDataLine = srStreamRdr.ReadLine()) != null)
            {
                lngCurrNumRows++;
                if (lngCurrNumRows > 1)
                {
                    // Opens and closes the output file for every single line.
                    WriteDataRow(strDataLine, strFullOutputFileName);
                }
            }
        }
    }
    catch (Exception excExcept)
    {
        strMsg = "The File could not be read: ";
        strMsg += excExcept.Message;
        System.Windows.MessageBox.Show(strMsg);
    }
}
public void WriteDataRow(string strDataRow, string strFullFileName)
{
    // Appends one line per call, opening and closing the file each time.
    using (StreamWriter file = new StreamWriter(@strFullFileName, true, System.Text.Encoding.UTF8))
    {
        file.WriteLine(strDataRow);
    }
}
Answer 1:
I'm not sure how much this will improve performance, but opening and closing the output file for every line you want to write is certainly not a good idea. Instead, open both files just once and write each line directly:
using (StreamWriter file = new StreamWriter(@strFullFileName, true, System.Text.Encoding.UTF8))
using (StreamReader srStreamRdr = new StreamReader(strFileName))
{
    while ((strDataLine = srStreamRdr.ReadLine()) != null)
    {
        lngCurrNumRows++;
        if (lngCurrNumRows > 1)
            file.WriteLine(strDataLine);
    }
}
You could also remove the check on lngCurrNumRows entirely by making a throwaway read before entering the while loop:
strDataLine = srStreamRdr.ReadLine();
if (strDataLine != null)
{
    while ((strDataLine = srStreamRdr.ReadLine()) != null)
    {
        file.WriteLine(strDataLine);
    }
}
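Putting both suggestions together, a minimal self-contained sketch of this approach might look like the following (the method name and parameters are illustrative assumptions; the UTF-8 encoding is carried over from the question's code):

private void CopyWithoutFirstRow(string strInputFileName, string strOutputFileName)
{
    // Open both files exactly once; the using blocks dispose them afterwards.
    using (StreamReader reader = new StreamReader(strInputFileName))
    using (StreamWriter writer = new StreamWriter(strOutputFileName, false, System.Text.Encoding.UTF8))
    {
        // Throwaway read: consume the header row so the loop copies only the rest.
        if (reader.ReadLine() != null)
        {
            string strDataLine;
            while ((strDataLine = reader.ReadLine()) != null)
            {
                writer.WriteLine(strDataLine);
            }
        }
    }
}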
Answer 2:
Depending on the memory of your machine, you could try the following (my big file was "D:\savegrp.log"; I had a 2GB file knocking about). This used about 6GB of memory when I tried it:
int counter = File.ReadAllLines(@"D:\savegrp.log").Length;
Console.WriteLine(counter);
It does depend on the memory available...
File.WriteAllLines(@"D:\savegrp2.log",File.ReadAllLines(@"D:\savegrp.log").Skip(1));
Console.WriteLine("file saved");
Source: https://stackoverflow.com/questions/37725050/reading-and-writing-very-large-text-files-in-c-sharp