Reading and writing very large text files in C#

Posted by 随声附和 on 2019-12-10 14:39:08

Question


I have a very large file, almost 2 GB in size. I am trying to write a process that reads the file in and writes it back out without the first row. So far I have only been able to read and write one line at a time, which takes forever. I can open the file, remove the first row, and save it faster in TextPad, though even that is very slow.

I use this code to get the number of records in the file:

private long getNumRows(string strFileName)
{
    long lngNumRows = 0;
    string strMsg;

    try
    {
        lngNumRows = 0;
        using (var strReader = File.OpenText(@strFileName))
        {
            while (strReader.ReadLine() != null)
            {
                lngNumRows++;
            }

            strReader.Close();
            strReader.Dispose();
        }
    }
    catch (Exception excExcept)
    {
        strMsg = "The File could not be read: ";
        strMsg += excExcept.Message;
        System.Windows.MessageBox.Show(strMsg);
        //Console.WriteLine("There was an error reading the file: ");
        //Console.WriteLine(excExcept.Message);

        //Console.ReadLine();
    }

    return lngNumRows;
}

This only takes seconds to run. When I add the following code it takes forever to run. Am I doing something wrong? Why does the write add so much time? Any ideas on how I can make this faster?

private void ProcessTextFiles(string strFileName)
{
    string strDataLine;
    string strFullOutputFileName;
    string strSubFileName;
    int intPos;
    long lngTotalRows = 0;
    long lngCurrNumRows = 0;
    long lngModNumber = 0;
    double dblProgress = 0;
    double dblProgressPct = 0;
    string strPrgFileName = "";
    string strOutName = "";
    string strMsg;
    long lngFileNumRows;

    try
    {
        using (StreamReader srStreamRdr = new StreamReader(strFileName))
        {
            while ((strDataLine = srStreamRdr.ReadLine()) != null)
            {
                lngCurrNumRows++;

                if (lngCurrNumRows > 1)
                {
                    WriteDataRow(strDataLine, strFullOutputFileName);
                }
            }

            srStreamRdr.Dispose();
        }
    }
    catch (Exception excExcept)
    {
        strMsg = "The File could not be read: ";
        strMsg += excExcept.Message;
        System.Windows.MessageBox.Show(strMsg);
        //Console.WriteLine("The File could not be read:");
        //Console.WriteLine(excExcept.Message);
    }
}

public void WriteDataRow(string strDataRow, string strFullFileName)
{
    //using (StreamWriter file = new StreamWriter(@strFullFileName, true, Encoding.GetEncoding("iso-8859-1")))
    using (StreamWriter file = new StreamWriter(@strFullFileName, true, System.Text.Encoding.UTF8))
    {
        file.WriteLine(strDataRow);
        file.Close();
    }
}

Answer 1:


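I'm not sure how much this will improve performance, but opening and closing the output file for every line you write is certainly not a good idea: each call to WriteDataRow creates a new StreamWriter, opens the file in append mode, writes a single line, and then flushes and closes the file again.

Instead, open both files just once and write each line directly: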

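// Open the output (append mode, as in the question) and the input once each.
using (StreamWriter file = new StreamWriter(@strFullFileName, true, System.Text.Encoding.UTF8))
using (StreamReader srStreamRdr = new StreamReader(strFileName))
{
    while ((strDataLine = srStreamRdr.ReadLine()) != null)
    {
        lngCurrNumRows++;

        // Skip the first row; write every later row to the already-open writer.
        if (lngCurrNumRows > 1)
            file.WriteLine(strDataLine);
    }
}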

You could also drop the check on lngCurrNumRows by doing one throwaway read before entering the while loop:

// Read and discard the first row (null means the file is empty).
strDataLine = srStreamRdr.ReadLine();
if (strDataLine != null)
{
    while ((strDataLine = srStreamRdr.ReadLine()) != null)
    {
        file.WriteLine(strDataLine);
    }
}
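
Putting the two suggestions together, a minimal self-contained sketch might look like the following. The class name FirstLineSkipper, the method name CopyWithoutFirstLine, and the 64 KB buffer size are assumptions for illustration and are not part of the original code. The sketch also creates the output file fresh (append: false) rather than appending, which differs from the snippets above.

```csharp
using System.IO;
using System.Text;

public static class FirstLineSkipper
{
    // Hypothetical helper: copies inputPath to outputPath without the first line.
    // Returns the number of lines written.
    public static long CopyWithoutFirstLine(string inputPath, string outputPath)
    {
        const int BufferSize = 64 * 1024; // assumed value, not measured; tune as needed

        long linesWritten = 0;

        // Both files are opened exactly once for the whole copy.
        using (var reader = new StreamReader(inputPath, Encoding.UTF8, true, BufferSize))
        using (var writer = new StreamWriter(outputPath, false, Encoding.UTF8, BufferSize))
        {
            // Read and discard the first line; null means the input is empty.
            if (reader.ReadLine() == null)
                return 0;

            string line;
            while ((line = reader.ReadLine()) != null)
            {
                writer.WriteLine(line);
                linesWritten++;
            }
        }

        return linesWritten;
    }
}
```

A call such as FirstLineSkipper.CopyWithoutFirstLine(strFileName, strFullOutputFileName) would replace the read loop in ProcessTextFiles. Note that ReadLine strips the line terminator and WriteLine appends Environment.NewLine, so line endings in the output are normalized to the platform default.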



Answer 2:


Depending on how much memory your machine has, you could try the following. (My big file was "D:\savegrp.log"; I had a 2 GB file knocking about.) This used about 6 GB of memory when I tried it:

int counter = File.ReadAllLines(@"D:\savegrp.log").Length;
Console.WriteLine(counter);

It does depend on the memory available. To save the file without its first line:

// Skip is a LINQ extension method, so this needs: using System.Linq;
File.WriteAllLines(@"D:\savegrp2.log", File.ReadAllLines(@"D:\savegrp.log").Skip(1));
Console.WriteLine("file saved");
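
If memory is the constraint, a streaming variant of the same idea may help. This sketch assumes .NET Framework 4.0 or later, where File.ReadLines is available. Unlike File.ReadAllLines, it returns a lazily evaluated sequence, so lines are read as they are consumed instead of all being held in memory at once. The class and method names (LazyLineTools, CountLines, SaveWithoutFirstLine) are hypothetical and were not part of the original answer.

```csharp
using System.IO;
using System.Linq;

public static class LazyLineTools
{
    // Counts lines without loading the whole file into an array.
    public static long CountLines(string path)
    {
        return File.ReadLines(path).LongCount();
    }

    // Writes every line except the first to outputPath.
    // inputPath and outputPath must be different files, because the
    // input is still being read while the output is written.
    public static void SaveWithoutFirstLine(string inputPath, string outputPath)
    {
        // Skip(1) drops the first line; WriteAllLines enumerates the
        // lazy sequence and writes each line as it is produced.
        File.WriteAllLines(outputPath, File.ReadLines(inputPath).Skip(1));
    }
}
```

For example, LazyLineTools.SaveWithoutFirstLine(@"D:\savegrp.log", @"D:\savegrp2.log") would do the same job as the one-liner above. I have not measured how its memory use or speed compares on a 2 GB file.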


Source: https://stackoverflow.com/questions/37725050/reading-and-writing-very-large-text-files-in-c-sharp
