Read Big TXT File, Out of Memory Exception

孤独总比滥情好 2020-11-29 12:19

I want to read a big TXT file (its size is 500 MB). First I used:

var file = new StreamReader(_filePath).ReadToEnd();
var lines = file.Split(new[] { '\n' });


        
4 Answers
  • 2020-11-29 12:28

    The cause of the exception seems to be the growing _lines collection, not the reading of the big file itself. You are reading each line and adding it to the _lines collection, which keeps taking memory and eventually causes the out-of-memory exception. You can apply filters so that only the required lines are put into the _lines collection.
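
    As a minimal sketch of that filtering idea (the class name FilteredLoad, the method LoadMatching, and the "ERROR" prefix are hypothetical and only serve as illustration), the lines can be filtered while they are read, so only the matching ones are ever stored:

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Linq;

    public static class FilteredLoad
    {
        // Reads the file lazily and keeps only the lines accepted by the predicate.
        // Rejected lines are never added to the list, so they can be collected right away.
        public static List<string> LoadMatching(string path, Func<string, bool> isRequired)
        {
            return File.ReadLines(path).Where(isRequired).ToList();
        }

        public static void Main()
        {
            // Small temporary file standing in for the real 500 MB input.
            string path = Path.GetTempFileName();
            File.WriteAllLines(path, new[] { "INFO start", "ERROR disk full", "INFO done" });

            // Assumed filter: keep only lines that start with "ERROR".
            List<string> lines = LoadMatching(
                path, l => l.StartsWith("ERROR", StringComparison.Ordinal));

            Console.WriteLine(lines.Count); // prints 1
            File.Delete(path);
        }
    }
    ```

    This only helps if the filter rejects most of the file; if nearly every line is required, the collection will still grow large.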

  • 2020-11-29 12:32

    Edit:

    Loading the whole file into memory causes objects to grow, and .NET will throw an OutOfMemoryException if it cannot allocate enough contiguous memory for an object.

    The answer is still the same: you need to stream the file, not read its entire contents. That may require rearchitecting your application; however, using IEnumerable<> methods you can stack up business processes in different areas of the application and defer processing.
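
    A rough sketch of what stacking IEnumerable<> stages could look like, assuming a comma-separated file whose second field is an amount (the class StreamingPipeline, both stage methods, and the file layout are assumptions made up for illustration, not part of the question):

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Globalization;
    using System.IO;
    using System.Linq;

    public static class StreamingPipeline
    {
        // Stage 1 (hypothetical): split each non-empty line into fields.
        // Nothing is read until a consumer starts enumerating.
        public static IEnumerable<string[]> ParseRecords(IEnumerable<string> lines)
        {
            foreach (string line in lines)
            {
                if (line.Length == 0) continue;
                yield return line.Split(',');
            }
        }

        // Stage 2 (hypothetical business rule): keep only positive amounts
        // found in the second field.
        public static IEnumerable<decimal> PositiveAmounts(IEnumerable<string[]> records)
        {
            foreach (string[] fields in records)
            {
                if (fields.Length > 1
                    && decimal.TryParse(fields[1], NumberStyles.Number,
                                        CultureInfo.InvariantCulture, out decimal amount)
                    && amount > 0)
                {
                    yield return amount;
                }
            }
        }

        // The stages are chained; Sum() pulls one line at a time through all of them.
        public static decimal TotalPositive(string path)
        {
            return PositiveAmounts(ParseRecords(File.ReadLines(path))).Sum();
        }

        public static void Main()
        {
            string path = Path.GetTempFileName();
            File.WriteAllLines(path, new[] { "a,10", "", "b,-3", "c,2.5" });
            Console.WriteLine(TotalPositive(path).ToString(CultureInfo.InvariantCulture));
            File.Delete(path);
        }
    }
    ```

    Each stage can live in a different part of the application, and only the line currently being processed is held in memory at any point.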


    A "powerful" machine with 8 GB of RAM isn't going to be able to store a 500 GB file in memory, as 500 is bigger than 8. (Plus you don't get the full 8 GB: the operating system holds some of it, you can't allocate all memory in .NET, a 32-bit process has a 2 GB limit, opening the file and storing the lines holds the data twice, there is a per-object size overhead...)

    You can't load the whole thing into memory to process it; you will have to stream the file through your processing.

  • 2020-11-29 12:41

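    You have to count the lines first. It is slower because the file is read twice, but the array is allocated once at its exact size, and with an int index you can read up to 2,147,483,647 lines (memory permitting).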

    int intNoOfLines = 0;
    // First pass: count the lines so the array can be allocated at its final size.
    using (StreamReader oReader = new StreamReader(MyFilePath))
    {
        while (oReader.ReadLine() != null) intNoOfLines++;
    }
    string[] strArrLines = new string[intNoOfLines];
    int intIndex = 0;
    // Second pass: read the file again and store each line in the array.
    using (StreamReader oReader = new StreamReader(MyFilePath))
    {
        string strLine;
        while ((strLine = oReader.ReadLine()) != null)
        {
            strArrLines[intIndex++] = strLine;
        }
    }
    
  • 2020-11-29 12:49

    Just use File.ReadLines, which returns an IEnumerable<string> and doesn't load all the lines into memory at once.

    foreach (var line in File.ReadLines(_filePath))
    {
        // Don't put "line" into a list or collection.
        // Just do your processing on it.
    }
    