How best to read a File into List

盖世英雄少女心 2020-11-27 17:34

I am using a list to limit the file size, since the target is limited in disk space and RAM. This is what I am doing now, but is there a more efficient way?

readonly         


        
10 Answers
  • 2020-11-27 17:47
    string inLine = reader.ReadToEnd();
    myList = inLine.Split(new string[] { "\r\n" }, StringSplitOptions.None).ToList();
    

    I have also tried splitting on Environment.NewLine.ToCharArray(), but found that it didn't work on a couple of files that did end in \r\n. Try either one and I hope it works well for you.
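
    Splitting on a char array treats `\r` and `\n` as two independent separators, so with `StringSplitOptions.None` every `\r\n` pair produces an extra empty entry, while splitting on `"\r\n"` alone leaves Unix-style files as a single long line. A minimal sketch that accepts both conventions, assuming the whole text already fits in memory (the `LineSplitter` class and `SplitLines` method are names made up for illustration):

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    public static class LineSplitter
    {
        // Hypothetical helper: splits text on "\r\n", "\n", or a lone "\r".
        // "\r\n" is listed first so that a Windows line ending is consumed
        // as one separator instead of two.
        public static List<string> SplitLines(string text)
        {
            return text
                .Split(new[] { "\r\n", "\n", "\r" }, StringSplitOptions.None)
                .ToList();
        }
    }
    ```

    Note that text ending in a newline yields a trailing empty entry, which may need to be dropped.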

  • 2020-11-27 17:51

    A small update to Evan Mulawski's answer to make it shorter:

    List<string> allLinesText = File.ReadAllLines(fileName).ToList();

  • 2020-11-27 17:55

    [Edit]

    If you are doing this to trim the beginning of a log file, you can avoid loading the entire file by doing something like this:

    // count the number of lines in the file
    int count = 0;
    using (var sr = new StreamReader("file.txt"))
    {
        while (sr.ReadLine() != null) 
            count++;
    }
    
    // skip the first (count - LOG_MAX) lines so that only the last LOG_MAX remain
    count = count - LOG_MAX;
    using (var sr = new StreamReader("file.txt"))
    using (var sw = new StreamWriter("output.txt"))
    {
        // skip several lines
        while (count > 0 && sr.ReadLine() != null) 
            count--;
    
        // continue copying
        string line = "";
        while ((line = sr.ReadLine()) != null)
            sw.WriteLine(line);
    }
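
    The two-pass approach above reads the file twice but holds only one line at a time. If keeping the last LOG_MAX lines in memory is acceptable, a single pass with a bounded queue is a possible alternative. This is only a sketch; `LogTail`, `TailLines`, and `maxLines` are illustrative names, not part of the original answer:

    ```csharp
    using System.Collections.Generic;
    using System.IO;

    public static class LogTail
    {
        // Returns the last maxLines lines of the file, reading it once.
        // Assumes maxLines >= 0.
        public static Queue<string> TailLines(string path, int maxLines)
        {
            // One extra slot, because the queue briefly holds maxLines + 1 items.
            var tail = new Queue<string>(maxLines + 1);
            using (var sr = new StreamReader(path))
            {
                string line;
                while ((line = sr.ReadLine()) != null)
                {
                    tail.Enqueue(line);
                    if (tail.Count > maxLines)
                        tail.Dequeue(); // drop the oldest line
                }
            }
            return tail;
        }
    }
    ```

    The result could then be written out with, for example, File.WriteAllLines("output.txt", LogTail.TailLines("file.txt", LOG_MAX)).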
    

    First of all, since File.ReadAllLines loads the entire file into a string array (string[]), copying to a list is redundant.

    Second, you must understand that a List is implemented using a dynamic array under the hood. This means that the CLR will need to allocate and copy several arrays until it can accommodate the entire file. Since the file is already on disk, you might consider trading speed for memory and working on disk data directly, or processing it in smaller chunks.

    1. If you need to load it entirely in memory, at least try to leave it in an array:

       string[] lines = File.ReadAllLines("file.txt");
      
    2. If it really needs to be a List, load lines one by one:

       List<string> lines = new List<string>();
       using (var sr = new StreamReader("file.txt"))
       {
            while (sr.Peek() >= 0)
                lines.Add(sr.ReadLine());
       }
      

      Note: List<T> has a constructor which accepts a capacity parameter. If you know the number of lines in advance, you can prevent multiple allocations by preallocating the internal array:

       List<string> lines = new List<string>(NUMBER_OF_LINES);
      
    3. Even better, avoid storing the entire file in memory and process it "on the fly":

       using (var sr = new StreamReader("file.txt"))
       {
            string line;
            while ((line = sr.ReadLine()) != null) 
            {
                // process the file line by line
            }
       }
      
  • 2020-11-27 17:56
    string inLine = reader.ReadToEnd();
    myList = inLine.Split(new string[] { "\r\n" }, StringSplitOptions.None).ToList();
    

    This answer misses the original point, which was that they were getting an OutOfMemory error. If you proceed with the above version, you are sure to hit it if your system does not have enough contiguous RAM available to load the file.

    You simply must break it into parts and store them as either a List or a String[].
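
    One way to read "break it into parts" is to process the file in fixed-size batches of lines, so that only one batch is alive at a time. A rough sketch follows; `LineBatcher`, `ReadBatches`, and `batchSize` are hypothetical names, and a suitable batch size depends on the memory available on the target:

    ```csharp
    using System.Collections.Generic;
    using System.IO;

    public static class LineBatcher
    {
        // Lazily yields lists of at most batchSize lines each.
        // Assumes batchSize >= 1.
        public static IEnumerable<List<string>> ReadBatches(string path, int batchSize)
        {
            var batch = new List<string>(batchSize);
            foreach (string line in File.ReadLines(path))
            {
                batch.Add(line);
                if (batch.Count == batchSize)
                {
                    yield return batch;
                    batch = new List<string>(batchSize); // start a fresh batch
                }
            }
            if (batch.Count > 0)
                yield return batch; // final, possibly shorter batch
        }
    }
    ```

    Each batch can be processed and then released before the next one is read, for example in a foreach loop over LineBatcher.ReadBatches("file.txt", 1000).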

  • 2020-11-27 17:57

    Why not use a generator instead?

    private IEnumerable<string> ReadLogLines(string logPath) {
        using(StreamReader reader = File.OpenText(logPath)) {
            string line = "";
            while((line = reader.ReadLine()) != null) {
                yield return line;
            }
        }
    }
    

    Then you can use it like you would use the list:

    var logFile = ReadLogLines(LOG_PATH);
    foreach(var s in logFile) {
        // Do whatever you need
    }
    

    Of course, if you need to have a List<string>, then you will need to keep the entire file contents in memory. There's really no way around that.

  • 2020-11-27 17:57

    Don't store it if possible. Just read through it if you are memory constrained. You can use a StreamReader:

    using (var reader = new StreamReader("file.txt"))
    {
        string line;
        // ReadLine returns null at the end of the file
        while ((line = reader.ReadLine()) != null)
        {
            // process line here
        }
    }
    

    This can be wrapped in a method which yields strings per line read if you want to use LINQ.
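
    Since .NET Framework 4, File.ReadLines already provides such a lazy, line-by-line sequence, so a hand-written wrapper is often unnecessary. A small sketch that counts matching lines without loading the whole file; the `LogQueries` class and `CountContaining` method are placeholder names for illustration:

    ```csharp
    using System.IO;
    using System.Linq;

    public static class LogQueries
    {
        // Streams the file line by line and counts the lines
        // that contain the given text.
        public static int CountContaining(string path, string text)
        {
            return File.ReadLines(path)
                       .Count(line => line.Contains(text));
        }
    }
    ```

    Unlike File.ReadAllLines, which returns only after the whole file has been read into an array, File.ReadLines lets enumeration start before the file has been fully read.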
