Split large file into smaller files by number of lines in C#?

一向 2021-02-06 18:58

I am trying to figure out how to split a file by the number of lines in each file. The files are CSV and I can't do it by bytes; I need to do it by lines. 20k seems to be a goo

  • 2021-02-06 19:19

    I'd do it like this:

    // helper method to break up into blocks lazily
    // (extension methods must be declared in a static class; needs System.Linq for Any())
    public static IEnumerable<ICollection<T>> SplitEnumerable<T>
        (this IEnumerable<T> Sequence, int NbrPerBlock)
    {
        List<T> Group = new List<T>(NbrPerBlock);
        foreach (T value in Sequence)
        {
            Group.Add(value);
            if (Group.Count == NbrPerBlock)
            {
                yield return Group;
                Group = new List<T>(NbrPerBlock);
            }
        }
        if (Group.Any()) yield return Group; // flush out any remaining
    }

    // now it's trivial; if you want to make smaller files, just foreach
    // over this and write out the lines in each block to a new file
    public static IEnumerable<ICollection<string>> SplitFile(string filePath)
    {
        return File.ReadLines(filePath).SplitEnumerable(20000);
    }

    Is that not sufficient for you? You mention moving from position to position, but I don't see why that's necessary.
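
    The "foreach over this and write out the lines in each block" step could look roughly like the sketch below. The class name CsvChunkWriter, the method WriteChunks, and the part_0001.csv naming scheme are all hypothetical, and the batching is inlined so the example stands on its own rather than depending on the helper above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class CsvChunkWriter
{
    // Hypothetical helper: copies sourcePath into outputDir as part_0001.csv,
    // part_0002.csv, ... with at most linesPerFile lines each.
    // Returns the number of files written.
    public static int WriteChunks(string sourcePath, string outputDir, int linesPerFile = 20000)
    {
        if (linesPerFile <= 0) throw new ArgumentOutOfRangeException(nameof(linesPerFile));
        Directory.CreateDirectory(outputDir);

        int fileCount = 0;
        var block = new List<string>(linesPerFile);

        // File.ReadLines streams the input, so only one block is held in memory.
        foreach (string line in File.ReadLines(sourcePath))
        {
            block.Add(line);
            if (block.Count == linesPerFile)
            {
                Flush(block, outputDir, ++fileCount);
            }
        }

        // Write whatever is left over as a final, shorter file.
        if (block.Count > 0) Flush(block, outputDir, ++fileCount);

        return fileCount;
    }

    private static void Flush(List<string> block, string outputDir, int fileNumber)
    {
        string path = Path.Combine(outputDir, $"part_{fileNumber:D4}.csv");
        File.WriteAllLines(path, block);
        block.Clear();
    }
}
```

    Calling CsvChunkWriter.WriteChunks("myfile.csv", "chunks") would then produce one file per 20000 lines under a "chunks" directory; both paths here are placeholders.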

  • 2021-02-06 19:29
    using (System.IO.StreamReader sr = new System.IO.StreamReader("path"))
    {
        int fileNumber = 0;
        while (!sr.EndOfStream)
        {
            int count = 0;
            using (System.IO.StreamWriter sw = new System.IO.StreamWriter("other path" + ++fileNumber))
            {
                sw.AutoFlush = true;
                while (!sr.EndOfStream && count++ < 20000) // count++ so each file gets 20000 lines, not 19999
                    sw.WriteLine(sr.ReadLine());
            }
        }
    }
  • 2021-02-06 19:32
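    int index = 0;
    // Integer division puts lines 0..19999 in group 0, the next 20000 in group 1, and so on.
    var groups = from line in File.ReadLines("myfile.csv")
                 group line by index++ / 20000 into g
                 select g.AsEnumerable();
    int file = 0;
    // Each group goes to a file named after its running number: "0", "1", ...
    foreach (var group in groups)
        File.WriteAllLines((file++).ToString(), group.ToArray());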
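
    To see how the integer-division key assigns lines to blocks, here is a small sketch with a block size of 3 instead of 20000. GroupKeyDemo and BlockSizes are hypothetical names, and the indexed Select overload stands in for the captured index++ counter used above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupKeyDemo
{
    // Returns the size of each block produced by grouping on index / blockSize.
    public static List<int> BlockSizes(IEnumerable<string> lines, int blockSize)
    {
        return lines
            .Select((line, index) => new { line, index })
            // Integer division: indices 0..blockSize-1 share key 0, the next run key 1, etc.
            .GroupBy(x => x.index / blockSize, x => x.line)
            .Select(g => g.Count())
            .ToList();
    }
}
```

    For seven input lines and a block size of 3, BlockSizes returns 3, 3, 1. Note that GroupBy has to read its whole input before it can hand back the first group, so this style keeps the entire file in memory, unlike the streaming approaches in the earlier answers.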
