Split large file into smaller files by number of lines in C#?

一向 2021-02-06 18:58

I am trying to figure out how to split a file by the number of lines in each file. The files are CSV and I can't do it by bytes. I need to do it by lines. 20k seems to be a goo

3 Answers
  • 2021-02-06 19:19

    I'd do it like this:

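    // helper method to break up into blocks lazily
    // (it is called as an extension method below, so it needs the `this`
    // modifier and must live in a static class; requires
    // using System.Collections.Generic, System.IO and System.Linq)
    
    public static IEnumerable<ICollection<T>> SplitEnumerable<T>
        (this IEnumerable<T> Sequence, int NbrPerBlock)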
    {
        List<T> Group = new List<T>(NbrPerBlock);
    
        foreach (T value in Sequence)
        {
            Group.Add(value);
    
            if (Group.Count == NbrPerBlock)
            {
                yield return Group;
                Group = new List<T>(NbrPerBlock);
            }
        }
    
        if (Group.Any()) yield return Group; // flush out any remaining
    }
    
    // now it's trivial; if you want to make smaller files, just foreach
    // over this and write out the lines in each block to a new file
    
    public static IEnumerable<ICollection<string>> SplitFile(string filePath)
    {
        return File.ReadLines(filePath).SplitEnumerable(20000);
    }
    

    Is that not sufficient for you? You mention moving from position to position, but I don't see why that's necessary.
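
    The "just foreach over this" step described in the comments could look roughly like the sketch below. The `FileSplitter` class, the `WriteChunks` method, the `outputDir` parameter and the `part_0001.csv` naming scheme are illustrative assumptions, not part of the answer; the batching helper is repeated only so that the sketch compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class FileSplitter
{
    // Same lazy batching idea as the helper above, repeated for self-containment.
    public static IEnumerable<List<T>> SplitEnumerable<T>(
        this IEnumerable<T> sequence, int perBlock)
    {
        var block = new List<T>(perBlock);
        foreach (T value in sequence)
        {
            block.Add(value);
            if (block.Count == perBlock)
            {
                yield return block;
                block = new List<T>(perBlock);
            }
        }
        if (block.Count > 0) yield return block; // flush the last partial block
    }

    // Hypothetical helper: writes each block to outputDir/part_0001.csv,
    // part_0002.csv, ... and returns the number of files written.
    public static int WriteChunks(string inputPath, string outputDir,
                                  int linesPerFile = 20000)
    {
        if (linesPerFile <= 0)
            throw new ArgumentOutOfRangeException(nameof(linesPerFile));

        Directory.CreateDirectory(outputDir);

        int fileNumber = 0;
        // File.ReadLines streams the input, so only one block is held in memory.
        foreach (List<string> block in
                 File.ReadLines(inputPath).SplitEnumerable(linesPerFile))
        {
            fileNumber++;
            string outPath = Path.Combine(outputDir, $"part_{fileNumber:D4}.csv");
            File.WriteAllLines(outPath, block);
        }
        return fileNumber;
    }
}
```

    Under these assumptions, a call such as `FileSplitter.WriteChunks("big.csv", "chunks")` (both names are placeholders) would write files of at most 20000 lines each into the `chunks` directory. Note that this sketch does not repeat a CSV header row in each output file.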

  • 2021-02-06 19:29
    using (System.IO.StreamReader sr = new System.IO.StreamReader("path"))
    {
        int fileNumber = 0;
    
        while (!sr.EndOfStream)
        {
            int count = 0;
    
            using (System.IO.StreamWriter sw = new System.IO.StreamWriter("other path" + ++fileNumber))
            {
                sw.AutoFlush = true;
    
                // count++ (not ++count) so that each file receives a full 20000 lines
                while (!sr.EndOfStream && count++ < 20000)
                {
                    sw.WriteLine(sr.ReadLine());
                }
            }
        }
    }
    
  • 2021-02-06 19:32
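    // note: GroupBy reads the whole input into memory before it yields the
    // first group, so this only suits files that fit comfortably in RAM
    int index = 0;
    var groups = from line in File.ReadLines("myfile.csv")
                 group line by index++ / 20000 into g
                 select g.AsEnumerable();
    int file = 0;
    foreach (var group in groups)
        File.WriteAllLines((file++).ToString(), group);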
    