C# Merging Two or more Text Files side by side

∥☆過路亽.° submitted on 2021-02-16 14:50:24

Question


using (StreamWriter writer = File.CreateText(FinishedFile))
{
    int lineNum = 0;
    while (lineNum < FilesLineCount.Min())
    {
        for (int i = 0; i <= FilesToMerge.Count() - 1; i++)
        {
            if (i != FilesToMerge.Count() - 1)
            {
                var CurrentFile = File.ReadLines(FilesToMerge[i]).Skip(lineNum).Take(1);
                string CurrentLine = string.Join("", CurrentFile);
                writer.Write(CurrentLine + ",");
            }
            else
            {
                var CurrentFile = File.ReadLines(FilesToMerge[i]).Skip(lineNum).Take(1);
                string CurrentLine = string.Join("", CurrentFile);
                writer.Write(CurrentLine + "\n");
            }
        }
        lineNum++;
    }
}

The way I am currently doing this is just too slow. I am merging files that are each 50k+ lines long, with varying amounts of data per line.

For example: File 1
1
2
3
4

File 2
4
3
2
1

I need these merged line by line into a third file:
File 3
1,4
2,3
3,2
4,1

P.S. The user can pick as many files as they want from any locations.
Thanks for the help.


Answer 1:


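Your approach is slow because of the Skip and Take in the loops: every File.ReadLines(...).Skip(lineNum).Take(1) call reopens the file and reads it from the first line again, so the total work grows quadratically with the number of lines.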
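
To make that cost visible, here is a small sketch that replaces File.ReadLines with a counting iterator over an in-memory array. The names SkipTakeCost, CountingLines and CountReadsForRowWiseAccess are made up for this illustration and do not come from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SkipTakeCost
{
    // Number of lines the stand-in "file" has produced so far.
    public static int LinesProduced;

    // Hypothetical stand-in for File.ReadLines: yields lines lazily and counts them.
    public static IEnumerable<string> CountingLines(string[] lines)
    {
        foreach (string line in lines)
        {
            LinesProduced++;
            yield return line;
        }
    }

    // Repeats the question's access pattern: restart, skip lineNum lines, take one.
    public static int CountReadsForRowWiseAccess(string[] lines)
    {
        LinesProduced = 0;
        for (int lineNum = 0; lineNum < lines.Length; lineNum++)
        {
            CountingLines(lines).Skip(lineNum).Take(1).ToList();
        }
        return LinesProduced;
    }
}
```

Fetching row lineNum this way has to walk past all earlier rows first, so for n lines the counter ends up at roughly n * (n + 1) / 2 instead of n. With 50k lines per file that is over a billion line reads per file.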

You could use a dictionary to collect, for each line index, the lines found at that index in all files:

// Requires: using System.Collections.Generic; using System.IO; using System.Linq;
string[] allFileLocationsToMerge = { "filepath1", "filepath2", "..." };
// Maps a line index to the lines found at that index, in file order.
var mergedLists = new Dictionary<int, List<string>>();
foreach (string file in allFileLocationsToMerge)
{
    // Each file is read exactly once.
    string[] allLines = File.ReadAllLines(file);
    for (int lineIndex = 0; lineIndex < allLines.Length; lineIndex++)
    {
        if (!mergedLists.TryGetValue(lineIndex, out List<string> allLinesAtIndex))
        {
            allLinesAtIndex = new List<string>();
            mergedLists[lineIndex] = allLinesAtIndex;
        }
        allLinesAtIndex.Add(allLines[lineIndex]);
    }
}

// Dictionary enumeration order is not guaranteed, so order by line index explicitly.
IEnumerable<string> mergeLines = mergedLists
    .OrderBy(pair => pair.Key)
    .Select(pair => string.Join(",", pair.Value));
File.WriteAllLines("targetPath", mergeLines);
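
Note that the question's loop stops at the shortest file (FilesLineCount.Min()), while the dictionary version keeps going until the longest file ends. If the original cut-off is what you want, a minimal sketch could look like the following. The class and method names (SideBySideMerge, MergeToShortest) are hypothetical, and it assumes all files fit in memory at once.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class SideBySideMerge
{
    // Hypothetical helper: merges rows up to the length of the shortest input file.
    public static void MergeToShortest(string targetPath, IReadOnlyList<string> sourcePaths)
    {
        if (sourcePaths.Count == 0)
        {
            // Nothing to merge: write an empty output file.
            File.WriteAllLines(targetPath, new string[0]);
            return;
        }

        // Read every file exactly once; columns[i] holds all lines of file i.
        string[][] columns = sourcePaths.Select(path => File.ReadAllLines(path)).ToArray();
        int rowCount = columns.Min(column => column.Length);

        // Build each output row by joining the same line index from every file.
        IEnumerable<string> rows = Enumerable.Range(0, rowCount)
            .Select(row => string.Join(",", columns.Select(column => column[row])));
        File.WriteAllLines(targetPath, rows);
    }
}
```

Called as SideBySideMerge.MergeToShortest("targetPath", FilesToMerge), this should produce the 1,4 / 2,3 / 3,2 / 4,1 output from the question's example.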



Answer 2:


Here's another approach. This implementation holds only one line from each file in memory at a time, which reduces memory pressure significantly (if that is an issue).

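// Requires: using System.IO; using System.Linq; using System.Text;
public static void MergeFiles(string output, params string[] inputs)
{
    // One lazy line enumerator per input file; File.ReadLines streams rather than loading the whole file.
    var files = inputs.Select(File.ReadLines).Select(iter => iter.GetEnumerator()).ToArray();
    StringBuilder line = new StringBuilder();
    bool any;

    try
    {
        using (var outFile = File.CreateText(output))
        {
            do
            {
                line.Clear();
                any = false;

                foreach (var iter in files)
                {
                    // Skip files that have already run out of lines.
                    if (!iter.MoveNext())
                        continue;

                    // 'any' is true once this row has a field, even if that field was an empty line.
                    if (any)
                        line.Append(", ");

                    line.Append(iter.Current);
                    any = true;
                }

                if (any)
                    outFile.WriteLine(line.ToString());
            }
            while (any);
        }
    }
    finally
    {
        // Release the input file handles even if reading or writing throws.
        foreach (var iter in files)
        {
            iter.Dispose();
        }
    }
}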

This also handles files of different lengths: once a file runs out of lines it simply stops contributing, so later rows contain fewer fields.
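
If the columns need to stay aligned when the files differ in length, one option is to write an empty field for every file that has already ended. The following is only a sketch of that variation; AlignedMerge and MergeFilesAligned are hypothetical names, and it uses "," as the separator to match the output format in the question.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class AlignedMerge
{
    // Hypothetical variant: streams all inputs and pads exhausted files with empty fields.
    public static void MergeFilesAligned(string output, params string[] inputs)
    {
        var readers = new List<IEnumerator<string>>();
        try
        {
            foreach (string path in inputs)
                readers.Add(File.ReadLines(path).GetEnumerator());

            var fields = new string[readers.Count];
            var exhausted = new bool[readers.Count];

            using (StreamWriter outFile = File.CreateText(output))
            {
                while (true)
                {
                    bool any = false;
                    for (int i = 0; i < readers.Count; i++)
                    {
                        if (!exhausted[i] && readers[i].MoveNext())
                        {
                            fields[i] = readers[i].Current;
                            any = true;
                        }
                        else
                        {
                            // This file has ended: keep its column, but leave it empty.
                            exhausted[i] = true;
                            fields[i] = string.Empty;
                        }
                    }

                    // Stop once no file produced a line in this round.
                    if (!any)
                        break;

                    outFile.WriteLine(string.Join(",", fields));
                }
            }
        }
        finally
        {
            // Release the input file handles even if something throws.
            foreach (IEnumerator<string> reader in readers)
                reader.Dispose();
        }
    }
}
```

For a three-line file (1, 2, 3) merged with a one-line file (x), this writes the rows 1,x then 2, then 3, so the second column stays in place even after the shorter file ends.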



Source: https://stackoverflow.com/questions/50585712/c-sharp-merging-two-or-more-text-files-side-by-side
