Reading a CSV file containing double quotes with the LumenWorks CSV reader

两盒软妹~` submitted on 2019-12-12 12:02:43

Question


I'm reading a CSV file using the LumenWorks CSV reader. Below is an example record:

"001-0000265-003"|"Some detail"|"detal1"|"detail2"|"detal3"|"detail4"|"detail5"|"detail6"

I've created a class that reads this file using the constructor below:

using (var input = new CsvReader(stream, true, '|'))
{
    // logic to create the XML here
}

This works fine when there are no double quotes inside the details. But in scenarios like this:

"001-0000265-003"|"Some " detail"|"detal1"|"detail2"|"detal3"|"detail4"|"detail5"|"detail6"

the reader throws an exception

An unhandled exception of type 'LumenWorks.Framework.IO.Csv.MalformedCsvException' occurred in LumenWorks.Framework.IO.dll

So then I used the CsvReader constructor which takes 7 arguments,

using (var input = new CsvReader(stream, true, '|', '"', '"', '#', LumenWorks.Framework.IO.Csv.ValueTrimmingOptions.All))

But I still get the same error. Any suggestions would be appreciated.

Some of the fields I'm reading are more complex, for example:

"001-0000265-003"|"ABC 33"X23" CDE 32'X33" AAA, BB'C"|"detal1"|"detail2"|"detal3"|"detail4"|"detail5"|"detail6"

Answer 1:


I've tested it with your sample data, and it's pretty difficult to repair this malformed line (for example, from the catch block). So I would not use a quoting character at all; instead, just use the pipe delimiter and remove the surrounding " afterwards via csv[i].Trim('"').

Here's a method that parses the file and returns all lines' fields:

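using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using LumenWorks.Framework.IO.Csv;

private static List<List<string>> GetAllLineFields(string fullPath)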
{
    List<List<string>> allLineFields = new List<List<string>>();
    var fileInfo = new System.IO.FileInfo(fullPath);

    using (var reader = new System.IO.StreamReader(fileInfo.FullName, Encoding.Default))
    {
        Char quotingCharacter = '\0'; // no quoting-character;
        Char escapeCharacter = quotingCharacter;
        Char delimiter = '|';
        using (var csv = new CsvReader(reader, true, delimiter, quotingCharacter, escapeCharacter, '\0', ValueTrimmingOptions.All))
        {
            csv.DefaultParseErrorAction = ParseErrorAction.ThrowException;
            //csv.ParseError += csv_ParseError;  // if you want to handle it somewhere else
            csv.SkipEmptyLines = true;

            while (csv.ReadNextRecord())
            {
                List<string> fields = new List<string>(csv.FieldCount);
                for (int i = 0; i < csv.FieldCount; i++)
                {
                    try
                    {
                        string field = csv[i];
                        fields.Add(field.Trim('"'));
                    }
                    catch (MalformedCsvException)
                    {
                        // Log here; this should no longer occur once quoting is disabled.
                        throw;
                    }
                }
                allLineFields.Add(fields);
            }
        }
    }
    return allLineFields;
}

Test and output with a file that contains your sample data:

List<List<string>> allLineFields = GetAllLineFields(@"C:\Temp\Test\CsvFile.csv");
foreach (List<string> lineFields in allLineFields)
    Console.WriteLine(string.Join(",", lineFields.Select(s => string.Format("[{0}]", s))));

[001-0000265-003],[Some detail],[detal1],[detail2],[detal3],[detail4],[detail5],[detail6]
[001-0000265-003],[Some " detail],[detal1],[detail2],[detal3],[detail4],[detail5],[detail6]
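
To see why the split-then-trim idea also copes with the more complex field from the question, here is a minimal sketch that does not use LumenWorks at all. It relies only on string.Split and string.Trim from the .NET base library. The class and method names (PipeLineSplitter, SplitRecord) are made up for illustration, and the sketch assumes that a field never contains the | delimiter or a line break.

```csharp
using System;
using System.Linq;

// Hypothetical helper for illustration only; not part of LumenWorks.
static class PipeLineSplitter
{
    // Splits one record on '|' and strips the quotes that wrap each field.
    // Assumption: no field contains '|' or a newline.
    public static string[] SplitRecord(string line)
    {
        return line.Split('|')
                   // Trim surrounding whitespace first, then any leading or
                   // trailing double quotes. Quotes inside the value are kept.
                   .Select(field => field.Trim().Trim('"'))
                   .ToArray();
    }
}
```

Passing the complex record from the question to SplitRecord yields ABC 33"X23" CDE 32'X33" AAA, BB'C as the second field, with the inner quotes left intact. One caveat: Trim('"') removes every leading and trailing quote, so a value that itself ends with a quote (for example an inch mark at the very end of a field) would lose it.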


Source: https://stackoverflow.com/questions/26381067/reading-csv-having-double-quotes-with-lumenwork-csv-reader
