How to read a CSV file into a .NET Datatable

野性不改 2020-11-22 05:12

How can I load a CSV file into a System.Data.DataTable, creating the datatable based on the CSV file?

Does the regular ADO.NET functionality allow this?

22 answers
  • 2020-11-22 06:02

    I recently wrote a CSV parser for .NET, Sylvan.Data.Csv, which I claim is currently the fastest one available as a NuGet package.

    Using this library to load a DataTable is extremely easy.

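    using System.Data;
    using System.IO;
    using Sylvan.Data.Csv;

    // Open the file, wrap it in a CSV data reader, and load it into the table.
    using var tr = File.OpenText("data.csv");
    using var dr = CsvDataReader.Create(tr);
    var dt = new DataTable();
    dt.Load(dr);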
    
    

    Assuming your file is a standard comma-separated file with headers, that's all you need. There are also options for reading files without headers, using alternate delimiters, and so on.

    It is also possible to provide a custom schema for the CSV file so that columns can be treated as something other than string values. This will allow the DataTable columns to be loaded with values that can be easier to work with, as you won't have to coerce them when you access them.

    var schema = new TypedCsvSchema();
    schema.Add(0, typeof(int));
    schema.Add(1, typeof(string));
    schema.Add(2, typeof(double?));
    schema.Add(3, typeof(DateTime));
    schema.Add(4, typeof(DateTime?));
    
    var options = new CsvDataReaderOptions { 
        Schema = schema 
    };
    
    using var tr = GetData();
    using var dr = CsvDataReader.Create(tr, options);
    
    

    TypedCsvSchema is an implementation of ICsvSchemaProvider which provides a simple way to define the types of the columns. However, it is also possible to provide a custom ICsvSchemaProvider when you want to provide more metadata, such as uniqueness or constrained column size, etc.
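
    To illustrate why typed columns are easier to work with, here is a minimal sketch that uses only System.Data and no CSV library. The column names Id and TypedId are made up for the illustration; the string column stands in for an untyped load and the int column for a schema-driven one.

```csharp
using System;
using System.Data;
using System.Globalization;

var table = new DataTable();
table.Columns.Add("Id", typeof(string));    // stands in for an untyped CSV load
table.Columns.Add("TypedId", typeof(int));  // stands in for a schema-driven load
table.Rows.Add("42", 42);

// String column: the caller has to parse the value on every access.
int fromString = int.Parse((string)table.Rows[0]["Id"], CultureInfo.InvariantCulture);

// Typed column: the stored value is already an int, so a cast is enough.
int fromTyped = (int)table.Rows[0]["TypedId"];

Console.WriteLine(fromString == fromTyped);
```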

  • 2020-11-22 06:03

    Can't resist adding my own spin to this. This is so much better and more compact than what I've used in the past.

    This solution:

    • Does not depend on a database driver or 3rd party library.
    • Will not fail on duplicate column names
    • Handles commas in the data
    • Handles any delimiter, not just commas (although that is the default)

    Here's what I came up with:

      Public Function ToDataTable(FileName As String, Optional Delimiter As String = ",") As DataTable
        ToDataTable = New DataTable
        Using TextFieldParser As New Microsoft.VisualBasic.FileIO.TextFieldParser(FileName) With
          {.HasFieldsEnclosedInQuotes = True, .TextFieldType = Microsoft.VisualBasic.FileIO.FieldType.Delimited, .TrimWhiteSpace = True}
          With TextFieldParser
            .SetDelimiters({Delimiter})
            .ReadFields.ToList.Unique.ForEach(Sub(x) ToDataTable.Columns.Add(x))
            ToDataTable.Columns.Cast(Of DataColumn).ToList.ForEach(Sub(x) x.AllowDBNull = True)
            Do Until .EndOfData
              ToDataTable.Rows.Add(.ReadFields.Select(Function(x) Text.BlankToNothing(x)).ToArray)
            Loop
          End With
        End Using
      End Function
    

    It depends on an extension method (Unique) to handle duplicate column names; that method can be found in my answer to "How to append unique numbers to a list of strings".

    And here's the BlankToNothing helper function:

      Public Function BlankToNothing(ByVal Value As String) As Object 
        If String.IsNullOrEmpty(Value) Then Return Nothing
        Return Value
      End Function
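
    The Unique extension method itself is not shown in this answer. As a rough idea of what such a helper might do, here is a minimal C# sketch; the name MakeUnique and the "_2", "_3" suffix scheme are assumptions for illustration, not the linked implementation. Names are compared case-insensitively as a conservative choice.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: makes header names unique by appending a counter
// to repeated names ("id", "id" becomes "id", "id_2").
static List<string> MakeUnique(IEnumerable<string> names)
{
    var used = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
    var result = new List<string>();
    foreach (var name in names)
    {
        var candidate = name;
        var n = 1;
        // Keep bumping the suffix until the candidate is not taken.
        while (!used.Add(candidate))
        {
            n++;
            candidate = name + "_" + n;
        }
        result.Add(candidate);
    }
    return result;
}

var headers = MakeUnique(new[] { "id", "name", "id" });
Console.WriteLine(string.Join(",", headers));
```

    The resulting names can then be passed to DataTable.Columns.Add one by one without hitting a duplicate-name error.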
    
  • 2020-11-22 06:06
        private static DataTable LoadCsvData(string refPath)
        {
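            // Older CsvHelper API; recent versions use CsvConfiguration(CultureInfo) instead of Configuration.
            var cfg = new Configuration() { Delimiter = ",", HasHeaderRecord = true };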
            var result = new DataTable();
            using (var sr = new StreamReader(refPath, Encoding.UTF8, false, 16384 * 2))
            {
                using (var rdr = new CsvReader(sr, cfg))
                using (var dataRdr = new CsvDataReader(rdr))
                {
                    result.Load(dataRdr);
                }
            }
            return result;
        }
    

    This uses CsvHelper: https://joshclose.github.io/CsvHelper/

  • 2020-11-22 06:07

    You can achieve this in C# with the TextFieldParser class from the Microsoft.VisualBasic.FileIO namespace (add a reference to the Microsoft.VisualBasic assembly).

    // Requires a reference to Microsoft.VisualBasic and these usings:
    // using System; using System.Data; using Microsoft.VisualBasic.FileIO;
    static void Main()
    {
        string csvFilePath = @"C:\Users\Administrator\Desktop\test.csv";

        DataTable csvData = GetDataTableFromCsvFile(csvFilePath);

        Console.WriteLine("Rows count:" + csvData.Rows.Count);
        Console.ReadLine();
    }

    private static DataTable GetDataTableFromCsvFile(string csvFilePath)
    {
        DataTable csvData = new DataTable();

        // Exceptions (missing file, malformed line) propagate to the caller
        // instead of being swallowed by an empty catch block.
        using (TextFieldParser csvReader = new TextFieldParser(csvFilePath))
        {
            csvReader.SetDelimiters(new string[] { "," });
            csvReader.HasFieldsEnclosedInQuotes = true;

            // The first record supplies the column names.
            string[] colFields = csvReader.ReadFields();
            foreach (string column in colFields)
            {
                DataColumn dataColumn = new DataColumn(column);
                dataColumn.AllowDBNull = true;
                csvData.Columns.Add(dataColumn);
            }

            while (!csvReader.EndOfData)
            {
                string[] fieldData = csvReader.ReadFields();
                // Replace empty values with null so the table stores them as missing values.
                for (int i = 0; i < fieldData.Length; i++)
                {
                    if (fieldData[i] == "")
                    {
                        fieldData[i] = null;
                    }
                }
                csvData.Rows.Add(fieldData);
            }
        }
        return csvData;
    }
    
  • 2020-11-22 06:08
    public class Csv
    {
        public static DataTable DataSetGet(string filename, string separatorChar, out List<string> errors)
        {
            errors = new List<string>();
            var table = new DataTable("StringLocalization");
            using (var sr = new StreamReader(filename, Encoding.Default))
            {
                string line;
                var i = 0;
                while (sr.Peek() >= 0)
                {
                    try
                    {
                        line = sr.ReadLine();
                        if (string.IsNullOrEmpty(line)) continue;
                        var values = line.Split(new[] {separatorChar}, StringSplitOptions.None);
                        var row = table.NewRow();
                        for (var colNum = 0; colNum < values.Length; colNum++)
                        {
                            var value = values[colNum];
                            if (i == 0)
                            {
                                table.Columns.Add(value, typeof (String));
                            }
                            else
                            {
                                row[table.Columns[colNum]] = value;
                            }
                        }
                        if (i != 0) table.Rows.Add(row);
                    }
                    catch(Exception ex)
                    {
                        errors.Add(ex.Message);
                    }
                    i++;
                }
            }
            return table;
        }
    }
    
  • 2020-11-22 06:08

    Use this; a single function that handles commas and quotes inside fields:

    public static DataTable CsvToDataTable(string strFilePath)
        {
    
            if (File.Exists(strFilePath))
            {
    
                string[] Lines;
                string CSVFilePathName = strFilePath;
    
                Lines = File.ReadAllLines(CSVFilePathName);
                while (Lines[0].EndsWith(","))
                {
                    Lines[0] = Lines[0].Remove(Lines[0].Length - 1);
                }
                string[] Fields;
                Fields = Lines[0].Split(new char[] { ',' });
                int Cols = Fields.GetLength(0);
                DataTable dt = new DataTable();
                //1st row must be column names; they are used as-is (no case conversion is applied).
                for (int i = 0; i < Cols; i++)
                    dt.Columns.Add(Fields[i], typeof(string));
                DataRow Row;
                int rowcount = 0;
                try
                {
                    string[] ToBeContinued = new string[]{};
                    bool lineToBeContinued = false;
                    for (int i = 1; i < Lines.GetLength(0); i++)
                    {
                        if (!Lines[i].Equals(""))
                        {
                            Fields = Lines[i].Split(new char[] { ',' });
                            string temp0 = string.Join("", Fields).Replace("\"\"", "");
                            int quaotCount0 = temp0.Count(c => c == '"');
                            if (Fields.GetLength(0) < Cols || lineToBeContinued || quaotCount0 % 2 != 0)
                            {
                                if (ToBeContinued.GetLength(0) > 0)
                                {
                                    ToBeContinued[ToBeContinued.Length - 1] += "\n" + Fields[0];
                                    Fields = Fields.Skip(1).ToArray();
                                }
                                string[] newArray = new string[ToBeContinued.Length + Fields.Length];
                                Array.Copy(ToBeContinued, newArray, ToBeContinued.Length);
                                Array.Copy(Fields, 0, newArray, ToBeContinued.Length, Fields.Length);
                                ToBeContinued = newArray;
                                string temp = string.Join("", ToBeContinued).Replace("\"\"", "");
                                int quaotCount = temp.Count(c => c == '"');
                                if (ToBeContinued.GetLength(0) >= Cols && quaotCount % 2 == 0 )
                                {
                                    Fields = ToBeContinued;
                                    ToBeContinued = new string[] { };
                                    lineToBeContinued = false;
                                }
                                else
                                {
                                    lineToBeContinued = true;
                                    continue;
                                }
                            }
    
                            //modified by Teemo @2016 09 13
                            //handle ',' and '"'
                            //Deserialize CSV following Excel's rules:
                            // 1: If there are commas in a field, the field is quoted.
                            // 2: Two consecutive quotes represent a literal quote typed by the user.
    
                            List<int> singleLeftquota = new List<int>();
                            List<int> singleRightquota = new List<int>();
    
                            //combine fields if the number of commas matches
                            if (Fields.GetLength(0) > Cols) 
                            {
                                bool lastSingleQuoteIsLeft = true;
                                for (int j = 0; j < Fields.GetLength(0); j++)
                                {
                                    bool leftOddquota = false;
                                    bool rightOddquota = false;
                                    if (Fields[j].StartsWith("\"")) 
                                    {
                                        int numberOfConsecutiveQuotes = 0;
                                        foreach (char c in Fields[j]) //start with how many "
                                        {
                                            if (c == '"')
                                            {
                                                numberOfConsecutiveQuotes++;
                                            }
                                            else
                                            {
                                                break;
                                            }
                                        }
                                        if (numberOfConsecutiveQuotes % 2 == 1)//start with odd number of quotes indicate system quote
                                        {
                                            leftOddquota = true;
                                        }
                                    }
    
                                    if (Fields[j].EndsWith("\""))
                                    {
                                        int numberOfConsecutiveQuotes = 0;
                                        for (int jj = Fields[j].Length - 1; jj >= 0; jj--)
                                        {
                                            if (Fields[j].Substring(jj,1) == "\"") // end with how many "
                                            {
                                                numberOfConsecutiveQuotes++;
                                            }
                                            else
                                            {
                                                break;
                                            }
                                        }
    
                                        if (numberOfConsecutiveQuotes % 2 == 1)//end with odd number of quotes indicate system quote
                                        {
                                            rightOddquota = true;
                                        }
                                    }
                                    if (leftOddquota && !rightOddquota)
                                    {
                                        singleLeftquota.Add(j);
                                        lastSingleQuoteIsLeft = true;
                                    }
                                    else if (!leftOddquota && rightOddquota)
                                    {
                                        singleRightquota.Add(j);
                                        lastSingleQuoteIsLeft = false;
                                    }
                                    else if (Fields[j] == "\"") //only one quota in a field
                                    {
                                        if (lastSingleQuoteIsLeft)
                                        {
                                            singleRightquota.Add(j);
                                        }
                                        else
                                        {
                                            singleLeftquota.Add(j);
                                        }
                                    }
                                }
                                if (singleLeftquota.Count == singleRightquota.Count)
                                {
                                    int insideCommas = 0;
                                    for (int indexN = 0; indexN < singleLeftquota.Count; indexN++)
                                    {
                                        insideCommas += singleRightquota[indexN] - singleLeftquota[indexN];
                                    }
                                    if (Fields.GetLength(0) - Cols >= insideCommas) //probably matched
                                    {
                                        int validFildsCount = insideCommas + Cols; //(Fields.GetLength(0) - insideCommas) may exceed Cols
                                        String[] temp = new String[validFildsCount];
                                        int totalOffSet = 0;
                                        for (int iii = 0; iii < validFildsCount - totalOffSet; iii++)
                                        {
                                            bool combine = false;
                                            int storedIndex = 0;
                                            for (int iInLeft = 0; iInLeft < singleLeftquota.Count; iInLeft++)
                                            {
                                                if (iii + totalOffSet == singleLeftquota[iInLeft])
                                                {
                                                    combine = true;
                                                    storedIndex = iInLeft;
                                                    break;
                                                }
                                            }
                                            if (combine)
                                            {
                                                int offset = singleRightquota[storedIndex] - singleLeftquota[storedIndex];
                                                for (int combineI = 0; combineI <= offset; combineI++)
                                                {
                                                    temp[iii] += Fields[iii + totalOffSet + combineI] + ",";
                                                }
                                                temp[iii] = temp[iii].Remove(temp[iii].Length - 1, 1);
                                                totalOffSet += offset;
                                            }
                                            else
                                            {
                                                temp[iii] = Fields[iii + totalOffSet];
                                            }
                                        }
                                        Fields = temp;
                                    }
                                }
                            }
                            Row = dt.NewRow();
                            for (int f = 0; f < Cols; f++)
                            {
                                Fields[f] = Fields[f].Replace("\"\"", "\""); //Two consecutive quotes indicate a user's quote
                                if (Fields[f].StartsWith("\""))
                                {
                                    if (Fields[f].EndsWith("\""))
                                    {
                                        Fields[f] = Fields[f].Remove(0, 1);
                                        if (Fields[f].Length > 0)
                                        {
                                            Fields[f] = Fields[f].Remove(Fields[f].Length - 1, 1);
                                        }
                                    }
                                }
                                Row[f] = Fields[f];
                            }
                            dt.Rows.Add(Row);
                            rowcount++;
                        }
                    }
                }
                catch (Exception ex)
                {
                    throw new Exception("row: " + (rowcount + 2) + ", " + ex.Message, ex); // keep the original exception as InnerException
                }
                //OleDbConnection connection = new OleDbConnection(string.Format(@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0}; Extended Properties=""text;HDR=Yes;FMT=Delimited"";", FilePath + FileName));
                //OleDbCommand command = new OleDbCommand("SELECT * FROM " + FileName, connection);
                //OleDbDataAdapter adapter = new OleDbDataAdapter(command);
                //DataTable dt = new DataTable();
                //adapter.Fill(dt);
                //adapter.Dispose();
                return dt;
            }
            else
                return null;
    
            //OleDbConnection connection = new OleDbConnection(string.Format(@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0}; Extended Properties=""text;HDR=Yes;FMT=Delimited"";", strFilePath));
            //OleDbCommand command = new OleDbCommand("SELECT * FROM " + strFileName, connection);
            //OleDbDataAdapter adapter = new OleDbDataAdapter(command);
            //DataTable dt = new DataTable();
            //adapter.Fill(dt);
            //return dt;
        }
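
    The two Excel rules named in the comments above (quote a field that contains commas; a doubled quote is a literal quote) can also be expressed as a small state machine. The following is a minimal sketch of that alternative technique, not a rewrite of the function above; SplitCsvRecord is a hypothetical name, and it assumes the record passed in is already complete, so a quoted field that spans several lines has to be joined into one string first.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper: splits one complete CSV record into its fields.
static List<string> SplitCsvRecord(string record, char delimiter = ',')
{
    var fields = new List<string>();
    var current = new StringBuilder();
    var inQuotes = false;

    for (var i = 0; i < record.Length; i++)
    {
        var c = record[i];
        if (inQuotes)
        {
            if (c == '"')
            {
                // Rule 2: a doubled quote inside a quoted field is a literal quote.
                if (i + 1 < record.Length && record[i + 1] == '"')
                {
                    current.Append('"');
                    i++; // skip the second quote of the pair
                }
                else
                {
                    inQuotes = false; // closing quote
                }
            }
            else
            {
                // Rule 1: delimiters inside quotes are plain data.
                current.Append(c);
            }
        }
        else if (c == '"')
        {
            inQuotes = true; // opening quote
        }
        else if (c == delimiter)
        {
            fields.Add(current.ToString());
            current.Clear();
        }
        else
        {
            current.Append(c);
        }
    }
    fields.Add(current.ToString()); // last field has no trailing delimiter
    return fields;
}

var parts = SplitCsvRecord("1,\"Smith, John\",\"He said \"\"hi\"\"\"");
foreach (var p in parts) Console.WriteLine(p);
```

    Each returned list can be handed to DataTable.Rows.Add after converting it to an array, provided the field count matches the column count.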
    