I am trying to read a CSV file with CsvHelper, load each record into a DataTable, and then use SqlBulkCopy to insert the data into a database table. With the current code, I get
I liked the answer from @JoshClose, but I found while( csv.Read() ) to be considerably slower than csv.GetRecords<{Class}>().ToList(). It also doesn't correctly handle many nullable types like int? when the value returned should be DBNull. My answer is to have CsvHelper import a list of dynamic records and then use a couple of helper methods to auto-map them to the DataTable.
// Import the CSV as dynamic records, then map each record onto a new DataTable row.
var records = csv.GetRecords<dynamic>().ToList();
foreach ( var record in records )
{
    var row = dt.NewRow();
    var recordDictionary = DynamicToDictionary( record );
    foreach ( DataColumn column in dt.Columns )
    {
        row[column.ColumnName] = GetColumnValue( column, recordDictionary );
    }
    dt.Rows.Add( row );
}
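For context, here is a minimal sketch of how this approach feeds SqlBulkCopy end to end, which was the original goal. The file path, connection string, and table name are placeholders, the DataTable columns are assumed to already match the destination schema, and newer CsvHelper versions require a CultureInfo in the CsvReader constructor:

using System.Data;
using System.Data.SqlClient;
using System.Globalization;
using System.IO;
using System.Linq;
using CsvHelper;

using (var textReader = new StreamReader("data.csv"))
using (var csv = new CsvReader(textReader, CultureInfo.InvariantCulture))
{
    var records = csv.GetRecords<dynamic>().ToList();
    // ... populate dt from records as shown above ...

    // Bulk insert the populated DataTable into the destination table.
    using (var bulkCopy = new SqlBulkCopy("<connection string>"))
    {
        bulkCopy.DestinationTableName = "dbo.MyTable";
        bulkCopy.WriteToServer(dt);
    }
}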
The DynamicToDictionary method handles case sensitivity and header white space: it converts the dynamic object into a Dictionary that ignores case and strips white space from the header names. This step could be skipped, and the dynamic object passed directly to GetColumnValue, if neither is an issue for your data.
public Dictionary<string, object> DynamicToDictionary(dynamic dynObj)
{
    var dictionary = new Dictionary<string, object>(StringComparer.OrdinalIgnoreCase);
    foreach (var kvp in (IDictionary<string, object>)dynObj)
    {
        var obj = kvp.Value;
        // Remove white space from the header name.
        var name = new string(kvp.Key.ToCharArray().Where(c => !char.IsWhiteSpace(c)).ToArray());
        dictionary.Add(name, obj);
    }
    return dictionary;
}
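A quick sanity check of the helper (the names here are made up for illustration): a header with embedded white space can afterwards be looked up with any casing and no spaces.

dynamic record = new System.Dynamic.ExpandoObject();
((IDictionary<string, object>)record)["First Name"] = "Ada";

var dictionary = DynamicToDictionary(record);
Console.WriteLine(dictionary["firstname"]); // prints "Ada": space stripped, case ignored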
The GetColumnValue method finds the dynamic record value and converts it into the proper DataTable column value.
public object GetColumnValue(DataColumn column, IDictionary<string, object> dynamicDictionary)
{
    object value;
    // Return DBNull if the column name isn't found.
    if (!dynamicDictionary.TryGetValue(column.ColumnName, out value))
    {
        return DBNull.Value;
    }
    // Null values come in as empty strings.
    if (column.AllowDBNull && column.DataType != typeof(string) && (string)value == "")
    {
        return DBNull.Value;
    }
    // Treat "0" and "false" (any casing) as false; everything else as true.
    if (column.DataType == typeof(bool))
    {
        return (string)value != "0" && ((string)value).ToLower() != "false";
    }
    return value;
}
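To illustrate the nullable handling, an empty CSV field destined for a nullable non-string column comes back as DBNull.Value rather than a failed conversion (the column name here is just an example):

var column = new DataColumn("Volume", typeof(int)) { AllowDBNull = true };
var values = new Dictionary<string, object>(StringComparer.OrdinalIgnoreCase)
{
    { "Volume", "" } // empty CSV field
};
Console.WriteLine(GetColumnValue(column, values) == DBNull.Value); // True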
If I'm not mistaken, you should be able to do it with a lot less code. You don't have to put it into another class before going into the DataTable, either.
while( csv.Read() )
{
    var row = dt.NewRow();
    foreach( DataColumn column in dt.Columns )
    {
        row[column.ColumnName] = csv.GetField( column.DataType, column.ColumnName );
    }
    dt.Rows.Add( row );
}
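This assumes dt already has columns whose names and types match the CSV headers, so that GetField knows what type to convert each field to. A minimal setup sketch, with made-up column names and types:

var dt = new DataTable();
dt.Columns.Add("Id", typeof(int));
dt.Columns.Add("Name", typeof(string));
dt.Columns.Add("Price", typeof(decimal));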
I was able to get this to work by adding a DataTable row and filling it in explicitly, instead of trying to add a CsvHelper record as a row. I used the following loop in place of the similar one shown above:
foreach (var record in records)
{
    DataRow row = dt.NewRow();
    // Stamp the record with the owning company's Id before mapping it.
    record.CompanyId = company.Id;
    row["Date"] = record.Date;
    row["Close"] = record.Close;
    row["AdjClose"] = record.AdjClose;
    row["High"] = record.High;
    row["Low"] = record.Low;
    row["Open"] = record.Open;
    row["Volume"] = record.Volume;
    row["CompanyId"] = record.CompanyId;
    dt.Rows.Add(row);
}
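For completeness, the DataTable schema this mapping assumes would be defined along these lines; the types are my guesses for a price-history table, so adjust them to your actual destination table:

var dt = new DataTable();
dt.Columns.Add("Date", typeof(DateTime));
dt.Columns.Add("Close", typeof(decimal));
dt.Columns.Add("AdjClose", typeof(decimal));
dt.Columns.Add("High", typeof(decimal));
dt.Columns.Add("Low", typeof(decimal));
dt.Columns.Add("Open", typeof(decimal));
dt.Columns.Add("Volume", typeof(long));
dt.Columns.Add("CompanyId", typeof(int));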
If you can solve the issue without so much hard-coding, I will accept your answer.
Josh added support for reading headers last year, and the following block may be useful to those who just want to build a DataTable using the schema of the CSV document. I wanted to post this as a comment on Josh's answer, since it is only a small modification, but I'm posting it as an answer because I couldn't format a code block in comments.
private DataTable BuildDataTable()
{
    var dt = new DataTable();
    using (var textReader = new StreamReader(_path))
    {
        using (var csv = new CsvReader(textReader))
        {
            // Create one column per CSV header (string by default).
            csv.ReadHeader();
            foreach (var header in csv.FieldHeaders)
            {
                dt.Columns.Add(header);
            }
            // Fill the rows field-by-field using the column names.
            while (csv.Read())
            {
                var row = dt.NewRow();
                foreach (DataColumn column in dt.Columns)
                {
                    row[column.ColumnName] = csv.GetField(column.DataType, column.ColumnName);
                }
                dt.Rows.Add(row);
            }
        }
    }
    return dt;
}
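Note that this relies on an older CsvHelper API; FieldHeaders was removed in later releases, and recent versions require a CultureInfo in the CsvReader constructor. If you are on a current version, the library's CsvDataReader makes the whole method shorter, since DataTable.Load builds the columns and rows from the header row for you. A sketch, assuming a recent CsvHelper and the same _path field:

using System.Data;
using System.Globalization;
using System.IO;
using CsvHelper;

private DataTable BuildDataTable()
{
    var dt = new DataTable();
    using (var textReader = new StreamReader(_path))
    using (var csv = new CsvReader(textReader, CultureInfo.InvariantCulture))
    using (var dataReader = new CsvDataReader(csv))
    {
        // CsvDataReader exposes the CSV as an IDataReader,
        // so Load infers the columns from the header row.
        dt.Load(dataReader);
    }
    return dt;
}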