Reading Comma Delimited Text File to C# DataTable, columns get truncated to 255 characters

南方客 2021-02-10 21:11

We are importing from CSV to SQL. To do so, we read the CSV file and write it to a temporary .txt file using a schema.ini. (I'm not sure yet exactly why we are writing t

5 Answers
  • 2021-02-10 21:31

    I think the best way to do it is by using CSVReader in the following blog: http://ronaldlemmen.blogspot.com/2008/03/stopping-and-continuing-save-event.html

  • 2021-02-10 21:32

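    Here's a simple class that reads a delimited file and returns a DataTable (all columns as strings) without truncating any values. It has an overloaded method for specifying the column names when they're not in the file. Note that it splits each line on the delimiter only, so it does not handle quoted fields that contain the delimiter or a line break. Maybe you can use it?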

    Imported Namespaces

    using System;
    using System.Text;
    using System.Data;
    using System.IO;
    

    Code

    /// <summary>
    /// Simple class for reading delimited text files
    /// </summary>
    public class DelimitedTextReader
    {
        /// <summary>
        /// Read the file and return a DataTable
        /// </summary>
        /// <param name="filename">File to read</param>
        /// <param name="delimiter">Delimiting string</param>
        /// <returns>Populated DataTable</returns>
        public static DataTable ReadFile(string filename, string delimiter)
        {
            return ReadFile(filename, delimiter, null);
        }
        /// <summary>
        /// Read the file and return a DataTable
        /// </summary>
        /// <param name="filename">File to read</param>
        /// <param name="delimiter">Delimiting string</param>
        /// <param name="columnNames">Array of column names</param>
        /// <returns>Populated DataTable</returns>
        public static DataTable ReadFile(string filename, string delimiter, string[] columnNames)
        {
            //  Create the new table
            DataTable data = new DataTable();
            data.Locale = System.Globalization.CultureInfo.CurrentCulture;
    
            //  Check file
            if (!File.Exists(filename))
                throw new FileNotFoundException("File not found", filename);
    
            //  Process the file line by line
            string line;
            using (TextReader tr = new StreamReader(filename, Encoding.Default))
            {
                //  If column names were not passed, we'll read them from the file
                if (columnNames == null)
                {
                    //  Get the first line
                    line = tr.ReadLine();
                    if (string.IsNullOrEmpty(line))
                        throw new IOException("Could not read column names from file.");
                    //  Keep empty entries so the header is split the same way as the data rows
                    columnNames = line.Split(new string[] { delimiter }, StringSplitOptions.None);
                }
    
                //  Add the columns to the data table
                foreach (string colName in columnNames)
                    data.Columns.Add(colName);
    
                //  Read the file
                string[] columns;
                while ((line = tr.ReadLine()) != null)
                {
                    columns = line.Split(new string[] { delimiter }, StringSplitOptions.None);
                    //  Ensure we have the same number of columns
                    if (columns.Length != columnNames.Length)
                    {
                        string message = "Data row has {0} columns and {1} are defined by column names.";
                        throw new DataException(string.Format(message, columns.Length, columnNames.Length));
                    }
                    data.Rows.Add(columns);
                }
            }
            return data;
    
        }
    }
    

    Required Namespaces

    using System;
    using System.Data;
    using System.Windows.Forms;
    using System.Data.SqlClient;
    using System.Diagnostics;
    

    Here's an example of calling it and uploading to a SQL Database:

            Stopwatch sw = new Stopwatch();
            TimeSpan tsRead;
            TimeSpan tsTrunc;
            TimeSpan tsBcp;
            int rows;
            sw.Start();
            using (DataTable dt = DelimitedTextReader.ReadFile(textBox1.Text, "\t"))
            {
                tsRead = sw.Elapsed;
                sw.Reset();
                rows = dt.Rows.Count;
                string connect = @"Data Source=.;Initial Catalog=MyDB;Integrated Security=SSPI";
                using (SqlConnection cn = new SqlConnection(connect))
                using (SqlCommand cmd = new SqlCommand("TRUNCATE TABLE dbo.UploadTable", cn))
                using (SqlBulkCopy bcp = new SqlBulkCopy(cn))
                {
                    cn.Open();
                    sw.Start();
                    cmd.ExecuteNonQuery();
                    tsTrunc = sw.Elapsed;
                    sw.Reset();
    
                    sw.Start();
                    bcp.DestinationTableName = "dbo.UploadTable";
                    bcp.ColumnMappings.Add("Column A", "ColumnA");
                    bcp.ColumnMappings.Add("Column D", "ColumnD");
                    bcp.WriteToServer(dt);
                    tsBcp = sw.Elapsed;
                    sw.Reset();
                }
            }
    
            string message = "File read:\t{0}\r\nTruncate:\t{1}\r\nBcp:\t{2}\r\n\r\nTotal time:\t{3}\r\nTotal rows:\t{4}";
            MessageBox.Show(string.Format(message, tsRead, tsTrunc, tsBcp, tsRead + tsTrunc + tsBcp, rows));
    
  • 2021-02-10 21:40

    My inclination would be to create the DataTable directly when reading the CSV file, rather than going through the extra step of writing the data out to a different text file, only to read it back into memory a second time.

    For that matter, how are you ultimately getting the data from the DataTable into the SQL database? If you're just looping through the DataTable and doing a bunch of INSERT statements, why not skip two middlemen and call the same INSERT statements while you're initially reading the CSV file?

  • 2021-02-10 21:43

    The Jet database engine truncates Memo fields to 255 characters if you ask it to process the data based on the Memo: aggregating, de-duplicating, formatting, and so on.

    http://allenbrowne.com/ser-63.html

  • 2021-02-10 21:50

    You can fix this by declaring the affected column explicitly in your schema.ini file. I believe the two options are either to declare the column as a Memo type or to give it a Width greater than 255.
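
    A rough sketch of what such a schema.ini might look like is below. The file name import.txt and the column names Id and Notes are made up for illustration, and the `;` comment lines are there for the reader (INI files normally allow them, but drop them if the driver complains). Once any ColN entry is present, every column should be listed in file order.

    ```ini
    ; schema.ini lives in the same folder as the text file it describes.
    ; [import.txt], Id and Notes are hypothetical names for this example.
    [import.txt]
    Format=CSVDelimited
    ColNameHeader=True
    Col1=Id Long
    ; Option 1: declare the long column as Memo so it is not treated as 255-character Text
    Col2=Notes Memo
    ; Option 2 (untested assumption): keep it as Text and widen it instead, for example
    ; Col2=Notes Text Width 4000
    ```

    Use only one of the two Col2 lines. If the values still come back cut off with the Memo declaration, the processing-related truncation described in the previous answer may be the cause rather than the column type.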
