When reading a CSV file using a DataReader and the OLEDB Jet data provider, how can I control column data types?

前端未结

关注

 4  1258

In my C# application I am using the Microsoft Jet OLEDB data provider to read a CSV file. The connection string looks like this:

Provider=Microsoft.Jet.OLEDB


                      
              相关标签:


      
      
        
          4条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  自闭症患者        
                
              
                            
                2020-11-30 07:19
              
            
            
                                                                       
You need to tell the driver to scan all rows to determine the schema. Otherwise if the first few rows are numeric and the rest are alphanumeric, the alphanumeric cells will be blank.

Like Rory, I found that I needed to create a schema.ini file dynamically because there is no way to programatically tell the driver to scan all rows. (this is not the case for excel files)

You must have MaxScanRows=0 in your schema.ini

Here's a code example:

    public static DataTable GetDataFromCsvFile(string filePath, bool isFirstRowHeader = true)
    {
        if (!File.Exists(filePath))
        {
            throw new FileNotFoundException("The path: " + filePath + " doesn't exist!");
        }

        if (!(Path.GetExtension(filePath) ?? string.Empty).ToUpper().Equals(".CSV"))
        {
            throw new ArgumentException("Only CSV files are supported");
        }
        var pathOnly = Path.GetDirectoryName(filePath);
        var filename = Path.GetFileName(filePath);
        var schemaIni =
            $"[{filename}]{Environment.NewLine}" +
            $"Format=CSVDelimited{Environment.NewLine}" +
            $"ColNameHeader={(isFirstRowHeader ? "True" : "False")}{Environment.NewLine}" +
            $"MaxScanRows=0{Environment.NewLine}" +
            $" ; scan all rows for data type{Environment.NewLine}" +
            $" ; This file was automatically generated";
        var schemaFile = pathOnly != null ? Path.Combine(pathOnly, "schema.ini") : "schema.ini";
        File.WriteAllText(schemaFile, schemaIni);

        try
        {
            var sqlCommand = $@"SELECT * FROM [{filename}]";

            var oleDbConnString =
                $"Provider=Microsoft.Jet.OLEDB.4.0;Data Source={pathOnly};Extended Properties=\"Text;HDR={(isFirstRowHeader ? "Yes" : "No")}\"";

            using (var oleDbConnection = new OleDbConnection(oleDbConnString))
            using (var adapter = new OleDbDataAdapter(sqlCommand, oleDbConnection))
            using (var dataTable = new DataTable())
            {
                adapter.FillSchema(dataTable, SchemaType.Source);
                adapter.Fill(dataTable);
                return dataTable;
            }
        }
        finally
        {
            if (File.Exists(schemaFile))
            {
                File.Delete(schemaFile);
            }
        }
    }


You'll need to do some modification if you are running this on the same directory in multiple threads at the same time.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  失恋的感觉        
                
              
                            
                2020-11-30 07:21
              
            
            
                                                                       
Please check

http://kbcsv.codeplex.com/

using (var reader = new CsvReader("data.csv"))
{
    reader.ReadHeaderRecord();
    foreach (var record in reader.DataRecords)
    {
        var name = record["Name"];
        var age = record["Age"];
    }
}

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  孤城傲影        
                
              
                            
                2020-11-30 07:22
              
            
            
                                                                       
To expand on Marc's answer, I need to create a text file called Schema.ini and put it in the same directory as the CSV file. As well as column types, this file can specify the file format, date time format, regional settings, and the column names if they're not included in the file. 

To make the example I gave in the question work, the Schema file should look like this:

[Data.csv]
ColNameHeader=True
Col1=House Text
Col2=Street Text
Col3=Town Text


I could also try this to make the data provider examine all the rows in the file before it tries to guess the data types:

[Data.csv]
ColNameHeader=true
MaxScanRows=0


In real life, my application imports data from files with dynamic names, so I have to create a Schema.ini file on the fly and write it to the same directory as the CSV file before I open my connection.

Further details can be found here - http://msdn.microsoft.com/en-us/library/ms709353(VS.85).aspx - or by searching the MSDN Library for "Schema.ini file".
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  被撕碎了的回忆        
                
              
                            
                2020-11-30 07:30
              
            
            
                                                                       
There's a schema file you can create that would tell ADO.NET how to interpret the CSV - in effect giving it a structure.

Try this: http://www.aspdotnetcodes.com/Importing_CSV_Database_Schema.ini.aspx

Or the most recent MS Documentation
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复