Why is DataTable faster than DataReader?

轻奢々 · 2020-11-30 06:12

So we have had a heated debate at work as to which DataAccess route to take: DataTable or DataReader.

DISCLAIMER: I am on the DataReader side and the…

4 Answers
  • 2020-11-30 06:28

    Two things could be slowing you down.

    First, if you're interested in performance, I wouldn't do a "find ordinal by name" lookup for each column on every row. Note the "layout" class below, which takes care of this lookup once. The layout also provides readability later on, instead of littering the code with "0", "1", "2", etc., and it allows me to code to an interface (IDataReader) instead of the concrete class.

    Second, you're using the ".Value" property, and I would think this does make a difference. You'll get better results (IMHO) if you use the concrete datatype "getters":

    GetString, GetDateTime, GetInt32, etc.
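
    For contrast, here is a minimal sketch of the two patterns (the "FormNumber" column name is borrowed from the other snippets in this thread, and it's assumed to hold a string here):

    // Slower: name-to-ordinal lookup on every row, and the indexer
    // returns a boxed object that still has to be converted:
    string formNumber = reader["FormNumber"].ToString();

    // Faster: resolve the ordinal once, outside the loop...
    int formNumberOrdinal = reader.GetOrdinal("FormNumber");
    // ...then call the typed getter on each row:
    string formNumberFast = reader.GetString(formNumberOrdinal);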

    Here is my typical IDataReader to DTO/POCO code.

    [Serializable]
    public partial class Employee
    {
        public int EmployeeKey { get; set; }                   
        public string LastName { get; set; }                   
        public string FirstName { get; set; }   
        public DateTime HireDate  { get; set; }  
    }
    
    [Serializable]
    public class EmployeeCollection : List<Employee>
    {
    }   
    
    internal static class EmployeeSearchResultsLayouts
    {
        public static readonly int EMPLOYEE_KEY = 0;
        public static readonly int LAST_NAME = 1;
        public static readonly int FIRST_NAME = 2;
        public static readonly int HIRE_DATE = 3;
    }
    
    
    public EmployeeCollection SerializeEmployeeSearchForCollection(IDataReader dataReader)
    {
        EmployeeCollection returnCollection = new EmployeeCollection();
        try
        {
            int fc = dataReader.FieldCount; // just an FYI value
            int counter = 0;                // just an FYI count of the number of rows

            while (dataReader.Read())
            {
                // EmployeeKey is the "identity" column; skip rows where it is NULL
                if (!dataReader.IsDBNull(EmployeeSearchResultsLayouts.EMPLOYEE_KEY))
                {
                    Employee item = new Employee() { EmployeeKey = dataReader.GetInt32(EmployeeSearchResultsLayouts.EMPLOYEE_KEY) };

                    if (!dataReader.IsDBNull(EmployeeSearchResultsLayouts.LAST_NAME))
                    {
                        item.LastName = dataReader.GetString(EmployeeSearchResultsLayouts.LAST_NAME);
                    }

                    if (!dataReader.IsDBNull(EmployeeSearchResultsLayouts.FIRST_NAME))
                    {
                        item.FirstName = dataReader.GetString(EmployeeSearchResultsLayouts.FIRST_NAME);
                    }

                    if (!dataReader.IsDBNull(EmployeeSearchResultsLayouts.HIRE_DATE))
                    {
                        item.HireDate = dataReader.GetDateTime(EmployeeSearchResultsLayouts.HIRE_DATE);
                    }

                    returnCollection.Add(item);
                }

                counter++;
            }

            return returnCollection;
        }
        // no catch here... see http://blogs.msdn.com/brada/archive/2004/12/03/274718.aspx
        finally
        {
            if (dataReader != null)
            {
                try
                {
                    dataReader.Close();
                }
                catch
                {
                }
            }
        }
    }
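
    A typical call site for this mapper might look like the following sketch (the connection string, SQL text, and containing class are assumptions; the reader is closed inside the method's finally block):

    using (SqlConnection connection = new SqlConnection(connectionString))
    using (SqlCommand command = new SqlCommand(
        "SELECT EmployeeKey, LastName, FirstName, HireDate FROM dbo.Employee", connection))
    {
        connection.Open();
        // the mapper takes ownership of the reader and closes it when done
        EmployeeCollection employees = SerializeEmployeeSearchForCollection(command.ExecuteReader());
    }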
    
  • 2020-11-30 06:28

    I don't think it will account for all the difference, but try something like this to eliminate some of the extra variables and function calls:

    using (SqlDataReader reader = command.ExecuteReader())
    {
        while (reader.Read())
        {
            artifactList.Add(new ArtifactString
            {
                FormNumber = reader["FormNumber"].ToString(),
                //etc
            });
        }
    }
    
  • 2020-11-30 06:44

    SqlDataAdapter.Fill calls SqlCommand.ExecuteReader with CommandBehavior.SequentialAccess set. Maybe that's enough to make the difference.
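
    If you want to test whether that's the cause, you can opt in to the same behavior on the raw reader (a minimal sketch; command is assumed to be the existing SqlCommand from your benchmark):

    // Request the same CommandBehavior that SqlDataAdapter.Fill reportedly uses.
    // Note: with SequentialAccess, columns must be read in ordinal order.
    using (SqlDataReader reader = command.ExecuteReader(CommandBehavior.SequentialAccess))
    {
        while (reader.Read())
        {
            // read each row's columns strictly left to right here
        }
    }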

    As an aside, I see your IDataReader code caches the ordinal of each field for performance reasons. An alternative to this approach is to use the DbEnumerator class.

    DbEnumerator caches a field name -> ordinal dictionary internally, so it gives you much of the performance benefit of using ordinals with the simplicity of using field names:

    // DbEnumerator implements IEnumerator (not IEnumerable), so it can't be
    // handed straight to foreach; iterate it explicitly instead:
    IEnumerator enumerator = new DbEnumerator(reader);
    while (enumerator.MoveNext())
    {
        IDataRecord record = (IDataRecord)enumerator.Current;
        artifactList.Add(new ArtifactString() {
            FormNumber = (int) record["FormNumber"],
            FormOwner = (int) record["FormOwner"],
            ...
        });
    }
    

    or even:

    // LINQ needs IEnumerable; SqlDataReader's own GetEnumerator() returns a
    // DbEnumerator, so Cast<IDataRecord>() goes through the same cached-ordinal
    // path (requires using System.Linq):
    return reader.Cast<IDataRecord>()
        .Select(record => new ArtifactString() {
            FormNumber = (int) record["FormNumber"],
            FormOwner = (int) record["FormOwner"],
            ...
        })
        .ToList();
    
  • 2020-11-30 06:46

    I see three issues:

    1. the way you use the DataReader negates its big single-item-in-memory advantage by converting everything to a list,
    2. you're running the benchmark in an environment that differs significantly from production in a way that favors the DataTable, and
    3. you're spending time converting DataReader records to Artifact objects, work that is not duplicated in the DataTable code.

    The main advantage of a DataReader is that you don't have to load everything into memory at once. This should be a huge advantage for DataReader in web apps, where memory, rather than CPU, is often the bottleneck, but by adding each row to a generic list you've negated it. That also means that even after you change your code to only use one record at a time, the difference might not show up in your benchmarks, because you're running them on a system with a lot of free memory, which favors the DataTable. Also, the DataReader version is spending time parsing the results into Artifact objects, work the DataTable version has not done yet.

    To fix the DataReader usage issue, change List<ArtifactString> to IEnumerable<ArtifactString> everywhere, and in your DataReader DAL change this line:

    artifactList.Add(artifact);
    

    to this:

    yield return artifact;
    

    This means you also need to add code that iterates over the results to your DataReader test harness to keep things fair.
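
    Put together, the streaming version of the DAL method might look something like this sketch (the method name, the ArtifactString shape, and the column name are assumptions based on the snippets above):

    public IEnumerable<ArtifactString> GetArtifacts(SqlCommand command)
    {
        // assumes command.Connection is already open
        using (SqlDataReader reader = command.ExecuteReader())
        {
            while (reader.Read())
            {
                // only the current row is materialized; nothing is buffered in a list
                yield return new ArtifactString
                {
                    FormNumber = reader["FormNumber"].ToString(),
                    //etc
                };
            }
        }
    }

    Because iterators run lazily, the reader stays open until the caller finishes enumerating, so the consuming foreach loop is exactly what the test harness needs to time.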

    I'm not sure how to adjust the benchmark to create a more typical scenario that is fair to both DataTable and DataReader, except to build two versions of your page and serve each version for an hour under similar production-level load, so that there is real memory pressure... do some real A/B testing. Also, make sure the benchmark covers converting the DataTable rows to Artifacts... and if the argument is that you need to do this for a DataReader but not for a DataTable, that is just plain wrong.
