What is the memory overhead of storing data in a .NET DataTable?

后端 未结 3 1617
旧时难觅i
旧时难觅i 2020-12-05 05:21

I\'m trying to get a handle on the amount of memory overhead associated with a .NET DataTable, and with individual DataRows within a table.
In other words, how much more

相关标签:
3条回答
  • 2020-12-05 05:25

    Well, don't forget that a DataTable stores 2? 3? versions of the data - original and updated (possibly one other?). It also has a lot of references since it is cell-based, and boxing for any value-types. It would be hard to quantify the exact memory...

    Personally, I very rarely use DataTable - typed POCO classes are a much more sensible bet in my view. I wouldn't use an array (directly), though - List<T> or BindingList<T> or similar would be far more common.

    As a crude measure, you could create a lot of tables etc and look at the memory usage; for example, the following shows a ~4.3 factor - i.e. more than 4 times as expensive, but obviously that depends a lot on the number of columns vs rows vs tables etc:

        // takes **roughly** 112Mb  (taskman)
        List<DataTable> tables = new List<DataTable>();
        for (int j = 0; j < 5000; j++)
        {
            DataTable table = new DataTable("foo");
            for (int i = 0; i < 10; i++)
            {
                table.Columns.Add("Col " + i, i % 2 == 0 ? typeof(int)
                                    : typeof(string));
            }
            for (int i = 0; i < 100; i++)
            {
                table.Rows.Add(i, "a", i, "b", i, "c", i, "d", i, "e");
            }
            tables.Add(table);
        }
        Console.WriteLine("done");
        Console.ReadLine();
    

    vs

        // takes **roughly** 26Mb (taskman)
        List<List<Foo>> lists = new List<List<Foo>>(5000);
        for (int j = 0; j < 5000; j++)
        {
            List<Foo> list = new List<Foo>(100);
            for (int i = 0; i < 100; i++)
            {
                Foo foo = new Foo { Prop1 = "a", Prop3 = "b",
                     Prop5 = "c", Prop7 = "d", Prop9 = "e"};
                foo.Prop0 = foo.Prop2 = foo.Prop4 = foo.Prop6 = foo.Prop8 = i;
                list.Add(foo);
            }
            lists.Add(list);
        }
        Console.WriteLine("done");
        Console.ReadLine();
    

    (based on)

    class Foo
    {
        public int Prop0 { get; set; }
        public string Prop1 { get; set; }
        public int Prop2 { get; set; }
        public string Prop3 { get; set; }
        public int Prop4 { get; set; }
        public string Prop5 { get; set; }
        public int Prop6 { get; set; }
        public string Prop7 { get; set; }
        public int Prop8 { get; set; }
        public string Prop9 { get; set; }
    }
    
    0 讨论(0)
  • 2020-12-05 05:28

    Overhead is pretty low if you don't define indexes on columns. You can get a pretty low memory footprint if you use string caching: Use a HashSet or Dictionary to use just 1 string instance of every string value. This sounds weird, but if you fetch data from a database, and you have multiple rows with the same string value (e.g. "ALFKI"), the string values are equal, but the string instances are not: the string is stored multiple times in memory. If you first use a HashSet for filtering out duplicate instances, you effectively use the same string instance for 1 string value everywhere in your datatable. This can greatly reduce memory footprint. Of course, if the string values are already statically defined somewhere (so not read from an outside source), it's not worth the effort.

    0 讨论(0)
  • 2020-12-05 05:38

    It depends on how much data and what kind of data you are storing. Obviously the more data, the more memory. There is some overhead associated with the datatable which makes it a little more expensive. You also need to be aware of the Large Object Heap. If you store objects over 85 kb, the object will be stored in the LOH. This can wreak havoc on your garbage collection as it requires a full collection. If you are up for testing it, look into a memory profiler to watch the memory footprint of the datatable.

    0 讨论(0)
提交回复
热议问题