Entity Framework large data set, out of memory exception

梦毁少年i 2020-12-02 18:44

I am working with a very large data set, roughly 2 million records. I have the code below but get an out of memory exception after it has processed around three batches, about

2 Answers
  • 2020-12-02 19:19

    The issue is that when you get data from EF, two copies of the data are actually created: one that is returned to the user and a second that EF holds onto and uses for change detection (so that it can persist changes to the database). EF holds this second set for the lifetime of the context, and it's this set that's running you out of memory.

    You have two options to deal with this:

    1. Renew your context for each batch.
    2. Use .AsNoTracking() in your query, e.g.:

      IEnumerable<IEnumerable<Town>> towns = dbContext.Towns.AsNoTracking().OrderBy(t => t.TownID).Batch(200000);
      

    This tells EF not to keep a copy for change detection. You can read a little more about what AsNoTracking does and its performance impact on my blog: http://blog.staticvoid.co.nz/2012/4/2/entity_framework_and_asnotracking
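
The first option (a fresh context per batch) has no code in the answer, so here is a minimal sketch of one way it could look. It is not the answerer's code: it swaps the `Batch(200000)` call for keyset paging on `TownID`, because a single streamed query is tied to one context. It assumes EF6 (`System.Data.Entity`; in EF Core the namespace would be `Microsoft.EntityFrameworkCore`), an integer `TownID` key, and the hypothetical names `TownContext`, `TownBatchReader`, `ReadPage` and `ProcessAll`.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;
using System.Data.Entity; // EF6 (assumption)
using System.Linq;

public class Town
{
    [Key]
    public int TownID { get; set; } // assumed to be an int key
    public string Name { get; set; }
}

// Hypothetical context; the question's real context is not shown.
public class TownContext : DbContext
{
    public DbSet<Town> Towns { get; set; }
}

public static class TownBatchReader
{
    // Reads one page of towns with TownID greater than afterId.
    // The context lives only for this call, so nothing it holds outlives the page.
    public static List<Town> ReadPage(Func<TownContext> createContext, int afterId, int batchSize)
    {
        using (var db = createContext())
        {
            return db.Towns
                .AsNoTracking()                  // no change-tracking copy
                .Where(t => t.TownID > afterId)  // keyset paging: resume after the last key seen
                .OrderBy(t => t.TownID)
                .Take(batchSize)
                .ToList();
        }
    }

    // Walks the whole table page by page until an empty page comes back.
    public static void ProcessAll(Func<int, List<Town>> readPageAfter, Action<List<Town>> processBatch)
    {
        int lastId = int.MinValue;
        while (true)
        {
            List<Town> batch = readPageAfter(lastId);
            if (batch.Count == 0) break;
            processBatch(batch);
            lastId = batch[batch.Count - 1].TownID;
        }
    }
}
```

A call might look like `TownBatchReader.ProcessAll(after => TownBatchReader.ReadPage(() => new TownContext(), after, 10000), batch => { /* work */ });`. The batch size of 10000 is arbitrary; smaller pages lower peak memory at the cost of more round trips.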

  • 2020-12-02 19:19

    I wrote a migration routine that reads from one DB and writes (with minor changes in layout) into another DB (of a different type) and in this case, renewing the connection for each batch and using AsNoTracking() did not cut it for me.

    Note that this problem occurs using a '97 version of JET. It may work flawlessly with other DBs.

    However, the following algorithm did solve the Out-of-memory issue:

    • Use one connection for reading and one for writing/updating.
    • Read with AsNoTracking().
    • Every 50 rows or so written/updated, check the memory usage, then recover memory and reset the output DB context (and connected tables) as needed:

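      // Current virtual memory size of this process, in bytes.
      var virtualBytes = System.Diagnostics.Process.GetCurrentProcess().VirtualMemorySize64;
      if (virtualBytes > 800000000) // roughly 800 MB
      {
          // Flush pending writes, then drop the output context and everything it tracks.
          dbcontextOut.SaveChanges();
          dbcontextOut.Dispose();
          GC.Collect();
          GC.WaitForPendingFinalizers();
          // Recreate the output context and look up the output table on it again.
          dbcontextOut = dbcontextOutFunc();
          tableOut = Dynamic.InvokeGet(dbcontextOut, outputTableName);
      }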
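
The "every 50 rows or so" part of the steps above is not shown in the snippet, so here is a rough sketch of how the counting and the threshold check could be wrapped. `MemoryGuard`, `RowWritten` and the injectable `readMemory` delegate are hypothetical names introduced for illustration; only `Process.VirtualMemorySize64` and the 50-row / 800000000-byte figures come from the answer.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper: counts written rows and, every checkEvery rows,
// runs a reset action if process memory exceeds a threshold.
public sealed class MemoryGuard
{
    private readonly long _thresholdBytes;
    private readonly int _checkEvery;
    private readonly Func<long> _readMemory;
    private int _sinceLastCheck;

    public MemoryGuard(long thresholdBytes, int checkEvery, Func<long> readMemory = null)
    {
        _thresholdBytes = thresholdBytes;
        _checkEvery = checkEvery;
        // Default to the same measurement the answer uses; injectable for testing.
        _readMemory = readMemory ?? ReadVirtualMemory;
    }

    private static long ReadVirtualMemory()
    {
        using (var process = Process.GetCurrentProcess())
        {
            return process.VirtualMemorySize64;
        }
    }

    // Call once per row written or updated. Returns true if reset was invoked.
    public bool RowWritten(Action reset)
    {
        _sinceLastCheck++;
        if (_sinceLastCheck < _checkEvery) return false;

        _sinceLastCheck = 0;
        if (_readMemory() <= _thresholdBytes) return false;

        reset();
        return true;
    }
}
```

With `var guard = new MemoryGuard(800000000, 50);`, the write loop would call `guard.RowWritten(...)` after each row and pass a lambda containing the body of the `if` block above (save, dispose, collect, recreate the context and table). Keeping the measurement behind a delegate is a design choice made here so the counting logic can be exercised without actually exhausting memory.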
      