Ruby on Rails memory leak when looping through large number of records; find_each doesn't help

asked 2020-12-28 14:08 by 醉酒成梦

I have a Rails app that processes a large number (millions) of records in a MySQL database. Once it starts working, its memory use quickly grows at a rate of about 50 MB per second.

3 Answers
  • 2020-12-28 14:34

    Under the hood, find_each calls find_in_batches with a default batch size of 1000.

    All the records in a batch are instantiated and retained in memory for as long as the batch is being processed.

    If your records are large, or if they consume a lot of memory via proxy collections (e.g. a has_many association caches all of its items any time you use it), you can also try a smaller batch size:

      Person.find_each(batch_size: 100) do |person|
        # whatever operation
      end
    

    You can also try manually calling GC.start periodically (e.g. every 300 records):
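
    A minimal sketch of that idea, assuming a Person model; the 300-record interval is arbitrary and worth tuning for your workload:

      processed = 0
      Person.find_each(batch_size: 100) do |person|
        # whatever operation
        processed += 1
        # Force a GC pass at a fixed interval to cap heap growth.
        GC.start if (processed % 300).zero?
      end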

  • 2020-12-28 14:53

    I was able to figure this out myself. There are two places to change.

    First, disable IdentityMap in config/application.rb (the identity map keeps a reference to every record loaded during a request, so none of them can be garbage collected):

    config.active_record.identity_map = false
    

    Second, wrap the loop in uncached, so the query cache does not retain a copy of every result set for the duration of the action:

    class MemoryTestController < ApplicationController
      def go
        ActiveRecord::Base.uncached do
          Person.find_each do |person|
            # whatever operation
          end
        end
      end
    end
    

    Now my memory use is under control. Hope this helps other people.

  • 2020-12-28 14:54

    As nice as ActiveRecord is, it is not the best tool for all problems. I recommend dropping down to your native database adapter and doing the work at that level.
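
    With the mysql2 adapter, for example, you can stream rows instead of buffering the whole result set; a minimal sketch (the connection parameters and the people table are placeholders):

      require "mysql2"

      client = Mysql2::Client.new(
        host: "localhost", username: "user", password: "pass", database: "app_db"
      )

      # stream: true fetches rows from the server one at a time instead of
      # buffering the entire result set; cache_rows: false lets each row be
      # garbage collected after it is yielded, so memory use stays flat.
      result = client.query(
        "SELECT id, name FROM people",
        stream: true, cache_rows: false
      )

      result.each do |row|
        # row is a plain Hash, e.g. { "id" => 1, "name" => "..." }
        # whatever operation
      end

    Skipping ActiveRecord entirely also avoids the per-object overhead of instantiating millions of model instances.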
