Bulk Insert records into Active Record table

后端 未结 5 1505
隐瞒了意图╮
隐瞒了意图╮ 2020-11-29 04:51

I found that my Model.create! statements were taking a very long time to run when I added a large number of records at once. Looked at ActiveRecord-Import but

相关标签:
5条回答
  • 2020-11-29 05:15

    I started running into problems with large numbers of records (> 10000), so I modified the code to work in groups of 1000 records at a time. Here is a link to the new code:

    https://gist.github.com/jackrg/76ade1724bd816292e4e

    0 讨论(0)
  • 2020-11-29 05:27

    Using a transaction speeds up bulk inserts a lot!

    Model.transaction do
        many.times{ Model.create! }
    end
    

    If multiple Models are involved, do a Model.transaction for each model, which is affected:

    Model1.transaction do
        Model2.transaction do
            many.times do
                m1 = Model1.create!
                m1.add_model2
            end
        end
    end
    
    0 讨论(0)
  • 2020-11-29 05:32

    Use the activerecord-import gem. Let us say you are reading a CSV file and generating a Product catalogue and you want to insert records in batches of 1000:

    batch,batch_size = [], 1_000 
    CSV.foreach("/data/new_products.csv", :headers => true) do |row|
      batch << Product.new(row)
    
      if batch.size >= batch_size
        Product.import batch
        batch = []
      end
    end
    Product.import batch
    
    0 讨论(0)
  • 2020-11-29 05:32

    Thanks to Chris Heald @cheald for his 2009 article, with showed me that the best way to go was the multi-row insert command.

    Added the following code to my initializers/active_record.rb file, changed my Model.create!(...) calls to Model.import!(...) and away it goes. A couple caveats:

    1) It does not validate the data.
    2) It uses the form of the SQL INSERT command that reads like ...

    INSERT INTO <table> (field-1, field-2, ...) 
           VALUES (value-1-1, value-1-2, ...), (value-2-1, value-2-2, ...), ...`
    

    ... which may not be the correct syntax for all databases, but it works with Postgres. It would not be difficult to alter the code for the appropriate syntax for your SQL version.

    In my particular case, inserting 19K+ records into a simple table on my development machine (MacBook Pro with 8GB RAM, 2.4GHz Intel Core i5 and and SSD) went from 223 seconds using 'model.create!' to 7.2 seconds using a 'model.import!'.

    class ActiveRecord::Base
    
      def self.import!(record_list)
        raise ArgumentError "record_list not an Array of Hashes" unless record_list.is_a?(Array) && record_list.all? {|rec| rec.is_a? Hash }
        key_list, value_list = convert_record_list(record_list)        
        sql = "INSERT INTO #{self.table_name} (#{key_list.join(", ")}) VALUES #{value_list.map {|rec| "(#{rec.join(", ")})" }.join(" ,")}"
        self.connection.insert_sql(sql)
      end
    
      def self.convert_record_list(record_list)
        key_list = record_list.map(&:keys).flatten.uniq.sort
    
        value_list = record_list.map do |rec|
          list = []
          key_list.each {|key| list <<  ActiveRecord::Base.connection.quote(rec[key]) }
          list
        end
    
        return [key_list, value_list]
      end
    end
    
    0 讨论(0)
  • 2020-11-29 05:33

    You can also use the activerecord-insert_many gem. Just make an array of objects!

    events = [{name: "Movie Night", time: "10:00"}, {name: "Tutoring", time: "7:00"}, ...]
    
    Event.insert_many(events)
    
    0 讨论(0)
提交回复
热议问题