Invoking a large set of SQL from a Rails 4 application

Question

I have a Rails 4 application that I use in conjunction with sidekiq to run asynchronous jobs. One of the jobs I normally run outside of my Rails application is a large set of complex SQL queries that cannot really be modeled by ActiveRecord. The connection this set of SQL queries has with my Rails app is that it should be executed anytime one of my controller actions is invoked.

Ideally, I'd queue a job from my Rails application within the controller for Sidekiq to go ahead and run the queries. Right now they're stored in an external file, and I'm not entirely sure what the best way is to have Rails run that SQL.

Any solutions are appreciated.

Answer 1:

I agree with Sharagoz: if you just need to run a specific query, the best way is to send the query string directly to the connection, like this:

ActiveRecord::Base.connection.execute(File.read("myquery.sql"))
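A minimal sketch of the same idea wrapped in a helper, assuming the connection is passed in so the code can be exercised without a database. The name run_sql_file and the empty-file check are illustrative additions, not part of Rails:

```ruby
# Hypothetical helper: reads a SQL file and sends its contents to the
# given connection in a single execute call. Any object that responds
# to #execute works, e.g. ActiveRecord::Base.connection.
def run_sql_file(path, connection)
  sql = File.read(path)
  raise ArgumentError, "#{path} contains no SQL" if sql.strip.empty?

  connection.execute(sql)
end

# Inside a Rails app this might be called as (path is an assumption):
#   run_sql_file(Rails.root.join("db", "myquery.sql"), ActiveRecord::Base.connection)
```

Passing the connection in also makes it easy to call the helper from a background job later.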

If the query is not static and you have to compose it, I would use Arel, which is already present in Rails 4.x:

https://github.com/rails/arel

Answer 2:

You didn't say what database you are using, so I'm going to assume MySQL.

You could shell out to the mysql binary to do the work:

result = `mysql -u #{user} --password=#{password} #{database} < #{huge_sql_filename}`
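Backticks pass the whole string through a shell, so a password or filename containing spaces or shell metacharacters can break the command. A hedged alternative is to pass the arguments as an array and feed the file on standard input with Open3 from the standard library. The helper names mysql_argv and run_sql_file_with_mysql are hypothetical, and the sketch assumes the mysql binary is on the PATH:

```ruby
require "open3"

# Builds the argument list for the mysql client. Each element is passed
# to the process as-is, so no shell quoting is needed.
def mysql_argv(user:, password:, database:)
  ["mysql", "--user=#{user}", "--password=#{password}", database]
end

# Runs the SQL file through the mysql client and returns its stdout.
# Raises if the client exits with a non-zero status.
def run_sql_file_with_mysql(path, user:, password:, database:)
  argv = mysql_argv(user: user, password: password, database: database)
  stdout, stderr, status = Open3.capture3(*argv, stdin_data: File.read(path))
  raise "mysql failed: #{stderr}" unless status.success?

  stdout
end
```

Note that a password given on the command line can still be visible in the process list, so this only addresses the quoting problem.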

Or use ActiveRecord::Base.connection.execute(File.read("huge.sql")), but it won't work out of the box if you have multiple SQL statements in your SQL file.

In order to run multiple statements, you will need to create an initializer that monkey-patches ActiveRecord::Base.mysql2_connection to allow setting MySQL's CLIENT_MULTI_STATEMENTS and CLIENT_MULTI_RESULTS flags.

Create a new initializer, config/initializers/mysql2.rb:

module ActiveRecord
  class Base
    # Overriding ActiveRecord::Base.mysql2_connection
    # method to allow passing options from database.yml
    #
    # Example of database.yml
    #
    #   login: &login
    #     socket: /tmp/mysql.sock
    #     adapter: mysql2
    #     host: localhost
    #     encoding: utf8
    #     flags: 131072
    #
    # @param [Hash] config hash that you define in your
    #   database.yml
    # @return [Mysql2Adapter] new MySQL adapter object
    #
    def self.mysql2_connection(config)
      config[:username] = 'root' if config[:username].nil?

      if Mysql2::Client.const_defined? :FOUND_ROWS
        config[:flags] = config[:flags] ? config[:flags] | Mysql2::Client::FOUND_ROWS : Mysql2::Client::FOUND_ROWS
      end

      client = Mysql2::Client.new(config.symbolize_keys)
      options = [config[:host], config[:username], config[:password], config[:database], config[:port], config[:socket], 0]
      ConnectionAdapters::Mysql2Adapter.new(client, logger, options, config)
    end
  end
end

Then update config/database.yml to add flags:

development:
  adapter: mysql2
  database: app_development
  username: user
  password: password
  flags: <%= 65536 | 131072 %>
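The two numbers in the flags line are the values of CLIENT_MULTI_STATEMENTS and CLIENT_MULTI_RESULTS in MySQL's client flag bitmask. A small sketch of the arithmetic, with the values written out as literals so it runs without the mysql2 gem (the constant and helper names here are local to the example):

```ruby
# Values of MySQL's client capability flags, written out as literals.
CLIENT_MULTI_STATEMENTS = 1 << 16 # 65536
CLIENT_MULTI_RESULTS    = 1 << 17 # 131072

# The combined value that the ERB expression in database.yml produces.
MULTI_FLAGS = CLIENT_MULTI_STATEMENTS | CLIENT_MULTI_RESULTS

# True when every bit of +flag+ is set in +flags+.
def flag_set?(flags, flag)
  (flags & flag) == flag
end
```

Because the flags are independent bits, the initializer can OR FOUND_ROWS into the same value without disturbing the other two.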

I just tested this on Rails 4.1 and it works great.

Source: http://www.spectator.in/2011/03/12/rails2-mysql2-and-stored-procedures/

Answer 3:

Executing one query is - as outlined by other people - quite simply done through

ActiveRecord::Base.connection.execute("SELECT COUNT(*) FROM users")

You are talking about a 20,000-line SQL script containing multiple queries. Assuming you have the file somewhat under control, you can extract the individual queries from it.

script = Rails.root.join("lib").join("script.sql").read # ah, Pathnames

# this needs to match the delimiter of your queries
STATEMENT_SEPARATOR = ";\n\n"


ActiveRecord::Base.transaction do
  script.split(STATEMENT_SEPARATOR).each do |stmt|
    ActiveRecord::Base.connection.execute(stmt)
  end
end

If you're lucky, the query delimiter is ";\n\n", but this of course depends on your script. In another case we used "\x0" as the delimiter. The point is that you split the script into individual queries and send them to the database one at a time. I wrapped everything in a transaction so that the statements are applied as a unit: the block commits only if no exception is raised while the queries are being sent.

If you do not have the script-file under control, start talking to those who control it to get a reliable delimiter. If it's not under your control and you cannot talk to the one who controls it, you wouldn't execute it, I guess :-).
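When no reliable delimiter is available, one possible fallback is to split on semicolons that sit outside single-quoted string literals. The sketch below is a simplification under stated assumptions: it does not understand SQL comments, backslash-escaped quotes, PostgreSQL dollar-quoting, or MySQL DELIMITER directives, and the name split_sql_statements is hypothetical:

```ruby
# Splits a SQL script into statements on semicolons that are outside
# single-quoted strings. A doubled quote ('') inside a string toggles
# the state twice, so it is handled without special casing.
def split_sql_statements(script)
  statements = []
  current = +""
  in_string = false

  script.each_char do |char|
    in_string = !in_string if char == "'"

    if char == ";" && !in_string
      statements << current.strip
      current = +""
    else
      current << char
    end
  end

  # Keep a trailing statement that has no final semicolon.
  statements << current.strip
  statements.reject(&:empty?)
end
```

The result could replace script.split(STATEMENT_SEPARATOR) in the loop above; blank fragments are already filtered out.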

UPDATE

This is a generic way to solve this. For PostgreSQL, you don't need to split the statements manually. You can just send them all at once via execute. For MySQL, there seem to be solutions to get the adapter into a CLIENT_MULTI_STATEMENTS mode.

Answer 4:

If you want to execute raw SQL through active record you can use this API: ActiveRecord::Base.connection.execute("SELECT COUNT(*) FROM users")

Answer 5:

If you are running a big SQL query every time, I suggest you create a SQL view for it. It can improve the execution time. The other thing is, if possible, try to split those SQL queries in such a way that they can be executed in parallel instead of sequentially, and then push them to the Sidekiq queue.

You have to use ActiveRecord::Base.connection.execute or ModelClass.find_by_sql to run custom SQL.

Also, keep an eye on transactions and their ROLLBACK capability: you will find many places where you don't need it. If you avoid it, the queries will run faster, but it is dangerous.

That's all I can suggest.

Answer 6:

- Use the available database tools, such as views and stored procedures, to handle the complex queries, and call them as other people have already suggested (ActiveRecord::Base.connection.execute and ModelClass.find_by_sql, for example). This might very well cut down significantly on query preparation time in the DB and make your code easier to handle.
  - http://dev.mysql.com/doc/refman/5.0/en/create-view.html
  - http://dev.mysql.com/doc/connector-cpp/en/connector-cpp-tutorials-stored-routines-statements.html
- Abstract your query input parameters into a hash so you can pass it on to Sidekiq. Don't send SQL strings, as this will probably degrade performance (due to query preparation time) and make your life more complicated due to funny SQL driver parsing bugs.
- Run your complex queries in a dedicated named queue and set its concurrency to a value that prevents your database from getting overwhelmed, as these queries smell like they could be pretty DB-heavy.
  - https://github.com/mperham/sidekiq/wiki/API
  - https://github.com/mperham/sidekiq/wiki/Advanced-Options
- Have a look at Squeel. It's a great addition to AR, and it might be able to handle some of the things you are doing.
  - https://github.com/activerecord-hackery/squeel
  - http://railscasts.com/episodes/354-squeel
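The parameter-hash and dedicated-queue suggestions can be sketched roughly as follows. Everything named here is an assumption for illustration: the RefreshReportJob class, the heavy_sql queue, the refresh_report stored procedure, and the account_id and since keys. The Sidekiq include is guarded so the sketch loads without the gem; in a real app it would be unconditional.

```ruby
# Hypothetical worker that receives plain parameters (not SQL strings)
# and calls a stored procedure with them.
class RefreshReportJob
  if defined?(Sidekiq::Worker)
    include Sidekiq::Worker
    sidekiq_options queue: "heavy_sql" # dedicated queue for DB-heavy work
  end

  # Lets a test (or another caller) supply its own connection.
  attr_writer :connection

  # Sidekiq serializes arguments as JSON, so hash keys arrive as strings.
  def perform(params)
    account_id = Integer(params.fetch("account_id"))
    since = connection.quote(params.fetch("since"))
    connection.execute("CALL refresh_report(#{account_id}, #{since})")
  end

  private

  # Falls back to the ActiveRecord connection when none was injected.
  def connection
    @connection ||= ActiveRecord::Base.connection
  end
end
```

From the controller action this would be enqueued with something like RefreshReportJob.perform_async("account_id" => 42, "since" => "2014-01-01"). Running a separate Sidekiq process for that queue with a low concurrency, for example sidekiq -q heavy_sql -c 2, is one way to keep the database from being flooded.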

I'll assume you use MySQL for now, but your mileage will vary depending on the DB type that you use. Oracle, for example, has some good gems for handling stored procedures, views, etc., such as https://github.com/rsim/ruby-plsql

Let me know if some of this stuff doesn't fit your use case and I'll expand.

Answer 7:

I see this post is kind of old, but I would like to add my solution to it. I was in a similar situation; I also needed a way to force-feed "PRAGMA foreign_keys = on;" into my sqlite connection (I could not find a previous post that spelled out how to do it). Anywho, this worked like a charm for me. It allowed me to write "pretty" SQL and still get it executed. Blank statements are skipped by the unless condition.

conn = ActiveRecord::Base.establish_connection(adapter:'sqlite3',database:DB_NAME)
sqls = File.read(DDL_NAME).split(';')
sqls.each {|sql| conn.connection.execute(sql<<';') unless sql.strip.size == 0 }
conn.connection.execute('PRAGMA foreign_keys = on;')

Answer 8:

I had the same problem with a set of SQL statements that I needed to execute all in one call to the server. What worked for me was to set up an initializer for the Mysql2 adapter (as explained in infused's answer) and also do some extra work to process multiple results. A direct call to ActiveRecord::Base.connection.execute would only retrieve the first result and raise an Internal Error.

My solution was to get the underlying Mysql2 client and work directly with it:

client = ActiveRecord::Base.connection.raw_connection

Then, as explained here, execute the query and loop through the results:

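result = client.query(multiple_stms_query)
# result holds the first result set; do something with it ...
while client.next_result
  result = client.store_result
  # result now holds the next result set; do something with it ...
end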
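The loop above can be wrapped in a small helper that gathers every result set into an array. This is a sketch that assumes the client behaves like a Mysql2::Client with multi-statement flags enabled (query, next_result and store_result), and that statements without a result set yield nil; the helper name collect_result_sets is hypothetical:

```ruby
# Runs a multi-statement query and returns an array with one entry per
# result set, each converted to an array of rows. Statements that
# produce no result set are skipped.
def collect_result_sets(client, sql)
  result_sets = []

  first = client.query(sql)
  result_sets << first.to_a unless first.nil?

  # next_result advances to the following statement's result, if any.
  while client.next_result
    result = client.store_result
    result_sets << result.to_a unless result.nil?
  end

  result_sets
end
```

Draining every result this way also leaves the connection in a clean state for the next query.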

Source: https://stackoverflow.com/questions/24395618/invoking-a-large-set-of-sql-from-a-rails-4-application

Tags

sql

ruby-on-rails

ruby

ruby-on-rails-4

sidekiq