Question
I have the following code (using the PublicActivity gem and Squeel):
def index
@activities = Activity.limit(20).order { created_at.desc }
@one = @activities.where{trackable_type == 'Post'}.includes(trackable: [:author, :project])
@two = @activities.where{trackable_type == 'Project'}.includes trackable: [:owner]
@activities = @one + @two
end
But it creates 8 SQL requests:
SELECT "activities".* FROM "activities" WHERE "activities"."trackable_type" = 'Post' ORDER BY "activities"."created_at" DESC LIMIT 20
SELECT "posts".* FROM "posts" WHERE "posts"."id" IN (800, 799, 798, 797, 796, 795, 794, 793, 792, 791, 790, 789, 788, 787, 786, 785, 784, 783, 782, 781)
SELECT "users".* FROM "users" WHERE "users"."id" IN (880, 879, 878, 877, 876, 875, 874, 873, 872, 871, 869, 868, 867, 866, 865, 864, 863, 862, 861, 860)
SELECT "projects".* FROM "projects" WHERE "projects"."id" IN (80, 79)
SELECT "activities".* FROM "activities" WHERE "activities"."trackable_type" = 'Project' ORDER BY "activities"."created_at" DESC LIMIT 20
SELECT "projects".* FROM "projects" WHERE "projects"."id" IN (80, 79, 78, 77, 76, 75, 74, 73, 72, 71, 70, 69, 68, 67, 66, 65, 64, 63, 62, 61)
SELECT "users".* FROM "users" WHERE "users"."id" IN (870, 859, 848, 837, 826, 815, 804, 793, 782, 771, 760, 749, 738, 727, 716, 705, 694, 683, 672, 661)
- the activities requests are not joined
- some users (post owner and project owner) are loaded twice
- some projects are loaded twice
- @activities is an Array. Rails relation merge methods (except +) don't work with the code above.
Any ideas to optimize it?
Answer 1:
In a nutshell, you can't optimize any further without using SQL. This is the way Rails does business. It doesn't allow access to join fields outside the AR model where the query is posed. Therefore to get values in other tables, it does a query on each one.
It also doesn't allow UNION or fancy WHERE conditions that provide other ways of solving the problem.
The good news is that these queries are all efficient ones (given that trackable_type is indexed). If the size of the results is anything substantial (say a few dozen rows), the I/O time will dominate the slight additional overhead of 7 simple queries versus 1 complex one.
Even using SQL, it will be difficult to get all the join results you want in one query. (It can be done, but the result will be a hash rather than an AR instance. So dependent code will be ugly.) The one-query-per-table is wired pretty deeply into Active Record.
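To make the one-query-per-table idea concrete, here is a plain-Ruby simulation (no Active Record involved). It is only a sketch: the structs, the USERS hash standing in for the users table, and the fetch_users and users_for helpers are all hypothetical names invented for illustration. It shows how collecting the author and owner ids first and de-duplicating them lets the users table be hit once instead of once per trackable type.

```ruby
# Plain-Ruby simulation of "one query per table" loading.
# Every name below is hypothetical; nothing here is Active Record API.
User    = Struct.new(:id, :name)
Post    = Struct.new(:id, :author_id)
Project = Struct.new(:id, :owner_id)

# Stands in for the users table.
USERS = {
  1 => User.new(1, "ann"),
  2 => User.new(2, "bob"),
  3 => User.new(3, "cat")
}.freeze

# Records every simulated "SELECT ... WHERE id IN (...)" call.
QUERY_LOG = []

# Simulates a single IN query against the users table.
def fetch_users(ids)
  QUERY_LOG << ids
  USERS.values_at(*ids).compact
end

# Gathers the user ids needed by posts and projects, removes duplicates,
# and loads them with one simulated query. Returns a hash keyed by id.
def users_for(posts, projects)
  ids = (posts.map(&:author_id) + projects.map(&:owner_id)).uniq
  fetch_users(ids).each_with_object({}) { |user, by_id| by_id[user.id] = user }
end
```

Calling users_for with two posts and two projects that share a user issues one simulated query, and the shared user appears only once in the id list.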
@Mr.Yoshi's solution is a good compromise using minimal SQL, except it doesn't let you selectively load either author or project+owner based on the trackable_type field.
Edit
The above is all correct for Rails 3. For Rails 4, as @CMW says, the eager_load method will do the same as includes, using an outer join instead of separate queries. This is why I love SO! I always learn something.
Answer 2:
A non-rails-4, non-squeel solution is:
def index
@activities = Activity.limit(20).order("created_at desc")
@one = @activities.where(trackable_type: 'Post').joins(trackable: [:author, :project]).includes(trackable: [:author, :project])
@two = @activities.where(trackable_type: 'Project').joins(trackable: [:owner]).includes(trackable: [:owner])
@activities = @one + @two
end
The combination of joins and includes looks odd, but in my testing it works surprisingly well.
This'll reduce it down to two queries though, not to one. And @activities will still be an array. But maybe using this approach with Squeel will solve that, too. I don't use Squeel and can't test it, unfortunately.
EDIT: I totally missed the point of this being about polymorphic associations. The above works to force
If you want to use what AR offers, it's a bit hacky but you could define read-only associated projects and posts:
belongs_to :project, readonly: true, foreign_key: :trackable_id
belongs_to :post, readonly: true, foreign_key: :trackable_id
With those, the mentioned way of forcing eager loads should work. The where conditions are still needed, so those associations are only called on the right activities.
def index
@activities = Activity.limit(20).order("created_at desc")
@one = @activities.where(trackable_type: 'Post').joins(post: [:author, :project]).includes(post: [:author, :project])
@two = @activities.where(trackable_type: 'Project').joins(project: [:owner]).includes(project: [:owner])
@activities = @one + @two
end
It's not a clean solution, and the associations should be attr_protected to make sure they aren't set accidentally (that would break polymorphism, I expect), but from my testing it seems to work.
Answer 3:
Using a simple CASE expression in SQL:
def index
table_name = Activity.table_name
@activities = Activity.where(trackable_type: ['Post', 'Project'])
.order("CASE #{table_name}.trackable_type WHEN 'Post' THEN 'a' ELSE 'z' END, #{table_name}.created_at DESC")
end
Then you can easily add the includes you want ;)
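For readers unsure what that ORDER BY produces, the ordering can be mimicked in plain Ruby. This is a sketch under the assumption that the intent is "Posts first, everything else after, newest first within each group". FeedItem, its kind field, and order_like_case are hypothetical names, not part of PublicActivity or Active Record.

```ruby
# Hypothetical stand-in for an activity row: a type label plus a timestamp.
FeedItem = Struct.new(:kind, :created_at)

# Mirrors: ORDER BY CASE <type column> WHEN 'Post' THEN 'a' ELSE 'z' END,
#          created_at DESC
# The first sort key groups Posts ahead of everything else; the negated
# timestamp sorts newest-first inside each group.
def order_like_case(items)
  items.sort_by do |item|
    group_key = item.kind == "Post" ? "a" : "z"
    [group_key, -item.created_at.to_f]
  end
end
```

The result stays a single ordered list, which is the appeal of this approach over concatenating two separately loaded arrays.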
Answer 4:
I believe you will need at least two AR query invocations (as you currently have) because of the limit(20) clause. Your queries currently give you up to 20 Posts AND up to 20 Projects, so doing an aggregate limit on both activity types in a single query would not give the intended result.
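The difference between a per-type limit and an aggregate limit can be shown with a small plain-Ruby sketch. ActivityRow, newest_per_kind, and newest_overall are hypothetical names used only for illustration, and created_at is a plain integer here to keep the example short.

```ruby
# Hypothetical stand-in for an activity row.
ActivityRow = Struct.new(:kind, :created_at)

# Up to `limit` newest rows of each kind: what two separate
# limited queries (one per trackable type) give you.
def newest_per_kind(rows, limit)
  rows.group_by(&:kind).flat_map do |_kind, group|
    group.sort_by { |row| -row.created_at }.first(limit)
  end
end

# Up to `limit` newest rows overall: what a single query with
# one aggregate LIMIT gives you.
def newest_overall(rows, limit)
  rows.sort_by { |row| -row.created_at }.first(limit)
end
```

With three recent Posts, two older Projects, and a limit of 2, the per-kind version returns two Posts and two Projects, while the aggregate version returns only the two newest Posts and no Projects at all.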
I think all you need to do is use eager_load in the query instead of includes to force a single query. The differences between the joins, includes, preload, eager_load and references methods are nicely covered here
So, with AR and Squeel:
def index
@activities = Activity.limit(20).order { created_at.desc }
@one = @activities.where{trackable_type == 'Post'}.eager_load(trackable: [:author, :project])
@two = @activities.where{trackable_type == 'Project'}.eager_load trackable: [:owner]
@activities = @one + @two
end
And without Squeel, using just regular ActiveRecord 4:
def index
@activities = Activity.limit(20).order(created_at: :desc)
@one = @activities.where(trackable_type: 'Post').eager_load(trackable: [:author, :project])
@two = @activities.where(trackable_type: 'Project').eager_load(trackable: :owner)
@activities = @one + @two
end
You don't need Squeel. I recently ripped it out of my project because, in my experience, it doesn't work properly for a number of complex queries that AR 4 and Arel handled fine.
Answer 5:
That's a pretty big query there ... by the looks of it you could do it in one select, but for readability I'll use two, one for projects and one for posts.
This assumes a 1:1 relationship between activity and post/project. If this isn't correct, the problem can be solved with a subquery. For the 1:1 case:
select * from activities a
left join posts p
on p.id = a.trackable_id -- or whatever fields join these two tables
left join users u
on a.user_id = u.id -- this is joining to the main table, may want to join trackable, not sure
left join projects pr -- distinct alias; "p" is already used for posts
on a.project_id = pr.id
where a.trackable_type = 'Post' -- WHERE must come after the joins
order by a.created_at DESC LIMIT 20
Or, if there is a 1:many relationship, something like this:
select * from
( select * from activities a
where a.trackable_type = 'Post'
order by a.created_at DESC LIMIT 20 ) activities
left join posts p
...
Edit: As I read this, I realize that I'm a bit old-fashioned. I think that if you are going to use such large raw SQL queries, you should put them in a database function rather than coding them into your application.
Source: https://stackoverflow.com/questions/19322580/optimize-difficult-query-possibly-with-squeel