I am not so into SQL and I have the following doubt about how to optimize a query. I am using MySQL.
I have this DB schema:
And this is t
You've done a reasonably good job of writing an efficient query.
You didn't use SELECT *, which can mess up performance in a query with lots of joins, because it generates bloated and redundant intermediate result sets. But your intermediate result set -- the one you apply ORDER BY to -- is not bloated.
Your WHERE col = val clauses mostly mention primary keys of tables (I guess). That's good.
Your big table Market_Commodity_Price_Series could maybe use a compound covering index. Similarly, some other tables may need that kind of index. But that should be the topic of another question.
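As a rough sketch of what such an index might look like: the column names below (market_id, commodity_id, price_date, price) are placeholders, since the real columns of Market_Commodity_Price_Series aren't shown here. The usual idea is to put the columns used in equality filters first, then the sort column, then any remaining selected column, so the query can be answered from the index alone.

```sql
-- Hypothetical column names; substitute the real ones from your schema.
-- Equality-filter columns come first, then the ORDER BY column,
-- then the remaining selected column so the index "covers" the query.
CREATE INDEX idx_mcps_covering
    ON Market_Commodity_Price_Series
       (market_id, commodity_id, price_date, price);
```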
Your proposed optimization -- ordering an intermediate result set consisting mostly of id values -- would help a lot if you were doing ORDER BY ... LIMIT and using the LIMIT clause to discard most of your results. But you are not doing that.
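For reference, here is roughly what that pattern looks like when it does apply. This is only a sketch: price_series and its columns are made-up names, not taken from your schema.

```sql
-- Hypothetical table and columns, for illustration only.
-- Inner query: sort and trim a narrow set of ids.
-- Outer query: join back to fetch the wider columns
-- for just the rows that survived the LIMIT.
SELECT p.id, p.price_date, p.price, p.notes
FROM (
    SELECT id
    FROM price_series
    WHERE market_id = 4
    ORDER BY price_date DESC
    LIMIT 10
) AS latest
JOIN price_series AS p ON p.id = latest.id
ORDER BY p.price_date DESC;
```

The sort then handles only id and price_date values, and the wide rows are read for ten ids instead of for every matching row.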
Without knowing more about your data, it's hard to offer a crisp opinion. But, if it were me I'd use your first query. I'd keep an eye on it as you go into production (and on other complex queries). When (not if) performance starts to deteriorate, then you can do EXPLAIN and figure out the best way to index your tables. You've done a good job of writing a query that will get your application up and running. Go with it!
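When that day comes, using EXPLAIN is just a matter of prefixing the query. The query below is a stand-in with hypothetical table and column names; put your own SELECT after the keyword.

```sql
-- Hypothetical query; replace it with the real SELECT you want to inspect.
EXPLAIN
SELECT p.id, p.price_date, p.price
FROM price_series AS p
WHERE p.market_id = 4
ORDER BY p.price_date DESC;
```

In the output, the key column shows which index (if any) MySQL chose for each table, type = ALL means a full table scan, and "Using filesort" in the Extra column means the ORDER BY was not satisfied by an index.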
One approach is to create a separate read model table (an idea that comes from CQRS) containing all the attributes needed for the SELECT, so that no joins are required. The cost is that you will need to update the read model table each time one of the other tables changes. Another option is to create a view.
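A minimal sketch of the view option, assuming hypothetical tables market, commodity and price_series (none of these names come from the question's schema):

```sql
-- Hypothetical tables and columns, for illustration only.
CREATE VIEW market_commodity_prices AS
SELECT m.id   AS market_id,
       m.name AS market_name,
       c.id   AS commodity_id,
       c.name AS commodity_name,
       p.price_date,
       p.price
FROM price_series AS p
JOIN market    AS m ON m.id = p.market_id
JOIN commodity AS c ON c.id = p.commodity_id;

-- Callers then query the view without writing the joins themselves.
SELECT market_name, commodity_name, price_date, price
FROM market_commodity_prices
WHERE market_id = 4;
```

Note that a MySQL view is not materialized: it shortens the query text, but the joins still run on every read. Only a read model table that you maintain yourself removes the joins at read time.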
JOINs -- particularly on primary keys -- are not necessarily expensive. It looks like your joins are following the data model.
I wouldn't start optimizing the query without understanding its performance characteristics. How long does it take to run? How many records are being sorted to get the most recent?
Your WHERE clause appears to be limiting the data considerably. You can also set up an index to help with the WHERE clause -- however, because the fields come from different tables, it can be tricky to use indexes for all of them.
You have a complicated data model that is a bit difficult to follow. It seems possible that you are getting a Cartesian product due to multiple n-m relationships. If so, that can have a big impact on performance, and pre-aggregating the data along each dimension is the way to go.
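To illustrate what pre-aggregating along each dimension means, here is a sketch with an invented schema in which one market has many price rows and many currency rows. The table and column names are assumptions, not taken from the question.

```sql
-- Hypothetical schema, for illustration only.
-- Joining price_series and market_currency directly to market would
-- produce (prices x currencies) rows per market. Aggregating each
-- dimension in its own derived table keeps it to one row per market.
SELECT m.id,
       m.name,
       p.latest_price_date,
       c.currency_count
FROM market AS m
LEFT JOIN (
    SELECT market_id, MAX(price_date) AS latest_price_date
    FROM price_series
    GROUP BY market_id
) AS p ON p.market_id = m.id
LEFT JOIN (
    SELECT market_id, COUNT(*) AS currency_count
    FROM market_currency
    GROUP BY market_id
) AS c ON c.market_id = m.id;
```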
However, I wouldn't start optimizing the query without understanding how the current one behaves.