问题
Howdie do,
I have the following SQL query which takes about 4 seconds to run:
select
o.id, tu.status_type, m.upload_date
from
(select order_id, max(last_updated) as maxudate from tracking_update group by order_id) t
inner join
tracking_update tu on t.order_id=tu.order_id and t.maxudate=tu.last_updated
right join
fgw247.order o on t.order_id=o.id
left join
manifest m on o.manifest_id=m.id
where
(m.upload_date >= '2015-12-12 00:00:00') or (m.upload_date <='2015-12-12 00:00:00' and tu.status_type != 'D' and tu.status_type != 'XD')
The query joins the following 3 tables:
Order, Manifest and Tracking_Update.
The query will return orders that meet the following criteria:
- Orders < 30 days with any delivery status
- Orders > 30 days and not a delivery status
An order is considered delivered if the latest tracking_update for that order has a status_type of 'D' or 'XD'
Now, the orders are listed in the Order table. The Order table has a column called manifest_id which references the Manifest that was created when the order was uploaded.
A manifest can have multiple orders. This is what is used to determine if an order has been uploaded in the past 30 days.
Finally, the tracking_update table contains the tracking_updates per order. An order can have multiple tracking_updates.
Currently, the tracking_update table is over 1M records long.
Listed below are the create statements for each table:
CREATE TABLE "order" (
"id" int(11) NOT NULL AUTO_INCREMENT,
"ShipmentId" varchar(50) DEFAULT NULL,
"RecipientName" varchar(160) DEFAULT NULL,
"CompanyName" varchar(160) DEFAULT NULL,
"Address1" varchar(160) DEFAULT NULL,
"Address2" varchar(160) DEFAULT NULL,
"City" varchar(50) DEFAULT NULL,
"State" varchar(3) DEFAULT NULL,
"ZIP" int(11) DEFAULT NULL,
"TEL" int(11) DEFAULT NULL,
"Email" varchar(255) DEFAULT NULL,
"Bottles" int(11) DEFAULT NULL,
"Weight" float DEFAULT NULL,
"Resi" tinyint(1) DEFAULT NULL,
"Wave" int(11) DEFAULT NULL,
"url_slug" varchar(50) DEFAULT NULL,
"manifest_id" int(11) DEFAULT NULL,
"shipment_date" datetime DEFAULT NULL,
"tracking_number" varchar(30) DEFAULT NULL,
"shipping_carrier" varchar(10) DEFAULT NULL,
"last_tracking_update" datetime DEFAULT NULL,
"delivery_date" datetime DEFAULT NULL,
"customer_code" varchar(50) DEFAULT NULL,
"sub_cust_code" int(11) DEFAULT NULL,
PRIMARY KEY ("id"),
UNIQUE KEY "ShipmentID" ("ShipmentId"),
KEY "manifest_id" ("manifest_id"),
KEY "order_idx3" ("tracking_number"),
KEY "order_idx4" ("customer_code"),
CONSTRAINT "order_ibfk_1" FOREIGN KEY ("manifest_id") REFERENCES "manifest" ("id")
);
Tracking_Update table:
CREATE TABLE "tracking_update" (
"id" int(11) NOT NULL AUTO_INCREMENT,
"order_id" int(11) DEFAULT NULL,
"ship_update_date" datetime DEFAULT NULL,
"message" varchar(400) DEFAULT NULL,
"location" varchar(100) DEFAULT NULL,
"status_type" varchar(2) DEFAULT NULL,
"last_updated" datetime DEFAULT NULL,
"hash" varchar(32) DEFAULT NULL,
PRIMARY KEY ("id"),
KEY "order_id" ("order_id"),
KEY "tracking_update_idx2" ("status_type"),
KEY "tracking_update_idx3" ("ship_update_date"),
CONSTRAINT "tracking_update_ibfk_1" FOREIGN KEY ("order_id") REFERENCES "order" ("id")
);
Manifest table:
CREATE TABLE "manifest" (
"id" int(11) NOT NULL AUTO_INCREMENT,
"upload_date" datetime DEFAULT NULL,
"name" varchar(100) DEFAULT NULL,
"destination_gateway" varchar(40) DEFAULT NULL,
"arrived" tinyint(1) DEFAULT NULL,
"customer_code" varchar(50) DEFAULT NULL,
"upload_user" varchar(50) DEFAULT NULL,
"trip_id" int(11) DEFAULT NULL,
PRIMARY KEY ("id")
);
Here is also the EXPLAIN on the select statement exported with JSON:
{
"data":
[
{
"id": 1,
"select_type": "PRIMARY",
"table": "m",
"type": "ALL",
"possible_keys": "PRIMARY",
"key": null,
"key_len": null,
"ref": null,
"rows": 220,
"Extra": "Using where"
},
{
"id": 1,
"select_type": "PRIMARY",
"table": "o",
"type": "ref",
"possible_keys": "manifest_id",
"key": "manifest_id",
"key_len": "5",
"ref": "fgw247.m.id",
"rows": 246,
"Extra": "Using index"
},
{
"id": 1,
"select_type": "PRIMARY",
"table": "tu",
"type": "ref",
"possible_keys": "order_id",
"key": "order_id",
"key_len": "5",
"ref": "fgw247.o.id",
"rows": 7,
"Extra": "Using where"
},
{
"id": 1,
"select_type": "PRIMARY",
"table": "<derived2>",
"type": "ref",
"possible_keys": "<auto_key0>",
"key": "<auto_key0>",
"key_len": "11",
"ref": "fgw247.o.id,fgw247.tu.last_updated",
"rows": 13,
"Extra": "Using index"
},
{
"id": 2,
"select_type": "DERIVED",
"table": "tracking_update",
"type": "index",
"possible_keys": "order_id",
"key": "order_id",
"key_len": "5",
"ref": null,
"rows": 1388275,
"Extra": null
}
]
}
Any help is appreciated
UPDATE
This is the current query that I'm using which is MUCH faster:
SELECT
o.*, tu.*
FROM
fgw247.`order` o
JOIN
manifest m
ON
o.`manifest_id` = m.`id`
JOIN
`tracking_update` tu
ON
tu.`order_id` = o.`id` and tu.`ship_update_date` = (select max(last_updated) as last_updated from tracking_update where order_id = o.`id` group by order_id)
WHERE
m.`upload_date` >= '2015-12-14 11:50:12'
OR
(o.`delivery_date` IS NULL AND m.`upload_date` < '2015-12-14 11:50:12')
LIMIT 100
回答1:
Consdider the the following options to optimise the execution of your query:
Join types: are you sure that you need the left and right joins? Do you want to report on orders that are not even included in a manifest? If not, then use inner join instead of left and right.
Additional indexes: There is no index on manifest.upload_date field, you should add one. I would also create a composite index on order_id, update_date, and status_type fields (in this order!) on tracking_update table.
Use
union
instead of theor
criterion in where: Mysql is traditionally not really good at optimisingor
criterion inwhere
clauses. So turn the(m.upload_date >= '2015-12-12 00:00:00') or (m.upload_date <='2015-12-12 00:00:00' and tu.status_type != 'D' and tu.status_type != 'XD')
into a union with 2 queries. The 2 queries will be the same as the existing query up to the where part. The 1st query will have only the(m.upload_date >= '2015-12-12 00:00:00')
condition in the where clause, while the 2nd one will have the(m.upload_date <='2015-12-12 00:00:00' and tu.status_type != 'D' and tu.status_type != 'XD')
criteria.
If you add new indexes, then pls check if the query uses them by running a new explain.
来源:https://stackoverflow.com/questions/34725091/how-to-optimize-the-sql-query-that-joins-3-table