Question
I would like some help with the following join. I have one table (with about 20 million rows) that consists of:
MemberId (Primary Key) | Id (Primary Key) | TransactionDate | Balance
I would like to get the latest Balance for every customer in one query. I know I could do something like the following (written from memory), but it is terribly slow:
SELECT *
FROM money
WHERE money.Id = (SELECT MAX(Id)
                  FROM money AS m
                  WHERE m.MemberId = money.MemberId)
Are there any other (faster/smarter) options?
Answer 1:
In all the optimization tutorials and screencasts I've sat through, joins are favoured over subqueries. A correlated subquery like the one in the question can be executed once for every row of the outer query, whereas the grouped subquery in the join below is evaluated only once and then joined back to the table.
SELECT *
FROM money m
INNER JOIN (
    SELECT memberId, MAX(id) AS maxid
    FROM money
    GROUP BY memberId
) mmax ON mmax.maxid = m.id AND mmax.memberId = m.memberId
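As a sanity check that the join returns the same rows as the query in the question, here is a minimal sketch using Python's built-in sqlite3 module. The five sample rows and the TEXT/REAL column types are made up for illustration, and the outer table is aliased as o only to make the correlation explicit. A toy table like this says nothing about speed on 20 million rows; it only shows the two queries agree.

```python
import sqlite3

# Toy stand-in for the real table; column names follow the question,
# the column types and the sample rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE money (
        MemberId INTEGER NOT NULL,
        Id INTEGER NOT NULL,
        TransactionDate TEXT,
        Balance REAL,
        PRIMARY KEY (MemberId, Id)
    )
""")
conn.executemany(
    "INSERT INTO money VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2012-01-01", 10.0),
        (1, 2, "2012-01-02", 25.0),
        (2, 3, "2012-01-01", 5.0),
        (2, 4, "2012-01-03", 7.5),
        (3, 5, "2012-01-02", 0.0),
    ],
)

# Correlated subquery from the question: MAX(Id) is looked up per outer row.
correlated = conn.execute("""
    SELECT o.MemberId, o.Id, o.Balance
    FROM money AS o
    WHERE o.Id = (SELECT MAX(m.Id)
                  FROM money AS m
                  WHERE m.MemberId = o.MemberId)
    ORDER BY o.MemberId
""").fetchall()

# Join from this answer: one grouped pass, then a join back to the table.
joined = conn.execute("""
    SELECT m.MemberId, m.Id, m.Balance
    FROM money m
    INNER JOIN (
        SELECT MemberId, MAX(Id) AS maxid
        FROM money
        GROUP BY MemberId
    ) mmax ON mmax.maxid = m.Id AND mmax.MemberId = m.MemberId
    ORDER BY m.MemberId
""").fetchall()

print(joined)  # one row per member, the one with the highest Id
```

Both result lists hold exactly one row per member. Whether the join is actually faster on the real table depends on the database engine and its optimizer, so it is worth comparing the execution plans there.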
Answer 2:
JOINing is not the best way to go about this. Consider using a GROUP BY clause to sift out the last transaction for each member, like this:
SELECT MemberId, MAX(Id), TransactionDate, Balance FROM money GROUP BY MemberId
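UPDATE
As PKK pointed out, TransactionDate and Balance are neither grouped nor aggregated in that query, so they are not guaranteed to come from the row with the highest Id, and many databases reject such a query outright. It looks like you'll have to perform some sort of join after all. Consider this option, which matches on Id alone and so assumes Id values are unique across the whole table: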
SELECT MemberId, Id, TransactionDate, Balance FROM money WHERE Id IN (
SELECT MAX(Id) FROM money GROUP BY MemberId
)
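One thing to check before relying on the IN form: it compares Id alone, while the question lists MemberId and Id together as the primary key, so Id might repeat across members. The sketch below uses Python's built-in sqlite3 module with hypothetical data in which Id restarts at 1 for each member (an assumption, since the question does not say how Id is generated) to show what happens in that case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE money (
        MemberId INTEGER NOT NULL,
        Id INTEGER NOT NULL,
        TransactionDate TEXT,
        Balance REAL,
        PRIMARY KEY (MemberId, Id)
    )
""")
# Hypothetical data: Id is unique only within a member.
conn.executemany(
    "INSERT INTO money VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2012-01-01", 10.0),
        (1, 2, "2012-01-02", 25.0),
        (1, 3, "2012-01-03", 40.0),
        (2, 1, "2012-01-01", 5.0),
        (2, 2, "2012-01-04", 7.5),
    ],
)

# IN form from this answer: the subquery yields the Id values {3, 2},
# and any row whose Id is 2 or 3 matches, whichever member it belongs to.
in_rows = conn.execute("""
    SELECT MemberId, Id, Balance FROM money WHERE Id IN (
        SELECT MAX(Id) FROM money GROUP BY MemberId
    )
    ORDER BY MemberId, Id
""").fetchall()

# Matching on MemberId as well as Id keeps one row per member.
pair_rows = conn.execute("""
    SELECT m.MemberId, m.Id, m.Balance
    FROM money m
    INNER JOIN (
        SELECT MemberId, MAX(Id) AS maxid
        FROM money
        GROUP BY MemberId
    ) mmax ON mmax.MemberId = m.MemberId AND mmax.maxid = m.Id
    ORDER BY m.MemberId
""").fetchall()

print(in_rows)    # includes member 1's older row with Id 2
print(pair_rows)  # only the latest row of each member
```

If Id is a table-wide auto-increment, the two queries return the same rows and the IN form is fine as written.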
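Answer 3:
Another option is a left join of the table to itself that pairs each row with every later row of the same member, then keeps only the rows for which no later row exists (the joined side is NULL):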
SELECT m1.*
FROM money m1
LEFT JOIN money m2 ON m2.memberId = m1.memberId AND m2.id > m1.id
WHERE m2.memberId IS NULL
But of course Umbrella's answer is better.
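To make the NULL trick in this answer concrete, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows and column types are made up for illustration. It first prints the raw left join, so the rows that find no later partner are visible, and then applies the IS NULL filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE money (
        MemberId INTEGER NOT NULL,
        Id INTEGER NOT NULL,
        TransactionDate TEXT,
        Balance REAL,
        PRIMARY KEY (MemberId, Id)
    )
""")
# Hypothetical sample rows: three for member 1, one for member 2.
conn.executemany(
    "INSERT INTO money VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2012-01-01", 10.0),
        (1, 2, "2012-01-02", 25.0),
        (1, 3, "2012-01-03", 40.0),
        (2, 4, "2012-01-01", 5.0),
    ],
)

# Unfiltered left join: each row is paired with every later row of the
# same member; a row with no later row gets NULL (None) on the m2 side.
pairs = conn.execute("""
    SELECT m1.MemberId, m1.Id, m2.Id
    FROM money m1
    LEFT JOIN money m2 ON m2.MemberId = m1.MemberId AND m2.Id > m1.Id
    ORDER BY m1.MemberId, m1.Id, m2.Id
""").fetchall()

# Keeping only the unmatched rows leaves the latest row per member.
latest = conn.execute("""
    SELECT m1.MemberId, m1.Id, m1.Balance
    FROM money m1
    LEFT JOIN money m2 ON m2.MemberId = m1.MemberId AND m2.Id > m1.Id
    WHERE m2.MemberId IS NULL
    ORDER BY m1.MemberId
""").fetchall()

print(pairs)
print(latest)
```

The unfiltered join produces four rows for the three-row member: a member with n rows yields n(n-1)/2 matched pairs plus one unmatched row. That growth may be one reason this form compares poorly with the grouped join when members have many transactions, although an optimizer that recognises the anti-join pattern does not necessarily build all of those pairs.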
Source: https://stackoverflow.com/questions/8713476/join-table-on-itself-performance