I need to update this table in SQL Server with data from its \'parent\' table, see below:
Table: sale
id (int)
udid
You'll get the best performance if you forget the where clause and place all conditions in the ON expression.
I think this is because the query first has to join the tables then runs the where clause on that, so if you can reduce what is required to join then that's the fasted way to get the results/do the udpate.
You have a table of users. They can log in using their username or email or account_number. These accounts can be active (1) or inactive (0). This table has 50000 rows
You then have a table of users to disable at one go because you find out they've all done something bad. This table however, has one column with usernames, emails and account numbers mixed. It also has a "has_run" indicator which needs to be set to 1 (true) when it has been run
UPDATE users User
INNER JOIN
blacklist_users BlacklistUser
ON
(
User.username = BlacklistUser.account_ref
OR
User.email = BlacklistedUser.account_ref
OR
User.phone_number = BlacklistUser.account_ref
AND
User.is_active = 1
AND
BlacklistUser.has_run = 0
)
SET
User.is_active = 0,
BlacklistUser.has_run = 1;
If we had to join on just the OR conditions it would essentially need to check each row 4 times to see if it should join, and potentially return a lot more rows. However, by giving it more conditions it can "skip" a lot of rows if they don't meet all the conditions when joining.
It's more readable. All the conditions are in one place and the rows to update are in one place