问题
Hello I asked a question this morning, and I realized that the problem was not where I was looking (here the original question)
I have this query to randomly generate registries from an address book.
SELECT * FROM address_book ab
WHERE
ab.source = "PB" AND
ab.city_id = :city_id AND
pb_campaign_id = :pb_campaign_id AND
ab.id NOT IN (SELECT address_book_id FROM calls WHERE calls.address_book_id = ab.id AND calls.status_id IN ("C","NO") OR (calls.status_id IN ("NR","OC") AND TIMESTAMPDIFF(MINUTE,calls.updated_at,NOW()) < 30))
ORDER BY RAND()
LIMIT 1';
but I noticed that "order by rand ()" take more than 50s and use up to 25-50% CPU with large tables (100k +) so i looked for solutions here but i didn't find anything that worked. note: ids are not self-incrementing, there may be gaps
Any idea?
回答1:
I would recommend writing this as:
SELECT *
FROM address_book ab
WHERE ab.source = 'PB' AND
ab.city_id = :city_id AND
pb_campaign_id = :pb_campaign_id AND
NOT EXISTS (SELECT 1
FROM calls c
WHERE c.address_book_id = ab.id AND
( c.status_id IN ('C', 'NO') OR
(c.status_id IN ('NR', 'OC') AND c.updated < now() - interval 30 minute)
)
)
ORDER BY RAND()
LIMIT 1;
Note that this changes the logic in the correlated subquery so c.address_book_id = ab.id
always applies. I suspect that is the issue with performance.
Then, create indexes on:
address_book(source, city_id, campaign_id, id)
calls(address_book_id, status_id, updated)
I am guessing that this will be sufficient to improve performance. If there happen to be a zillion rows that match the conditions, then the order by rand()
might be an issue.
回答2:
- I will never suggest for sub query in huge DB its take long execution time.
- Use proper indexing and if its require use inner join(never use left join)
- if possible use your business logic in php script because maybe your db will more large and take too much time for execute such query.
- if you want only one data in large db don't use
rand()
function, take any rand number (1 to db rows count) and use limitlimit skip,number
ex.limit 2,1
its give row 3 only Hope its useful.
来源:https://stackoverflow.com/questions/64266163/problems-with-order-by-rand-and-big-tables