Is it better to use INNER JOIN or EXISTS to find belonging to several in m2m relation?

前端未结

Given m2m relation: items-categories I have three tables:

items,
categories and
ite

4 answers

旧时难觅i

2021-02-18 15:22

A JOIN is more efficient, generally speaking.

However, one thing to be aware of is that joins can produce duplicate rows in your output. For example, if item id 123 was in categories 1 and 3, the first JOIN would result in two rows for id 123. If item id 999 was in categories 1, 3, 7, 8, 12, and 66, you would get eight rows for 999 in your results (2*2*2).

Duplicate rows are something you need to be aware of and handle. In this case, you could just use SELECT DISTINCT id .... Eliminating duplicates can get more complicated with a complex query, though.
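The duplicate-row effect described above can be reproduced with a small sketch. It uses Python's built-in sqlite3 module and a hypothetical schema (`items`, `item_categories`); the table names, the sample data, and the `IN (...)` join conditions are assumptions made for illustration, since the original queries are not shown here.

```python
import sqlite3

# Hypothetical schema and data, chosen to mirror the ids used in the answer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY);
    CREATE TABLE item_categories (
        item_id INTEGER,
        category_id INTEGER,
        PRIMARY KEY (item_id, category_id)
    );
    INSERT INTO items VALUES (123), (999);
    INSERT INTO item_categories VALUES
        (123, 1), (123, 3),
        (999, 1), (999, 3), (999, 7), (999, 8), (999, 12), (999, 66);
""")

# A single join matching two categories yields two rows per matching item.
first_join_sql = """
    SELECT i.id FROM items i
    JOIN item_categories a ON a.item_id = i.id AND a.category_id IN (1, 3)
"""

# Three such joins multiply the matches: 2 * 2 * 2 rows for item 999.
join_sql = """
    SELECT i.id FROM items i
    JOIN item_categories a ON a.item_id = i.id AND a.category_id IN (1, 3)
    JOIN item_categories b ON b.item_id = i.id AND b.category_id IN (7, 8)
    JOIN item_categories c ON c.item_id = i.id AND c.category_id IN (12, 66)
"""

# EXISTS only tests for a match, so each item appears at most once.
exists_sql = """
    SELECT i.id FROM items i
    WHERE EXISTS (SELECT 1 FROM item_categories a
                  WHERE a.item_id = i.id AND a.category_id IN (1, 3))
      AND EXISTS (SELECT 1 FROM item_categories b
                  WHERE b.item_id = i.id AND b.category_id IN (7, 8))
      AND EXISTS (SELECT 1 FROM item_categories c
                  WHERE c.item_id = i.id AND c.category_id IN (12, 66))
"""

def ids(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]

first_join_rows = ids(first_join_sql)
join_rows = ids(join_sql)
distinct_rows = ids(join_sql.replace("SELECT i.id", "SELECT DISTINCT i.id", 1))
exists_rows = ids(exists_sql)

print("first join:", sorted(first_join_rows))
print("three joins:", join_rows)
print("with DISTINCT:", distinct_rows)
print("with EXISTS:", exists_rows)
```

In this sketch the plain JOIN needs DISTINCT to collapse the repeated ids, while the EXISTS form returns each item once without it.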

[愿得一人]

2021-02-18 15:23

You are using a JOIN in Option A and a sub-query in Option B. The difference is:

In most cases JOINs are faster than sub-queries and it is very rare for a sub-query to be faster.

With JOINs, the RDBMS can create an execution plan that suits your query, predict which data needs to be loaded for processing, and so save time. A sub-query, depending on the optimizer, may instead be run separately with all of its data loaded before processing.

The good thing about sub-queries is that they are more readable than JOINs, which is why most people new to SQL prefer them; it is the easy way. When it comes to performance, though, JOINs are better in most cases, and they are not hard to read either.

生来不讨喜

2021-02-18 15:37

OPTION A

JOIN has an advantage over EXISTS because it uses the indices more efficiently, especially with large tables.

终归单人心

2021-02-18 15:37

select distinct `user_posts_id`
from `user_posts_boxes`
where `user_id` = 5
  and exists (select * from `box`
              where `user_posts_boxes`.`box_id` = `box`.`id`
                and `status` in ("A","F"))
order by `user_posts_id` desc
limit 200;

select distinct `user_posts_id`
from `user_posts_boxes`
INNER JOIN box on box.id = `user_posts_boxes`.`box_id`
  and box.`status` in ("A","F")
  and box.user_id = 5
order by `user_posts_id` desc
limit 200

I tried both queries, and the first one (EXISTS) runs faster for me. Both tables hold large datasets: "user_posts_boxes" has about 4 million rows and "box" about 1.5 million.

The first query took 0.147 ms; the second took roughly 0.5 to 0.9 ms.

Note that my tables use InnoDB and have physical relationships (foreign keys) defined between them.

So I would go with EXISTS, but it also depends on how your database is structured.
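Since the outcome depends on the schema and data, one way to decide is to time both forms yourself. The following is only a rough sketch using Python's sqlite3 module with small synthetic data. The column layout of `user_posts_boxes` and `box` is an assumption, both variants filter on `user_posts_boxes.user_id` so that they return the same rows, and timings from SQLite will not match those of MySQL/InnoDB.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
# Hypothetical table layout; only the names come from the queries above.
conn.executescript("""
    CREATE TABLE box (id INTEGER PRIMARY KEY, status TEXT);
    CREATE TABLE user_posts_boxes (
        user_posts_id INTEGER,
        box_id INTEGER REFERENCES box(id),
        user_id INTEGER
    );
    CREATE INDEX idx_upb_user ON user_posts_boxes (user_id, box_id);
""")

# Deterministic synthetic data: 1000 boxes, 5000 link rows.
statuses = ["A", "F", "X"]
conn.executemany(
    "INSERT INTO box VALUES (?, ?)",
    [(i, statuses[i % 3]) for i in range(1, 1001)],
)
conn.executemany(
    "INSERT INTO user_posts_boxes VALUES (?, ?, ?)",
    [(i % 3000, (i * 7) % 1000 + 1, i % 10) for i in range(5000)],
)

exists_sql = """
    SELECT DISTINCT user_posts_id FROM user_posts_boxes
    WHERE user_id = 5
      AND EXISTS (SELECT 1 FROM box
                  WHERE box.id = user_posts_boxes.box_id
                    AND box.status IN ('A', 'F'))
    ORDER BY user_posts_id DESC LIMIT 200
"""
join_sql = """
    SELECT DISTINCT user_posts_id FROM user_posts_boxes
    INNER JOIN box ON box.id = user_posts_boxes.box_id
                  AND box.status IN ('A', 'F')
    WHERE user_posts_boxes.user_id = 5
    ORDER BY user_posts_id DESC LIMIT 200
"""

def timed(sql):
    """Return the fetched rows and the elapsed wall-clock seconds."""
    start = time.perf_counter()
    rows = conn.execute(sql).fetchall()
    return rows, time.perf_counter() - start

exists_rows, exists_secs = timed(exists_sql)
join_rows, join_secs = timed(join_sql)

print(f"EXISTS: {len(exists_rows)} rows in {exists_secs:.6f} s")
print(f"JOIN:   {len(join_rows)} rows in {join_secs:.6f} s")
```

A single run is noisy, so on a real database it is worth repeating each query several times and also inspecting the execution plan before drawing conclusions.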
