SQL — How is DISTINCT so fast without an index?


I have a database with a table called 'links' with 600 million rows in it in SQLite. There are 2 columns in the database - a "src" column and a "dest" column. At prese


2 Answers

我寻月下人不归

2020-12-22 12:40

Think about it: with no ordering required, the engine can return results in scan order. It only has to keep a record of the values seen so far (more likely an efficient structure such as a B-tree than a plain list). If a given value isn't found in that structure, the row is returned and the value is added to it. There is absolutely no need to compare each row with all the other rows.
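The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration of the idea, not SQLite's actual implementation: the name streaming_distinct is hypothetical, and a Python set stands in for whatever structure the engine really uses.

```python
from typing import Hashable, Iterable, Iterator


def streaming_distinct(rows: Iterable[Hashable]) -> Iterator[Hashable]:
    """Yield each value the first time it appears, in scan order."""
    seen = set()  # stand-in for the engine's bookkeeping structure (assumption)
    for value in rows:
        if value not in seen:
            # First sighting: remember the value and emit it immediately.
            seen.add(value)
            yield value


scan = ["a", "b", "a", "c", "b", "a"]
result = list(streaming_distinct(scan))
print(result)  # ['a', 'b', 'c']
```

Each row costs one lookup in the bookkeeping structure, and the first distinct values can be handed back before the scan has finished.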

一整个雨季

2020-12-22 12:57

To be more accurate, one query is not quicker than the other; the time taken until each query completes should be about the same. The difference is that the query with DISTINCT simply has more rows to return, so it appears to respond faster because you are receiving rows at a higher rate. Under the hood, both perform the same table scan. The DISTINCT query additionally maintains a data structure that records what has already been returned and filters out duplicates, so it SHOULD actually take longer to complete, but (rows returned)/time is larger because more rows match. Also note that some viewers apply a query result limit, which can make the DISTINCT query appear to run faster, since it hits the limit and stops.
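One way to check the "same table scan" claim is to ask SQLite for its query plan. The sketch below uses Python's sqlite3 module with a tiny in-memory stand-in for the 'links' table; the sample rows and the helper name plan are made up for illustration, and the exact plan wording varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE links (src TEXT, dest TEXT)")  # no index on purpose
conn.executemany(
    "INSERT INTO links VALUES (?, ?)",
    [("a", "b"), ("a", "c"), ("b", "c"), ("a", "b")],  # made-up sample rows
)


def plan(sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]  # detail is the last column


plain_plan = plan("SELECT src FROM links")
distinct_plan = plan("SELECT DISTINCT src FROM links")
distinct_rows = conn.execute("SELECT DISTINCT src FROM links").fetchall()

print(plain_plan)
print(distinct_plan)
print(distinct_rows)
```

On the versions I would expect, both plans report a scan of links, and the DISTINCT plan typically adds a step for a temporary structure used to filter duplicates.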
