SQL — How is DISTINCT so fast without an index?

心在旅途 2020-12-22 12:02

I have a SQLite database with a table called 'links' that has 600 million rows in it. There are 2 columns in the table: a "src" column and a "dest" column. At prese

2 Answers
  • 2020-12-22 12:40

    Think about it. With no ordering applied, the engine can return results in scan order. It only has to keep a record of the values seen so far (more likely an efficient structure such as a b-tree than a plain list). If a given value isn't found in that structure, it is returned and added to it. There is no need at all to compare each row with all the other rows.
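
To make the bookkeeping idea concrete, here is a minimal sketch in Python. It is not how SQLite is implemented internally; a hash set stands in for the b-tree, and the function name `stream_distinct` and the sample data are hypothetical, chosen only for illustration.

```python
from typing import Hashable, Iterable, Iterator


def stream_distinct(values: Iterable[Hashable]) -> Iterator[Hashable]:
    """Yield each value the first time it appears, in scan order."""
    seen = set()  # bookkeeping structure; a hash set stands in for the b-tree
    for value in values:
        if value not in seen:
            seen.add(value)
            yield value  # emitted right away, before the scan has finished


# Hypothetical sample of the "src" column, in table-scan order.
src_column = ["a", "b", "a", "c", "b", "a"]
distinct_src = list(stream_distinct(src_column))
print(distinct_src)
```

Each row costs one lookup in `seen`, so the work grows with the number of rows scanned rather than with the number of row pairs, and the first distinct value is available as soon as the first row has been read.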

  • 2020-12-22 12:57

    To be more accurate, one query is not quicker than the other: the time taken until the query completes should be the same for both. The difference is that the DISTINCT query simply has more rows to return, so it appears to respond faster because you are receiving rows at a high rate. Under the hood, both queries perform the same table scan. The DISTINCT query also keeps a data structure recording what has already been returned and uses it to filter out duplicates. It should therefore actually take longer to complete, but its (rows returned)/time is larger since there are simply more rows that match. (Also note: some viewers add a query result limit, which can make the DISTINCT query appear to run faster, since it hits the limit and stops.)
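
The gap between "first row arrives" and "query completes" can be illustrated with a small simulation. This is a sketch under assumptions: the second query is assumed to be a selective filter on one rare value, the data is made up, and the helper names `rows_scanned_until_first_result` and `make_distinct_filter` are hypothetical. It counts rows scanned rather than measuring time.

```python
from typing import Callable, Iterable, Optional


def rows_scanned_until_first_result(
    rows: Iterable[int], emits: Callable[[int], bool]
) -> Optional[int]:
    """Return how many rows a scan reads before it emits its first result."""
    for scanned, row in enumerate(rows, start=1):
        if emits(row):
            return scanned
    return None  # the scan finished without emitting anything


def make_distinct_filter() -> Callable[[int], bool]:
    """Build a predicate that accepts a value only the first time it is seen."""
    seen = set()

    def emits(value: int) -> bool:
        if value in seen:
            return False
        seen.add(value)
        return True

    return emits


# Made-up data: 100,000 rows cycling through 1,000 values,
# followed by a single rare value at the very end of the table.
rows = [i % 1000 for i in range(100_000)] + [-1]

first_distinct = rows_scanned_until_first_result(rows, make_distinct_filter())
first_rare = rows_scanned_until_first_result(rows, lambda v: v == -1)
print(first_distinct, first_rare)
```

In this setup the DISTINCT scan emits a result after reading one row, while the selective filter has to read the whole table before it emits anything. Both scans still have to read every row before they can finish.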
