I am a little confused about how SQL server achieves less reads and performance improvements using common table expressions and ROW_NUMBER. Why doesn\'t the table actualized
The CTE
is not (necessarily) "actualized". It's not that it will inevitably copy all rows elsewhere and will perform other operations over the copy (though it may behave so it the optimizer decides it's better).
If we take this simple query:
SELECT *
FROM (
SELECT *,
ROW_NUMBER() OVER (ORDER BY id) rn
FROM mytable
) q
WHERE rn BETWEEN 101 AND 110
and look at its plan we'll see something like this:
|--Filter(WHERE:([Expr1003]>=(101) AND [Expr1003]<=(110)))
|--Top(TOP EXPRESSION:(CASE WHEN (110) IS NULL OR (110)<(0) THEN (0) ELSE (110) END))
|--Sequence Project(DEFINE:([Expr1003]=row_number))
|--Segment
|--Clustered Index Scan(OBJECT:([ee].[dbo].[mytable].[PK__mytable__3213E83F29C2D227]), ORDERED FORWARD)
Here, the records are scanned (in id
order as the table is clustered on id
), assigned the ROW_NUMBER
(this is what Sequence Project
does) and passed on to TOP
which just halts execution when a certain threshold is reached (110
records in our case).
Those 110 records are passed to Filter
which only passes the records with rn
greater than 100.
The query itself only scans 110
records:
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 1 ms.
(строк обработано: 10)
Table 'mytable'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
in 3 pages.
Now let's see the unpaginated query:
SELECT *
FROM mytable
ORDER BY
id
This one is pretty simple: read everything from the table and spit it out.
|--Clustered Index Scan(OBJECT:([ee].[dbo].[mytable].[PK__mytable__3213E83F29C2D227]), ORDERED FORWARD)
However, looking easily does not mean done easily. The table is quite large and we need to do many a read to return all records:
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
(строк обработано: 1310720)
Table 'mytable'. Scan count 1, logical reads 2765, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 266 ms, elapsed time = 11690 ms.
So, in a nutshell, the pagination query just knows when to stop.