TOP slows down query

前端未结

关注

 3  1911

I have a pivot query on a table with millions of rows. Running the query normally, it runs in 2 seconds and returns 2983 rows. If I add TOP 1000 to the query it takes 10 secon

相关标签:

3条回答

后悔当初

2021-01-21 18:50

After doing some googling about suggesting an execution plan, I found the solution.

SELECT TOP 1000 * 
FROM (SELECT l.PatientID,
               l.LabID,
               l.Result
          FROM dbo.Labs l
          JOIN (SELECT MAX(LabDate) maxDate, 
                       PatientID, 
                       LabID 
                  FROM dbo.Labs 
              GROUP BY PatientID, LabID) s ON l.PatientID = s.PatientID
                                          AND l.LabID = s.LabID
                                          AND l.LabDate = s.maxDate) A
PIVOT(MIN(A.Result) FOR A.LabID IN ([1],[2],[3],[4],[5],[7],[8],[9],[10],[11],[12],[13],[14],[15],[16],[17])) p
OPTION (HASH JOIN)

OPTION (HASH JOIN) being the thing. The resulting execution plan for the version with TOP looks like the original non-top one, with a TOP tacked on at the end.

Since I was originally doing this in a view what I actually ended up doing was changing JOIN to INNER HASH JOIN

0 讨论(0)

被撕碎了的回忆

2021-01-21 18:55

SELECT  TOP 1000
        *
FROM    (
        SELECT  patientId, labId, result,
                DENSE_RANK() OVER (PARTITION BY patientId, labId ORDER BY labDate DESC) dr
        FROM    labs
        ) q
PIVOT   (
        MIN(result)
        FOR
        labId IN ([1],[2],[3],[4],[5],[7],[8],[9],[10],[11],[12],[13],[14],[15],[16],[17])
        ) p
WHERE   dr = 1
ORDER BY
        patientId

You may also try creating an indexed view like this:

CREATE VIEW
        v_labs_patient_lab
WITH SCHEMABINDING
AS
SELECT  patientId, labId, COUNT_BIG(*) AS cnt
FROM    dbo.labs
GROUP BY
        patientId, labId

CREATE UNIQUE CLUSTERED INDEX
        ux_labs_patient_lab
ON      v_labs_patient_lab (patientId, labId)

and use it in the query:

SELECT  TOP 1000
        *
FROM    (
        SELECT  lr.patientId, lr.labId, lr.result
        FROM    v_labs_patient_lab vl
        CROSS APPLY
                (
                SELECT TOP 1 WITH TIES
                       result
                FROM   labs l
                WHERE  l.patientId = vl.patientId
                       AND l.labId = vl.labId
                ORDER BY
                       l.labDate DESC
                ) lr
        ) q
PIVOT   (
        MIN(result)
        FOR
        labId IN ([1],[2],[3],[4],[5],[7],[8],[9],[10],[11],[12],[13],[14],[15],[16],[17])
        ) p
ORDER BY
        patientId

0 讨论(0)

独厮守ぢ

2021-01-21 18:57
There is a specific order in which queries are processed.

A normal SQL query will be written as follows:
```
SELECT [...]
  FROM [table1]
  JOIN [table2]
    ON [condition]
 WHERE [...]
 GROUP BY [...]
HAVING [...]
 ORDER BY [...]
```
But the processing order is different:
```
FROM [table1]
    ON [condition]
  JOIN [table2]
 WHERE [...]
 GROUP BY [...]
HAVING [...]
SELECT [...]
 ORDER BY [...]
```
When using SELECT DISTINCT [...] or SELECT TOP [...] the processing order will be as follows:
```
FROM [table1]
    ON [condition]
  JOIN [table2]
 WHERE [...]
 GROUP BY [...]
HAVING [...]
SELECT [...] DISTINCT[...]
ORDER BY [...]
TOP [....]
```
Hence it's taking longer as your SELECT TOP 1000 is processed last.

Take a look at this link for further details: http://blogs.msdn.com/b/sqlqueryprocessing/
0 讨论(0)
发布评论:

提交评论
- 加载中...