Using NEWID() with CTE to produce random subset of rows produces odd results

情话喂你 2021-01-15 04:42

I'm writing some SQL in a stored procedure to reduce a dataset to a limited random number of rows that I want to report on.

The report starts with a Group

1 answer
  •  礼貌的吻别
    2021-01-15 05:15

    It is not deterministic how many times the SELECT statement involving NEWID() will be executed.

    If you get a nested loops anti semi join between QueryResults and cte_temp, and there is no spool in the plan, the CTE will likely be re-evaluated as many times as there are rows in QueryResults. This means that for each outer row, the set being compared against with NOT IN may be entirely different.
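
    The question's own query is not shown here, so the following is only a hypothetical reconstruction of the pattern described above: a CTE named cte_temp that samples with ORDER BY NEWID() and is consumed by a DELETE with NOT IN. The temp table #QueryResults, its column types, and the sample data are assumptions added so the sketch can run on its own.

    ```sql
    -- Hypothetical stand-in for QueryResults; column types and data are assumptions.
    CREATE TABLE #QueryResults (UserId INT NOT NULL, GroupId INT NOT NULL);
    INSERT INTO #QueryResults (UserId, GroupId)
    VALUES (1, 10), (2, 10), (3, 10), (4, 10), (5, 10), (6, 20);

    DECLARE @GroupId INT = 10, @SampleLimit INT = 2;

    -- A non-recursive CTE is not materialized; it is inlined into the DELETE's plan,
    -- so the optimizer is free to evaluate the NEWID() sample more than once.
    WITH cte_temp AS
    (
        SELECT TOP (@SampleLimit) UserId
        FROM   #QueryResults
        WHERE  GroupId = @GroupId
        GROUP  BY UserId
        ORDER  BY NEWID()
    )
    DELETE FROM #QueryResults
    WHERE  GroupId = @GroupId
           AND UserId NOT IN (SELECT UserId FROM cte_temp);

    -- The count of remaining users may differ from @SampleLimit, depending on the plan.
    SELECT COUNT(DISTINCT UserId) AS RemainingUsers
    FROM   #QueryResults
    WHERE  GroupId = @GroupId;

    DROP TABLE #QueryResults;
    ```

    Whether the odd result actually appears depends on the plan the optimizer picks, so a small test table like this one may or may not reproduce it.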

    Instead of using a CTE you can materialize the results into a temporary table to avoid this.

    -- #T must already exist, with a UserId column of the matching type.
    INSERT INTO #T
    SELECT TOP(@SampleLimit) UserId
    FROM   QueryResults
    WHERE  ( GroupId = @GroupId )
    GROUP  BY UserId
    ORDER  BY NEWID();

    Then reference #T, rather than the CTE, in the DELETE.
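
    A minimal end-to-end sketch of the temp-table approach might look like the following. The original DELETE is not shown in the question, so its shape here is an assumption, as are the stand-in table #QueryResults, the INT column types, and the sample data.

    ```sql
    -- Hypothetical stand-in for QueryResults; column types and data are assumptions.
    CREATE TABLE #QueryResults (UserId INT NOT NULL, GroupId INT NOT NULL);
    INSERT INTO #QueryResults (UserId, GroupId)
    VALUES (1, 10), (1, 10), (2, 10), (3, 10), (4, 10), (5, 10), (6, 20);

    DECLARE @GroupId INT = 10, @SampleLimit INT = 2;

    -- Materialize the random sample once, so the sampled set is fixed
    -- before the DELETE runs.
    CREATE TABLE #T (UserId INT NOT NULL PRIMARY KEY);

    INSERT INTO #T (UserId)
    SELECT TOP (@SampleLimit) UserId
    FROM   #QueryResults
    WHERE  GroupId = @GroupId
    GROUP  BY UserId
    ORDER  BY NEWID();

    -- Delete every row in the group whose user was not sampled.
    -- NOT IN is safe here because #T.UserId is declared NOT NULL.
    DELETE FROM #QueryResults
    WHERE  GroupId = @GroupId
           AND UserId NOT IN (SELECT UserId FROM #T);

    -- Equals @SampleLimit whenever the group has at least that many users.
    SELECT COUNT(DISTINCT UserId) AS RemainingUsers
    FROM   #QueryResults
    WHERE  GroupId = @GroupId;

    DROP TABLE #T;
    DROP TABLE #QueryResults;
    ```

    Because #T is populated by a separate statement, every row of the DELETE is compared against the same stored set of UserId values, however the plan is shaped.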
