SQL Server - conditional aggregation with correlation

本秂侑毒 提交于 2019-12-03 05:08:07

There is an easier solution:

SELECT c.Customer, c."User", c."Revenue",
       1.0 * Revenue/ NULLIF(c2.sum_total, 0) AS percentage,
       1.0 * c2.sum_running / NULLIF(c2.sum_total, 0) AS running_percentage
FROM t c CROSS APPLY
     (SELECT SUM(c2.Revenue) AS sum_total,
             SUM(CASE WHEN c2.Revenue >= x.Revenue THEN c2.Revenue ELSE 0 END) 
                 as sum_running
      FROM t c2 CROSS JOIN
           (SELECT c.REVENUE) x
      WHERE c."User" = c2."User"
     ) c2
ORDER BY "User", Revenue DESC;

I am not sure why or if this limitation is in the SQL '92 standard. I did have it pretty well memorized 20 or so years ago, but I don't recall that particular limitation.

I should note:

  • At the time of the SQL 92 standard, lateral joins were not really on the radar. Sybase definitely had no such concept.
  • Other databases do have problems with outer references. In particular, they often limit the scoping to one level deep.
  • The SQL Standard itself tends to highly political (that is, vendor-driven) rather than driven by actual database user requirements. Well, over time, it does move in the right direction.

There is no such limitation in the SQL standard for LATERAL. CROSS APPLY is a vendor-specific extension from Microsoft (Oracle adopted it later for compatibility) and its limitations are obviously not owed to the ISO/IEC SQL standard, since the MS feature pre-dates the standard.

LATERAL in standard SQL is basically just a modifier for joins to allow lateral references in the join tree. There is no limit to the number of columns that can be referenced.

I wouldn't see a reason for the odd restriction to begin with. Maybe it's because CROSS APPLY was originally intended to allow table-valued functions, which was later extended to allow sub-SELECTs.

The Postgres manual explains LATERAL like this:

The LATERAL key word can precede a sub-SELECT FROM item. This allows the sub-SELECT to refer to columns of FROM items that appear before it in the FROM list. (Without LATERAL, each sub-SELECT is evaluated independently and so cannot cross-reference any other FROM item.)

The Postgres version of your query (without the more elegant window functions) can be simpler:

SELECT c.*
     , round(revenue        / c2.sum_total, 2) END AS percentage 
     , round(c2.running_sum / c2.sum_total, 2) END AS running_percentage 
FROM   t c, LATERAL (
   SELECT NULLIF(SUM(revenue), 0)::numeric AS sum_total  -- NULLIF, cast once
        , SUM(revenue) FILTER (WHERE revenue >= c.revenue) AS running_sum
   FROM   t
   WHERE  "User" = c."User"
   ) c2
ORDER  BY c."User", c.revenue DESC;
  • Postgres 9.4+ has the more elegant aggregate FILTER for conditional aggregates.

  • NULLIF makes sense, I only suggest a minor simplification.

  • Cast sum_total to numeric once.

  • Round result to match your desired result.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!