Background:
The original case was very simple. Calculate running total per user from highest revenue to lowest:
CREATE TABLE t(Customer INTEGER NOT NULL
There is an easier solution:
SELECT c.Customer, c."User", c."Revenue",
1.0 * Revenue/ NULLIF(c2.sum_total, 0) AS percentage,
1.0 * c2.sum_running / NULLIF(c2.sum_total, 0) AS running_percentage
FROM t c CROSS APPLY
(SELECT SUM(c2.Revenue) AS sum_total,
SUM(CASE WHEN c2.Revenue >= x.Revenue THEN c2.Revenue ELSE 0 END)
as sum_running
FROM t c2 CROSS JOIN
(SELECT c.REVENUE) x
WHERE c."User" = c2."User"
) c2
ORDER BY "User", Revenue DESC;
I am not sure why or if this limitation is in the SQL '92 standard. I did have it pretty well memorized 20 or so years ago, but I don't recall that particular limitation.
I should note:
There is no such limitation in the SQL standard for LATERAL
. CROSS APPLY
is a vendor-specific extension from Microsoft (Oracle adopted it later for compatibility) and its limitations are obviously not owed to the ISO/IEC SQL standard, since the MS feature pre-dates the standard.
LATERAL
in standard SQL is basically just a modifier for joins to allow lateral references in the join tree. There is no limit to the number of columns that can be referenced.
I wouldn't see a reason for the odd restriction to begin with. Maybe it's because CROSS APPLY
was originally intended to allow table-valued functions, which was later extended to allow sub-SELECT
s.
The Postgres manual explains LATERAL
like this:
The
LATERAL
key word can precede a sub-SELECT FROM
item. This allows the sub-SELECT
to refer to columns ofFROM
items that appear before it in theFROM
list. (WithoutLATERAL
, each sub-SELECT
is evaluated independently and so cannot cross-reference any otherFROM
item.)
The Postgres version of your query (without the more elegant window functions) can be simpler:
SELECT c.*
, round(revenue / c2.sum_total, 2) END AS percentage
, round(c2.running_sum / c2.sum_total, 2) END AS running_percentage
FROM t c, LATERAL (
SELECT NULLIF(SUM(revenue), 0)::numeric AS sum_total -- NULLIF, cast once
, SUM(revenue) FILTER (WHERE revenue >= c.revenue) AS running_sum
FROM t
WHERE "User" = c."User"
) c2
ORDER BY c."User", c.revenue DESC;
Postgres 9.4+ has the more elegant aggregate FILTER
for conditional aggregates.
NULLIF
makes sense, I only suggest a minor simplification.
Cast sum_total
to numeric
once.
Round result to match your desired result.