I am working on a query to get the cumulative distinct count of uids on a daily basis.
Example: Say there are 2 uids (100, 200) that appeared on date 2016-11-01 and they also appea
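-- firstseen: one row per uid, carrying the earliest date on which it appears
WITH firstseen AS (
  SELECT uid, MIN(date) AS date
  FROM sample_table
  GROUP BY 1
)
-- running count of first-seen uids, ordered by date
SELECT DISTINCT date, COUNT(uid) OVER (ORDER BY date) AS daily_cumulative_count
FROM firstseen
ORDER BY 1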
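SELECT DISTINCT is needed because the window function returns one row per uid in firstseen, so each (date, daily_cumulative_count) pair would otherwise be repeated once for every uid first seen on that date.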
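Explanation: for each date dt, it counts the uids from the earliest date up to and including dt. Because we specify ORDER BY date without an explicit frame, the window defaults to RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, so all rows sharing the same date are peers and receive the same count.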
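To make the logic of the query above easier to check by hand, here is a minimal Python sketch of the same three steps: first-seen date per uid, new uids per date, then a running total. The sample rows and the function name `cumulative_distinct_counts` are made up for illustration and are not taken from the original table.

```python
from collections import Counter

# Hypothetical (uid, date) rows; ISO date strings sort chronologically.
rows = [
    (100, "2016-11-01"),
    (200, "2016-11-01"),
    (100, "2016-11-02"),
    (300, "2016-11-02"),
    (200, "2016-11-03"),
    (400, "2016-11-04"),
]


def cumulative_distinct_counts(rows):
    # Step 1 (the firstseen CTE): earliest date per uid.
    first_seen = {}
    for uid, date in rows:
        if uid not in first_seen or date < first_seen[uid]:
            first_seen[uid] = date

    # Step 2: how many uids were seen for the first time on each date.
    new_per_date = Counter(first_seen.values())

    # Step 3 (the windowed COUNT): running total in date order.
    result = []
    running = 0
    for date in sorted(new_per_date):
        running += new_per_date[date]
        result.append((date, running))
    return result


print(cumulative_distinct_counts(rows))
# [('2016-11-01', 2), ('2016-11-02', 3), ('2016-11-04', 4)]
```

Note that 2016-11-03 is missing from the result because no new uid first appears on that day. The SQL behaves the same way, since firstseen has no row with that date; if every calendar day is required, the result would need to be joined against a list of dates.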