问题
I have a table that records a row for each unique user per day with some aggregated stats for that user on that day, and I need to produce a report that tells me for each day, the no. of unique users in the last 30 days including that day.
eg.
- for Aug 31st, it'll count the unique users from Aug 2nd to Aug 31st
- for Aug 30th, it'll count the unique users from Aug 1st to Aug 30th
- and so on...
I've looked at some related questions but they aren't quite what I need - if a user logs in on multiple days in the last 30 days he should be counted only once, so I can't just sum the DAU count for the last 30 days.
Bigquery SQL for sliding window aggregate
BigQuery SQL for 28-day sliding window aggregate (without writing 28 lines of SQL)
So far, my ideas are to either:
- write a simple script that'll execute a separate BigQuery for each of the relevant days
- write a BigQuery UDF that'll execute basically the same query for each day selected from another query
but I've not found any examples on how to execute another BigQuery query inside an UDF, or if it's possible at all.
回答1:
I need to produce a report that tells me for each day, the no. of unique users in the last 30 days including that day.
Below should do this
SELECT
calendar_day,
EXACT_COUNT_DISTINCT(userID) AS unique_users
FROM (
SELECT calendar_day, userID
FROM YourTable
CROSS JOIN (
SELECT DATE(DATE_ADD('2016-08-08', pos - 1, "DAY")) AS calendar_day
FROM (
SELECT ROW_NUMBER() OVER() AS pos, *
FROM (FLATTEN((
SELECT SPLIT(RPAD('', 1 + DATEDIFF('2016-09-08', '2016-08-08'), '.'),'') AS h
FROM (SELECT NULL)),h
)))
) AS calendar
WHERE DATEDIFF(calendar_day, dt) BETWEEN 0 AND 29
)
GROUP BY calendar_day
ORDER BY calendar_day DESC
It assumes YourTable has userID and dt fields (like below for example)
dt userID
2016-09-08 1
2016-09-08 2
...
And you can control:
- reporting dates range by changing respectively 2016-08-08
and 2016-09-08
- aggregation size by changing 29
in BETWEEN 0 AND 29
来源:https://stackoverflow.com/questions/39399930/bigquery-is-it-possible-to-execute-another-query-inside-an-udf