SQL - Unequal left join BigQuery

£可爱£侵袭症+ 提交于 2019-12-04 05:51:12

问题


New here. I am trying to get the Daily and Weekly active users over time. they have 30 days before they are considered inactive. My goal is to create graph's that can be split by user_id to show cohorts, regions, categories, etc.

I have created a date table to get every day for the time period and I have the simplified orders table with the base info that I need to calculate this.

I am trying to do a Left Join to get the status by date using the following SQL Query:

WITH daily_use AS (
        SELECT
          __key__.id AS user_id
          , DATE_TRUNC(date(placeOrderDate), day) AS activity_date
        FROM `analysis.Order`
        where isBuyingGroupOrder = TRUE 
          AND testOrder = FALSE
        GROUP BY 1, 2
 ),
dates AS (
        SELECT DATE_ADD(DATE "2016-01-01", INTERVAL d.d DAY) AS date
        FROM
          (
           SELECT ROW_NUMBER() OVER(ORDER BY __key__.id) -1 AS d
           FROM `analysis.Order`
           ORDER BY __key__.id
           LIMIT 1096
          ) AS  d
        ORDER BY 1 DESC
      )

SELECT
      daily_use.user_id
    , wd.date AS date
    , MIN(DATE_DIFF(wd.date, daily_use.activity_date, DAY)) AS days_since_last_action
FROM dates AS wd

LEFT JOIN daily_use
    ON wd.date >= daily_use.activity_date
    AND wd.date < DATE_ADD(daily_use.activity_date, INTERVAL 30 DAY)

GROUP BY 1,2

I am getting this Error: LEFT OUTER JOIN cannot be used without a condition that is an equality of fields from both sides of the join. In BigQuery and was wondering how can I go around this. I am using Standard SQL within BigQuery.

Thank you


回答1:


Below is for BigQuery Standard SQL and mostly reproduce logic in your query with exception of not including days where no activity at all is found

#standardSQL
SELECT
    daily_use.user_id
  , wd.date AS DATE
  , MIN(DATE_DIFF(wd.date, daily_use.activity_date, DAY)) AS days_since_last_action
FROM dates AS wd
CROSS JOIN daily_use
WHERE wd.date BETWEEN 
  daily_use.activity_date AND DATE_ADD(daily_use.activity_date, INTERVAL 30 DAY)
GROUP BY 1,2
-- ORDER BY 1,2

if for whatever reason you still need to exactly reproduce your logic - you can embrace above with final left join as below:

#standardSQL
SELECT *
FROM dates AS wd
LEFT JOIN (
  SELECT
    daily_use.user_id
    , wd.date AS date
    , MIN(DATE_DIFF(wd.date, daily_use.activity_date, DAY)) AS days_since_last_action
  FROM dates AS wd
  CROSS JOIN daily_use
  WHERE wd.date BETWEEN 
    daily_use.activity_date AND DATE_ADD(daily_use.activity_date, INTERVAL 30 DAY)
  GROUP BY 1,2
) AS daily_use
USING (date)
-- ORDER BY 1,2


来源:https://stackoverflow.com/questions/46370728/sql-unequal-left-join-bigquery

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!