Snowflake: Unsupported subquery for DISTINCT - Column order matters?

走远了吗. Submitted on 2020-12-26 04:25:39

Question


I have two related tables (unnecessary columns not listed):

LOCATION

VENUE_ID - NUMBER(38,0)

VISIT

ID - NUMBER(38,0)
VENUE_ID - NUMBER(38,0)
DEVICE_ID - VARCHAR(16777216)

The tables are related such that visits are associated with a location via VENUE_ID.

I'm attempting to get the count of unique device ids by location, so I wrote the following query:

SELECT "d"."VENUE_ID"
    , (
      SELECT COUNT(*)
      FROM (
          SELECT DISTINCT "f0"."DEVICE_ID"
          FROM "MAIN"."VISIT" AS "f0"
          WHERE "d"."VENUE_ID" = "f0"."VENUE_ID"
      ) AS "t")
FROM "MAIN"."LOCATION" AS "d"

Unfortunately, this query failed with the cryptic error "SQL compilation error: Unsupported subquery type cannot be evaluated."

Through a bit of experimentation, I've found that I can get the query to return without error, but only if I add an additional (useless) subquery prior to the existing one in the SELECT:

SELECT "d"."VENUE_ID"

    -- New Useless Subquery
    , (
      SELECT COUNT(*)
      FROM "MAIN"."VISIT" AS "f"
      WHERE "d"."VENUE_ID" = "f"."VENUE_ID")
    --

    , (
      SELECT COUNT(*)
      FROM (
          SELECT DISTINCT "f0"."DEVICE_ID"
          FROM "MAIN"."VISIT" AS "f0"
          WHERE "d"."VENUE_ID" = "f0"."VENUE_ID"
      ) AS "t")
FROM "MAIN"."LOCATION" AS "d"

If I move the new subquery anywhere after the DISTINCT subquery in the SELECT list, the error returns. I've reviewed the documentation on subqueries in Snowflake, and either I'm not understanding how it applies to my query or I'm facing undocumented behavior. Does anyone have any idea what's going on here?


Answer 1:


I think you're making this more complex than it needs to be. The query below should be all you need:

SELECT l.venue_id
    , COUNT(DISTINCT v.device_id)
FROM location l
LEFT JOIN visit v
    ON l.venue_id = v.venue_id
GROUP BY l.venue_id
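
To sanity-check the join-based query, here is a minimal sketch that runs it against a tiny in-memory SQLite database from Python. SQLite is only a stand-in here, so this checks the logic of the query and says nothing about how Snowflake compiles it. The sample rows and the helper names (`build_sample_db`, `distinct_devices_per_venue`) are made up for illustration.

```python
import sqlite3


def build_sample_db():
    """Create an in-memory database with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE location (venue_id INTEGER PRIMARY KEY);
        CREATE TABLE visit (
            id INTEGER PRIMARY KEY,
            venue_id INTEGER,
            device_id TEXT
        );
        -- Venue 3 deliberately has no visits.
        INSERT INTO location (venue_id) VALUES (1), (2), (3);
        -- Device 'a' visits venue 1 twice and must be counted once.
        INSERT INTO visit (id, venue_id, device_id) VALUES
            (10, 1, 'a'), (11, 1, 'a'), (12, 1, 'b'), (13, 2, 'c');
        """
    )
    return conn


# Same shape as the query in the answer; ORDER BY is added only to make
# the output deterministic.
JOIN_QUERY = """
    SELECT l.venue_id, COUNT(DISTINCT v.device_id) AS device_count
    FROM location l
    LEFT JOIN visit v
        ON l.venue_id = v.venue_id
    GROUP BY l.venue_id
    ORDER BY l.venue_id
"""


def distinct_devices_per_venue(conn):
    """Return (venue_id, distinct device count) rows for every location."""
    return conn.execute(JOIN_QUERY).fetchall()


if __name__ == "__main__":
    print(distinct_devices_per_venue(build_sample_db()))
```

The LEFT JOIN keeps locations that have no visits, and because COUNT(DISTINCT ...) ignores NULLs, those locations report a count of 0 instead of disappearing from the result.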



Answer 2:


The answer is a little cryptic, but what happens is this:

You are asking for ONE value, so you need to guarantee that your subquery returns only ONE value. A DISTINCT clause cannot guarantee that. In some databases this will work as long as the data returns one row, but the moment you get two rows the database will throw an error.

Snowflake is strict in its subquery analysis, so you need to use a subquery that is guaranteed to always return exactly one value, for example SELECT SUM(..) or SELECT COUNT(..).
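
One way to read this advice is to put the aggregate directly in the correlated scalar subquery, so there is no nested DISTINCT derived table. The sketch below shows that shape and runs it on in-memory SQLite from Python. This is an assumption-laden illustration: the thread does not confirm that Snowflake accepts this exact form, SQLite only verifies that the counts come out right, and the sample rows and helper names (`make_conn`, `count_with_scalar_subquery`) are hypothetical.

```python
import sqlite3


def make_conn():
    """Create an in-memory database with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE location (venue_id INTEGER PRIMARY KEY);
        CREATE TABLE visit (
            id INTEGER PRIMARY KEY,
            venue_id INTEGER,
            device_id TEXT
        );
        INSERT INTO location (venue_id) VALUES (1), (2), (3);
        INSERT INTO visit (id, venue_id, device_id) VALUES
            (10, 1, 'a'), (11, 1, 'a'), (12, 1, 'b'), (13, 2, 'c');
        """
    )
    return conn


# The aggregate sits at the top level of the scalar subquery, so the
# subquery yields exactly one row per outer row by construction.
SCALAR_QUERY = """
    SELECT d.venue_id,
           (SELECT COUNT(DISTINCT f0.device_id)
            FROM visit AS f0
            WHERE f0.venue_id = d.venue_id) AS device_count
    FROM location AS d
    ORDER BY d.venue_id
"""


def count_with_scalar_subquery(conn):
    """Return (venue_id, distinct device count) using a scalar subquery."""
    return conn.execute(SCALAR_QUERY).fetchall()


if __name__ == "__main__":
    print(count_with_scalar_subquery(make_conn()))
```

On this sample data the result matches what a LEFT JOIN with GROUP BY would give, including a count of 0 for the venue with no visits.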



Source: https://stackoverflow.com/questions/64578561/snowflake-unsupported-subquery-for-distinct-column-order-matters
