问题
I have two related tables (unnecessary columns not listed):
LOCATION
VENUE_ID - NUMBER(38,0)
VISIT
ID - NUMBER(38,0)
VENUE_ID - NUMBER(38,0)
DEVICE_ID - VARCHAR(16777216)
The tables are related such that visits are associated with a location via VENUE_ID
.
I'm attempting to get the count of unique device ids by location, so I wrote the following query:
SELECT "d"."VENUE_ID"
, (
SELECT COUNT(*)
FROM (
SELECT DISTINCT "f0"."DEVICE_ID"
FROM "MAIN"."VISIT" AS "f0"
WHERE "d"."VENUE_ID" = "f0"."VENUE_ID"
) AS "t")
FROM "MAIN"."LOCATION" AS "d"
Unfortunately, this query resulted in the cryptic error SQL compilation error: Unsupported subquery type cannot be evaluated
.
Through a bit of experimentation, I've found that I can get the query to return without error, but only if I add an additional (useless) subquery prior to the existing one in the SELECT:
SELECT "d"."VENUE_ID"
-- New Useless Subquery
, (
SELECT COUNT(*)
FROM "MAIN"."VISIT" AS "f"
WHERE "d"."VENUE_ID" = "f"."VENUE_ID")
--
, (
SELECT COUNT(*)
FROM (
SELECT DISTINCT "f0"."DEVICE_ID"
FROM "MAIN"."VISIT" AS "f0"
WHERE "d"."VENUE_ID" = "f0"."VENUE_ID"
) AS "t")
FROM "MAIN"."LOCATION" AS "d"
If I move the new subquery to anywhere in the select after the distinct subquery, the error returns. I've reviewed the documentation on subqueries in Snowflake and either I am not understanding how that applies to my query here or I'm facing undocumented behavior. Anyone have any idea what's going on here?
回答1:
I think you're making this more complex than this needs to be. Below should be all you need:
SELECT l.venue_id
, count(distinct v.device_id)
FROM location l
LEFT JOIN visit v
on l.venue_id = v.venue_id
GROUP BY l.venue_id
回答2:
The answer is a little cryptic, but what happens is this:
You are asking for ONE value and you need to guarantee that only ONE value is returned by your subquery. A distinct clause cannot guarantee that. In some databases that will work as long as the data returns one row, but the moment you get two rows then the database will throw an error.
Snowflake is strict on its subquery analysis. So you need to use a subquery that is guarantee to return always one value, for example select sum(..), select count(..)
来源:https://stackoverflow.com/questions/64578561/snowflake-unsupported-subquery-for-distinct-column-order-matters