How to select distinct values in Google Bigquery?
Query:
SELECT DISTINCT cc_info
FROM user
WHERE date = ?
Thanks!
SELECT cc_info
FROM user
WHERE date = ?
GROUP BY cc_info
For all who have come to find the DISTINCT method in BigQuery, and who needs to use unique field feature for tables having large columns, using GROUP BY
as mentioned by tning won't be possible.
As of 2020, BigQuery has DISTINCT modifier. You need to wrap your query as:
SELECT DISTINCT usr.cc_info
FROM (
SELECT *
FROM user
WHERE date = ?
) usr
This could be very handy for people shifting from other SQL products.
Try using group by
SELECT cc_info
FROM user
WHERE date = ?
group by cc_info
SELECT COUNT(DISTINCT cc_info)
FROM user
WHERE date = ?
is NOT the right query, because DISTINCT
is a statistical approximation and is not guaranteed to be exact. See https://cloud.google.com/bigquery/docs/reference/legacy-sql#countdistinct
So better approach is
select EXACT_COUNT_DISTINCT(cc_info) from user where date = ?
Simply use group by,
SELECT cc_info
FROM user
WHERE date = ?
GROUP BY cc_info
If you want to COUNT over DISTINCT values you can use,
SELECT COUNT(DISTINCT cc_info)
FROM user
WHERE date = ?
This is another way to achieve your goal (in case anyone else needs it) which works in the current BigQuery 2020.
SELECT colname FROM table1
UNION DISTINCT
SELECT colname FROM table2
UNION DISTINCT
.
.
.
SELECT colname FROM tableN
My reference was this article.