bigquery-standard-sql

BigQuery check for array overlap

守給你的承諾、 submitted on 2019-12-02 05:56:17

Question: I'm writing a BigQuery query and need to check whether any of a number of strings are present as elements in one of the table's columns, where that column itself contains arrays of strings. For context, the query runs as part of a small automated Python job and uses standard SQL. I couldn't find anything that explicitly checks for array inclusion here: https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators So I came up with a solution that employs a pretty hacky regex, specifically: ...other query
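Instead of a regex over the array's string representation, a common pattern is an EXISTS subquery over UNNEST. As a sketch from the Python side of such a job (the table name, `tags` column, and `@targets` parameter are all hypothetical, not from the question):

```python
# Sketch: an UNNEST/EXISTS overlap test as the SQL the job would send,
# plus the equivalent local Python check. Names are illustrative.
TARGETS = ["a", "b"]

QUERY = """
SELECT *
FROM `my_project.my_dataset.my_table`
WHERE EXISTS (
  SELECT 1 FROM UNNEST(tags) AS tag
  WHERE tag IN UNNEST(@targets)
)
"""

def arrays_overlap(column_values, targets):
    """True if any element of the array column appears in targets."""
    return not set(column_values).isdisjoint(targets)

print(arrays_overlap(["x", "b", "y"], TARGETS))  # True
print(arrays_overlap(["x", "y"], TARGETS))       # False
```

The `@targets` query parameter would be bound when the job submits the query; the point of the sketch is that the overlap test lives in SQL rather than in a regex over the serialized array.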

BigQuery: dynamically select table by the current date in Standard SQL?

半世苍凉 submitted on 2019-12-01 01:45:23

Question: I am trying to select the table for the current date. SELECT * FROM `da`.`m`.`ga_realtime_20190306` works, but SELECT * FROM `da`.`m`.`CONCAT('ga_realtime_', FORMAT_DATE('%Y%m%d', CURRENT_DATE())` does not. How can I dynamically select a table with CURRENT_DATE in BigQuery Standard SQL?

Answer 1: A table name cannot be built from an expression, but you can use a wildcard table and filter on _TABLE_SUFFIX:

SELECT field1, field2, field3
FROM `my_dataset.ga_realtime_*`
WHERE _TABLE_SUFFIX = FORMAT_DATE('%Y%m%d', CURRENT_DATE())

Source: https://stackoverflow.com/questions/57817881/bigquery-dynamically-select-table-by-the-current-date-in-standard-sql
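The same suffix can also be computed on the client if the query is assembled in code. A minimal sketch, assuming the wildcard-table layout from the answer (dataset and field names are illustrative):

```python
# Sketch: build the _TABLE_SUFFIX filter in Python, mirroring
# FORMAT_DATE('%Y%m%d', CURRENT_DATE()). Names are illustrative.
from datetime import date

def realtime_table_query(day: date) -> str:
    suffix = day.strftime("%Y%m%d")  # e.g. "20190306"
    return (
        "SELECT field1, field2, field3 "
        "FROM `my_dataset.ga_realtime_*` "
        f"WHERE _TABLE_SUFFIX = '{suffix}'"
    )

print(realtime_table_query(date(2019, 3, 6)))
```

Computing the suffix in SQL (as in the answer) keeps the query self-contained; computing it in Python is handy when the job also needs the date for logging or backfills.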

How to convert stringified array into array in BigQuery?

最后都变了- submitted on 2019-11-30 12:29:46

Question: It so happens I have a stringified array in a BigQuery field, '["a","b","c"]', and I want to convert it to an array that BigQuery understands. I want to be able to do this in standard SQL:

with k as (select '["a","b","c"]' as x) select x from k, unnest(x) x

I have tried JSON_EXTRACT('["a","b","c"]','$') and everything else I could find online. Any ideas?

Answer 1: Below is for BigQuery Standard SQL

#standardSQL
WITH k AS (
  SELECT 1 AS id, '["a","b","c"]' AS x UNION ALL
  SELECT 2, '["x","y"]'
)
SELECT id,
  ARRAY(SELECT * FROM UNNEST(SPLIT(SUBSTR(x, 2, LENGTH(x) - 2)))) AS x
FROM k

It transforms your
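To see exactly what the SUBSTR/SPLIT answer produces, the same string manipulation can be replicated locally. Note that the elements keep their surrounding double quotes, which a JSON parse would strip; this is a sketch of the mechanics, not of any BigQuery client API:

```python
# Sketch: replicate SPLIT(SUBSTR(x, 2, LENGTH(x) - 2)) in Python.
# The split keeps each element's surrounding double quotes; parsing
# the string as JSON removes them instead.
import json

def split_substr(x: str) -> list:
    # SUBSTR(x, 2, LENGTH(x) - 2) drops the leading '[' and trailing ']',
    # then SPLIT(...) breaks the remainder on commas.
    return x[1:-1].split(",")

s = '["a","b","c"]'
print(split_substr(s))   # ['"a"', '"b"', '"c"']
print(json.loads(s))     # ['a', 'b', 'c']
```

If the quote characters matter downstream, an extra TRIM(x, '"') per element (or a JSON-based extraction) would be needed on the BigQuery side as well.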

Query Failed Error: Resources exceeded during query execution: The query could not be executed in the allotted memory

杀马特。学长 韩版系。学妹 submitted on 2019-11-28 03:53:34

Question: I am using standard SQL. Even though it's a basic query, it still throws the error above. Any suggestions, please?

SELECT fullVisitorId,
  CONCAT(CAST(fullVisitorId AS string), CAST(visitId AS string)) AS session,
  date, visitStartTime, hits.time, hits.page.pagepath
FROM `XXXXXXXXXX.ga_sessions_*`, UNNEST(hits) AS hits
WHERE _TABLE_SUFFIX BETWEEN "20160801" AND "20170331"
ORDER BY fullVisitorId, date, visitStartTime

Answer 1: The only way for this query to work is by removing the ordering applied in the end:
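A global ORDER BY over an eight-month unnested result has to be assembled in one place, which is what exhausts the allotted memory. If the ordering is still needed downstream, one workaround is to drop it from the query and sort the fetched rows client-side. A minimal sketch with illustrative in-memory rows (not a claim about any particular client library):

```python
# Sketch: apply the removed ORDER BY fullVisitorId, date, visitStartTime
# client-side, on rows already fetched from the unordered query.
rows = [
    {"fullVisitorId": "B", "date": "20160802", "visitStartTime": 10},
    {"fullVisitorId": "A", "date": "20160801", "visitStartTime": 20},
    {"fullVisitorId": "A", "date": "20160801", "visitStartTime": 5},
]

rows.sort(key=lambda r: (r["fullVisitorId"], r["date"], r["visitStartTime"]))
print([r["visitStartTime"] for r in rows])  # [5, 20, 10]
```

This only helps when the result set fits in the client's memory; for truly large results, ordering per-export-shard or per-partition is the usual compromise.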

BigQuery: Deleting Duplicates in Partitioned Table

﹥>﹥吖頭↗ submitted on 2019-11-27 14:11:45

Question: I have a BQ table that is partitioned by insert time, and I'm trying to remove duplicates from it. These are true duplicates: for two duplicate rows, all columns are equal (of course, having a unique key might have been helpful :-( ). At first I tried a SELECT query to enumerate the duplicates and remove them:

SELECT * EXCEPT(row_number)
FROM (
  SELECT *, ROW_NUMBER() OVER (PARTITION BY id_column) row_number
  FROM `mytable`)
WHERE row_number = 1

This results in unique rows but creates a new table that
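The ROW_NUMBER() pattern in the question is just "keep the first row seen per key". Mirrored in Python for clarity (`id_column` is the question's name; the sample rows are illustrative):

```python
# Sketch: the ROW_NUMBER() OVER (PARTITION BY id_column) ... = 1 dedup,
# mirrored in Python: keep only the first row per partitioning key.
def dedupe(rows, key="id_column"):
    seen = set()
    out = []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            out.append(row)
    return out

rows = [
    {"id_column": 1, "v": "a"},
    {"id_column": 1, "v": "a"},  # true duplicate
    {"id_column": 2, "v": "b"},
]
print(dedupe(rows))  # [{'id_column': 1, 'v': 'a'}, {'id_column': 2, 'v': 'b'}]
```

The catch the question runs into is not the dedup logic but where the result lands: a plain SELECT writes a fresh table, losing the original's time partitioning, so the rewrite has to preserve the partition column.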

Rolling 90 days active users in BigQuery, improving preformance (DAU/MAU/WAU)

六月ゝ 毕业季﹏ submitted on 2019-11-27 09:10:49

Question: I'm trying to get the number of unique events on a specific date, rolling 90/30/7 days back. I've got this working on a limited number of rows with the query below, but for large data sets I get memory errors from the aggregated string, which becomes massive. I'm looking for a more effective way of achieving the same result. The table looks something like this:

+---+------------+-------------+
|   | date       | userid      |
+---+------------+-------------+
| 1 | 2013-05-14 | xxxxx       |
| 2 | 2017-03-14 | xxxxx       |
| 3 | 2018-01-24 | xxxxx       |
| 4 | 2013-03-21 | xxxxx       |
| 5 | 2014-03-19 | xxxxx       |
| 6 | 2015-09-03 |
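The metric being asked for, independent of any particular SQL formulation, is "distinct userids in the N days ending at a given date". A small Python model of it, with illustrative sample events:

```python
# Sketch: rolling N-day distinct-user count (DAU/WAU/MAU-style), the
# quantity the query is after. Events below are illustrative.
from datetime import date, timedelta

events = [
    (date(2018, 1, 1), "u1"),
    (date(2018, 1, 2), "u2"),
    (date(2018, 1, 2), "u1"),
    (date(2018, 1, 9), "u3"),
]

def rolling_active(events, as_of, window_days):
    """Distinct users seen in the window_days-day window ending at as_of."""
    start = as_of - timedelta(days=window_days - 1)
    return len({uid for day, uid in events if start <= day <= as_of})

print(rolling_active(events, date(2018, 1, 2), 7))   # 2  (u1, u2)
print(rolling_active(events, date(2018, 1, 9), 7))   # 1  (u3)
print(rolling_active(events, date(2018, 1, 9), 90))  # 3
```

The memory blowup in the question comes from materializing the per-window user set as one aggregated string per date; approaches that avoid carrying the full set per row (e.g. approximate distinct counts, or a date-by-user dedup before windowing) scale better.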
