Querying a Partitioned table in BigQuery using a reference from a joined table

Posted by 血红的双手。 on 2021-02-19 03:39:19

Question


I would like to run a query that prunes the partitions of table A using a value from table B. For example:

#standardSQL
select A.user_id
from my_project.xxx A
inner join my_project.yyy B
on A._partitiontime = timestamp(B.date)
where B.date = '2018-01-01'

This query scans all the partitions in table A and ignores the date I specified in the WHERE clause when deciding which partitions to read. I have tried writing the query in several different ways, but every version scanned all partitions in table A. Is there any way around this?

Thanks in advance.


Answer 1:


The documentation says this about your use case:

Express the predicate filter as closely as possible to the table identifier. Complex queries that require the evaluation of multiple stages of a query in order to resolve the predicate (such as inner queries or subqueries) will not prune partitions from the query.

The following query does not prune partitions (note the use of a subquery):

#standardSQL
SELECT
  t1.name,
  t2.category
FROM
  table1 t1
INNER JOIN
  table2 t2
ON
  t1.id_field = t2.field2
WHERE
  t1.ts = (SELECT timestamp from table3 where key = 2)
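By contrast, a filter that compares the partitioning column to a constant can be resolved before the query runs. A minimal sketch applied to the tables from the question, assuming A is an ingestion-time partitioned table and B.date is a DATE column (neither is confirmed in the question):

```sql
#standardSQL
-- Sketch only: table names are the placeholders from the question;
-- B.date is assumed to be a DATE column.
SELECT A.user_id
FROM my_project.xxx A
INNER JOIN my_project.yyy B
  ON A._PARTITIONTIME = TIMESTAMP(B.date)
WHERE B.date = '2018-01-01'
  -- Constant filter placed directly on the partitioned table,
  -- so it does not depend on evaluating the join first.
  AND A._PARTITIONTIME = TIMESTAMP('2018-01-01')
```

The date is repeated on purpose: the condition on B.date alone only reaches A through the join, which is the case the documentation describes as not pruning.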



Answer 2:


With BigQuery scripting (in Beta at the time of writing), there is a way to prune the partitions.

Basically, a scripting variable is defined to capture the dynamic part of the subquery. The subsequent query then uses that variable as a filter, so only the matching partitions are scanned.

-- _PARTITIONTIME is a TIMESTAMP, so the array elements must be TIMESTAMPs too.
DECLARE date_filter ARRAY<TIMESTAMP>
  DEFAULT (SELECT ARRAY_AGG(TIMESTAMP(date)) FROM B WHERE ...);

select A.user_id
from my_project.xxx A
inner join my_project.yyy B
on A._partitiontime = timestamp(B.date)
where A._partitiontime IN UNNEST(date_filter)
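When only one partition is needed, the same idea works with a scalar variable instead of an array. The following is a sketch under assumptions: the variable name latest_partition is hypothetical, B.date is assumed to be a DATE column, and picking the most recent date is just an illustrative choice for the dynamic part.

```sql
-- Sketch only: latest_partition is a hypothetical variable name,
-- and B.date is assumed to be a DATE column.
DECLARE latest_partition TIMESTAMP
  DEFAULT (SELECT TIMESTAMP(MAX(date)) FROM my_project.yyy);

-- The variable is already resolved when this statement starts,
-- so the filter behaves like a constant on the partitioning column.
SELECT A.user_id
FROM my_project.xxx A
WHERE A._PARTITIONTIME = latest_partition;
```

Both statements have to be submitted together as one script, since the variable only exists within the script that declares it.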


Source: https://stackoverflow.com/questions/51611522/querying-a-partitioned-table-in-bigquery-using-a-reference-from-a-joined-table
