Querying multiple tables in BigQuery

小蘑菇 2021-01-04 02:10

Since BigQuery does not allow updating data within a table and supports only an append mechanism, I have decided to create new tables on a monthly basis. So suppose for yea

5 answers
  • 2021-01-04 02:50

    2017 update:

    With BigQuery #standardSQL, you can either use a standard UNION ALL to combine multiple tables, or use a * wildcard to match all tables that share the same prefix. When using the * wildcard, you also have access to the pseudo column _TABLE_SUFFIX, which tells you which table each row came from.

    SELECT * FROM Roster
    UNION ALL
    SELECT * FROM TeamMascot
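
    The wildcard form with _TABLE_SUFFIX could look roughly like the sketch below. The dataset and table names (mydataset.events_201701 through mydataset.events_201712, one table per month) are hypothetical and only stand in for monthly tables like the ones described in the question.

    ```sql
    #standardSQL
    -- Hypothetical monthly tables: mydataset.events_201701 ... mydataset.events_201712
    SELECT
      _TABLE_SUFFIX AS table_month,  -- the part of the table name matched by *
      COUNT(*) AS row_count
    FROM `mydataset.events_2017*`    -- backticks are required around a wildcard table
    WHERE _TABLE_SUFFIX BETWEEN '01' AND '06'  -- January through June only
    GROUP BY table_month
    ORDER BY table_month
    ```

    Filtering on _TABLE_SUFFIX limits the scan to the matching tables, so the other months should not be read.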
    
  • 2021-01-04 02:57

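    One SQL query can reference multiple tables. In legacy SQL, separate the tables with commas in the FROM clause to query across all of them; the comma there acts as UNION ALL. In standard SQL a comma is a cross join, so use UNION ALL or a wildcard table instead.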
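
    Note that the comma only combines rows in legacy SQL. As a rough sketch of the standard SQL equivalent, assuming two hypothetical tables mydataset.people_201401 and mydataset.people_201402 with the same schema:

    ```sql
    #standardSQL
    -- Legacy SQL form, for comparison (comma behaves like UNION ALL there):
    --   SELECT name FROM [mydataset.people_201401], [mydataset.people_201402]
    -- In standard SQL the comma would be a cross join, so spell out the union:
    SELECT name FROM `mydataset.people_201401`
    UNION ALL
    SELECT name FROM `mydataset.people_201402`
    ```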

  • 2021-01-04 03:01

    Here is a legacy SQL snippet that selects from multiple tables:

    SELECT trafficSource.medium AS Traffic_Source, COUNT(trafficSource.medium) AS Counts_Source
    FROM [608XXXXX.ga_sessions_20131008],
    [608XXXXX.ga_sessions_20131009],
    [608XXXXX.ga_sessions_20131010],
    [608XXXXX.ga_sessions_20131011],
    [608XXXXX.ga_sessions_20131012],
    [608XXXXX.ga_sessions_20131013],
    [608XXXXX.ga_sessions_20131014],
    [608XXXXX.ga_sessions_20131015]
    GROUP BY Traffic_Source
    ORDER BY Counts_Source DESC
    
  • 2021-01-04 03:02

    You can also use a wildcard table. Here's one example from the docs for standard SQL:

    SELECT 
      name
    FROM 
      `mydata.people*`
    WHERE 
      age >= 35
      AND
      (_TABLE_SUFFIX BETWEEN '20140325' AND '20140327')
    

    And here's a similar example for legacy SQL, using the TABLE_DATE_RANGE table wildcard function:

    SELECT 
      name
    FROM 
      (TABLE_DATE_RANGE([mydata.people], 
                    TIMESTAMP('2014-03-25'), 
                    TIMESTAMP('2014-03-27'))) 
    WHERE 
      age >= 35
    

    This will query the tables:

    • mydata.people20140325
    • mydata.people20140326
    • mydata.people20140327

    There are a few other options in the docs. I'd recommend checking them out.

  • 2021-01-04 03:08

    Standard SQL.

    Use a wildcard.

    SELECT trafficSource.medium AS Traffic_Source, COUNT(trafficSource.medium) AS Counts_Source
    FROM `608XXXXX.ga_sessions_201310*`
    GROUP BY Traffic_Source
    ORDER BY Counts_Source DESC
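
    The wildcard above matches every ga_sessions table for October 2013. To restrict it to the eight daily tables listed in the legacy SQL answer (20131008 through 20131015), one option is to filter on _TABLE_SUFFIX. This is a sketch that assumes the daily tables exist with those names:

    ```sql
    #standardSQL
    SELECT trafficSource.medium AS Traffic_Source, COUNT(trafficSource.medium) AS Counts_Source
    FROM `608XXXXX.ga_sessions_201310*`
    -- _TABLE_SUFFIX holds whatever * matched, here the two-digit day
    WHERE _TABLE_SUFFIX BETWEEN '08' AND '15'
    GROUP BY Traffic_Source
    ORDER BY Counts_Source DESC
    ```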
    