As it is not possible to update data within a table in BigQuery, and supports only append mechanism, I have decided to create new tables on monthly basis. So suppose for yea
2017 update:
With BigQuery #standardSQL - you can either use standard UNION ALL
to go through multiple tables, or you can use a *
to match all tables that share the same prefix. When using the *
matcher, you will also have access to the meta-column _TABLE_SUFFIX
- to know which table the rows came from.
SELECT * FROM Roster
UNION ALL
SELECT * FROM TeamMascot
One SQL query can reference multiple tables. Just separate each table with a comma in the FROM clause to query across all mentioned tables.
Here is a snippet demonstrating an example of the multiple table select:
SELECT trafficSource.medium AS Traffic_Source, COUNT(trafficSource.medium) AS Counts_Source
FROM [608XXXXX.ga_sessions_20131008],
[608XXXXX.ga_sessions_20131009],
[608XXXXX.ga_sessions_20131010],
[608XXXXX.ga_sessions_20131011],
[608XXXXX.ga_sessions_20131012],
[608XXXXX.ga_sessions_20131013],
[608XXXXX.ga_sessions_20131014],
[608XXXXX.ga_sessions_20131015],
GROUP BY Traffic_Source
ORDER BY Counts_Source DESC
You can also use a Table Wildcard Function. Here's one example from the docs for StandardSQL:
SELECT
name
FROM
mydata.people
WHERE
age >= 35
AND
(_TABLE_SUFFIX BETWEEN '20140325' AND '20140327')
And here's a similar example for LegacySQL (docs).
SELECT
name
FROM
(TABLE_DATE_RANGE([mydata.people],
TIMESTAMP('2014-03-25'),
TIMESTAMP('2014-03-27')))
WHERE
age >= 35
This will query the tables:
There are a few other options on the docs. I'd recommend checking them out.
Use a wildcard.
SELECT trafficSource.medium AS Traffic_Source, COUNT(trafficSource.medium) AS Counts_Source
FROM `608XXXXX.ga_sessions_201310*`
GROUP BY Traffic_Source
ORDER BY Counts_Source DESC