Dynamic Column Names in BigQuery SQL Query

Posted by …衆ロ難τιáo~ on 2021-02-11 13:58:10

Question


I have a BigQuery table in which each row records a user's visit to a place. The schema looks something like this:

UserID   |   Place   |   StartDate   |   EndDate   | etc ...
---------------------------------------------------------------
134      |  Paris    |   234687432   |   23648949  | etc ...
153      |  Bangkok  |   289374897   |   2348709   | etc ...
134      |  Paris    |   9287324892  |   3435438   | etc ...

The "Place" column has at most a few dozen distinct values, but I don't know them all in advance.

I want to query this table so that the result has one column per distinct value of the "Place" column, and each cell holds the total number of visits by that user to that place. The end result should look like this:

UserID | Paris | Bangkok | Rome | London | Rivendell | Alderaan 
----------------------------------------------------------------
134    |  2    |  0      |  0   |  0     |  0        |  0 
153    |  0    |  1      |  0   |  0     |  0        |  0

I guess I can select all the possible values of "Place" with SELECT DISTINCT, but how can I produce a result table with this structure?

Thanks


Answer 1:


The following is for BigQuery Standard SQL.

Step 1 - dynamically assemble the SQL statement with all possible values of the "Place" field

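#standardSQL
-- Emit the full pivot query as a single string, with one COUNTIF column per distinct Place.
SELECT '''
SELECT UserID,''' || STRING_AGG(DISTINCT
  -- Spaces in Place are replaced with underscores so the alias is a valid column name.
  ' COUNTIF(Place = "' || Place || '") AS ' || REPLACE(Place, ' ', '_')
) || ''' FROM `project.dataset.table`
GROUP BY UserID
'''
FROM `project.dataset.table`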

Note: the output is a single row containing text like the following (split across multiple lines here for readability):

SELECT UserID, 
COUNTIF(Place = "Paris") AS Paris, 
COUNTIF(Place = "Los Angeles") AS Los_Angeles 
FROM `project.dataset.table` 
GROUP BY UserID

Note: I replaced Bangkok with Los Angeles to show why spaces must be replaced with underscores in the column aliases.

Step 2 - copy the output text of Step 1 and run it as a query

You can automate the two steps above using any client of your choice.
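
As a rough illustration of that kind of automation, here is a minimal Python sketch that builds the Step 1 statement on the client side from a list of Place values. It is not part of the original answer: the names build_pivot_sql and column_alias are hypothetical, the list of places is assumed to come from a separate SELECT DISTINCT Place query, and the call that actually submits the statement to BigQuery is left out. Unlike the REPLACE in Step 1, the alias helper below also replaces characters other than spaces, which is an extra assumption about what the data might contain.

```python
import re


def column_alias(place: str) -> str:
    """Turn a Place value into a column alias (hypothetical helper)."""
    # Replace each run of characters other than ASCII letters, digits, and
    # underscores with a single underscore, then trim underscores at the ends.
    alias = re.sub(r"[^A-Za-z0-9_]+", "_", place).strip("_")
    # Avoid an empty alias or one that starts with a digit.
    if not alias or alias[0].isdigit():
        alias = "_" + alias
    return alias


def build_pivot_sql(places, table="project.dataset.table"):
    """Assemble the pivot statement from an iterable of Place values."""
    columns = []
    for place in sorted(set(places)):
        # Escape backslashes and double quotes so the literal stays well formed.
        literal = place.replace("\\", "\\\\").replace('"', '\\"')
        columns.append(f'COUNTIF(Place = "{literal}") AS {column_alias(place)}')
    select_list = ",\n  ".join(["UserID"] + columns)
    return f"SELECT\n  {select_list}\nFROM `{table}`\nGROUP BY UserID"


# Assumed input: the distinct places, here with a duplicate to show deduplication.
print(build_pivot_sql(["Paris", "Los Angeles", "Paris"]))
```

Two different Place values can map to the same alias under this scheme (for example "New York" and "New_York"), so a real script would need to detect and resolve such collisions before running the generated statement.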




Answer 2:


If you just want to count the places, you can use countif():

select userid,
       countif(place = 'Paris') as paris,
       countif(place = 'Bangkok') as bangkok,
       countif(place = 'Rome') as rome,
       . . .
from t
group by userid;
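
To make the semantics of this query concrete, the following Python sketch reproduces the same per-user counting in memory, using the three sample rows from the question. It is only an analogue for checking the expected shape of the result, not BigQuery code; the function name pivot_counts and the chosen list of places are assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Sample (UserID, Place) pairs taken from the question's example table.
visits = [(134, "Paris"), (153, "Bangkok"), (134, "Paris")]


def pivot_counts(rows, places):
    """Count visits per user for each listed place, like COUNTIF ... GROUP BY userid."""
    per_user = defaultdict(Counter)
    for user_id, place in rows:
        per_user[user_id][place] += 1
    # A Counter returns 0 for a missing key, matching COUNTIF when no row matches.
    return {user: [counts[p] for p in places] for user, counts in per_user.items()}


result = pivot_counts(visits, ["Paris", "Bangkok", "Rome"])
print(result)  # {134: [2, 0, 0], 153: [0, 1, 0]}
```

Each list holds the counts in the order Paris, Bangkok, Rome, which corresponds to the first three columns of the result table the question asks for.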


Source: https://stackoverflow.com/questions/61710854/dynamic-column-names-in-bigquery-sql-query
