How to simulate a pivot table with BigQuery?

南笙 2020-12-06 07:28

I need to organize the results of a query in columns, as if it was a pivot table. How can I do that?

2 answers
  • 2020-12-06 08:07

    2020 update: fhoffa.x.pivot()

    • https://towardsdatascience.com/easy-pivot-in-bigquery-one-step-5a1f13c6c710

    Use conditional expressions to organize the results of a query into rows and columns. In the legacy SQL example below, the most-revised Wikipedia articles whose titles start with 'Google' are listed one per row, and each column shows the article's revision count when the title meets that column's condition, or 0 otherwise.

    SELECT
      page_title,
      /* Each column holds the revision count if the title meets the condition, otherwise 0 */
      IF(page_title CONTAINS 'search', INTEGER(total), 0) AS search,
      IF(page_title CONTAINS 'Earth' OR page_title CONTAINS 'Maps', INTEGER(total), 0) AS geo
    FROM
      /* Subselect to return top revised Wikipedia articles containing 'Google'
       * followed by additional text.
       */
      (SELECT
        TOP(title, 5) as page_title,
        COUNT(*) as total
       FROM
         [publicdata:samples.wikipedia]
       WHERE
         REGEXP_MATCH (title, r'^Google.+') AND wp_namespace = 0
      );
    

    Result:

    +---------------+--------+------+
    |  page_title   | search | geo  |
    +---------------+--------+------+
    | Google search |   4261 |    0 |
    | Google Earth  |      0 | 3874 |
    | Google Chrome |      0 |    0 |
    | Google Maps   |      0 | 2617 |
    | Google bomb   |      0 |    0 |
    +---------------+--------+------+
    

    A similar example, without using a subquery:

    SELECT SensorType, DATE(DTimestamp), AVG(data) avg
    FROM [data-sensing-lab:io_sensor_data.moscone_io13]
    WHERE DATE(DTimestamp) IN ('2013-05-16', '2013-05-17')
    GROUP BY 1, 2
    ORDER BY 2, 3 DESC;
    

    This generates a three-column table: sensor type, date, and average data. To "pivot" the result so that each date becomes its own column:

    SELECT
      SensorType,
      AVG(IF(DATE(DTimestamp) = '2013-05-16', data, null)) d16,
      AVG(IF(DATE(DTimestamp) = '2013-05-17', data, null)) d17
    FROM [data-sensing-lab:io_sensor_data.moscone_io13]
    GROUP BY 1
    ORDER BY 2 DESC;
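
    For comparison, here is a minimal sketch of the same date pivot in Standard SQL. It assumes DTimestamp is a TIMESTAMP column (so DATE(DTimestamp) returns a DATE) and that the table can still be read under the same project, dataset, and table names; neither assumption has been verified here.

    ```sql
    SELECT
      SensorType,
      -- Readings from other dates become NULL, which AVG ignores
      AVG(IF(DATE(DTimestamp) = DATE '2013-05-16', data, NULL)) AS d16,
      AVG(IF(DATE(DTimestamp) = DATE '2013-05-17', data, NULL)) AS d17
    FROM `data-sensing-lab.io_sensor_data.moscone_io13`
    WHERE DATE(DTimestamp) IN (DATE '2013-05-16', DATE '2013-05-17')
    GROUP BY SensorType
    ORDER BY d16 DESC;
    ```

    The WHERE clause is optional. It only drops sensor types that have no readings on either date; without it they would appear with NULL in both columns.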
    
  • 2020-12-06 08:31

    Same approach/result, but using BigQuery Standard SQL:

    -- top revised Wikipedia articles containing 'Google'
    WITH articles AS (
      SELECT title AS page_title,
             COUNT(*) AS total
        FROM `publicdata.samples.wikipedia`
       WHERE REGEXP_CONTAINS(title, r'^Google.+') AND wp_namespace = 0
       GROUP BY title
       ORDER BY total DESC
       LIMIT 5
    )
    
    SELECT page_title,
           -- Each column holds the revision count if the title meets the condition, otherwise 0
           IF(page_title LIKE '%search%', total, 0) AS search,
           IF(page_title LIKE '%Earth%' OR page_title LIKE '%Maps%', total, 0) AS geo
      FROM articles
    ;
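
    Current BigQuery Standard SQL also has a PIVOT operator, which turns the values of one column into output columns. Below is a sketch for the sensor data queried in the other answer. It assumes DTimestamp is a TIMESTAMP and that the table is readable from Standard SQL under the name shown; the aliases reading_date, d16, and d17 are illustrative only.

    ```sql
    SELECT *
    FROM (
      -- Keep only the grouping column, the column to pivot on, and the value to aggregate
      SELECT
        SensorType,
        FORMAT_DATE('%Y-%m-%d', DATE(DTimestamp)) AS reading_date,
        data
      FROM `data-sensing-lab.io_sensor_data.moscone_io13`
    )
    PIVOT (
      AVG(data)
      FOR reading_date IN ('2013-05-16' AS d16, '2013-05-17' AS d17)
    );
    ```

    Every subquery column that is neither aggregated nor pivoted acts as a grouping column, so this should return one row per SensorType. Rows whose reading_date is not in the IN list do not contribute to either average.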
    