How can I monitor incurred BigQuery billing costs (jobs completed) by table/dataset in real time?

星月不相逢 2021-01-17 02:35

The biggest chunk of my BigQuery billing comes from query consumption. I am trying to optimize this by understanding which datasets/tables consume the most.

I am the

2 answers
  •  北荒
    北荒 (original poster)
    2021-01-17 03:02

    It might be easier to use the INFORMATION_SCHEMA.JOBS_BY_* views: you don't have to set up Stackdriver logging, and you can use them right away.

    Example adapted from "How to monitor query costs in Google BigQuery":

    -- Divisors for converting bytes to GB and TB (binary units).
    DECLARE gb_divisor INT64 DEFAULT 1024*1024*1024;
    DECLARE tb_divisor INT64 DEFAULT gb_divisor*1024;
    -- On-demand price per TB in US dollars (US regions only, see caveats).
    DECLARE cost_per_tb_in_dollar INT64 DEFAULT 5;
    DECLARE cost_factor FLOAT64 DEFAULT cost_per_tb_in_dollar / tb_divisor;

    SELECT
      ROUND(SUM(total_bytes_processed) / gb_divisor, 2) AS bytes_processed_in_gb,
      -- Cached results are free, so only bytes from non-cached queries are costed.
      ROUND(SUM(IF(cache_hit != true, total_bytes_processed, 0)) * cost_factor, 4) AS cost_in_dollar,
      user_email
    FROM (
      -- One SELECT per project; add a UNION ALL branch for each project you use.
      (SELECT * FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_USER)
      UNION ALL
      (SELECT * FROM `other-project.region-us`.INFORMATION_SCHEMA.JOBS_BY_USER)
    )
    WHERE
      DATE(creation_time) BETWEEN DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY) AND CURRENT_DATE()
    GROUP BY
      user_email
    

    Some caveats:

    • you need to UNION ALL all of the projects that you use explicitly
    • JOBS_BY_USER did not work for me on my private account (supposedly because my login email is @googlemail and BigQuery stores my email as @gmail)
    • the WHERE condition needs to be adjusted for your billing period (instead of the last 30 days)
    • doesn't provide the "bytes billed" information, so we need to determine those based on the cache usage
    • doesn't include the "if less than 10MB use 10MB" condition
    • data is only retained for the past 180 days
    • DECLARE cost_per_tb_in_dollar INT64 DEFAULT 5; reflects only US costs - other regions might have different costs - see https://cloud.google.com/bigquery/pricing#on_demand_pricing
    • you can only query one region at a time
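
The cost arithmetic behind the query, including the "if less than 10MB use 10MB" rule that the query leaves out, can be sketched in Python. This is only an illustration under stated assumptions: the function name `estimate_query_cost` and the constants are hypothetical, the $5 per TB price is the US figure used above, the minimum is applied once per job (a simplification), and jobs that processed zero bytes are assumed to be free.

```python
# Rough sketch of on-demand cost arithmetic; all names here are hypothetical.
TB_DIVISOR = 1024 ** 4               # bytes per TB, same binary unit as the query above
MIN_BILLED_BYTES = 10 * 1024 ** 2    # 10 MB minimum from the caveat list
COST_PER_TB_IN_DOLLAR = 5.0          # US price assumed in the answer; other regions differ


def estimate_query_cost(total_bytes_processed: int, cache_hit: bool) -> float:
    """Estimate the on-demand cost in dollars of one query job."""
    # Cached results are not billed; zero-byte jobs are assumed to be free.
    if cache_hit or total_bytes_processed == 0:
        return 0.0
    # Apply the 10 MB minimum (simplified: once per job).
    billed_bytes = max(total_bytes_processed, MIN_BILLED_BYTES)
    return billed_bytes * COST_PER_TB_IN_DOLLAR / TB_DIVISOR


# Example: sum the estimate over (bytes_processed, cache_hit) pairs,
# e.g. rows exported from INFORMATION_SCHEMA.JOBS_BY_USER.
jobs = [(1024 ** 4, False), (500, False), (2 * 1024 ** 4, True)]
total_cost = sum(estimate_query_cost(b, hit) for b, hit in jobs)
print(f"estimated cost: ${total_cost:.4f}")
```

Because small queries are rounded up to the minimum, this estimate will be slightly higher than the SQL result for workloads with many tiny queries.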
