问题
I'm trying to make sure I have a clear understanding of how my organisation gets billed for Google Cloud Platform Dataproc.
We have exported our billing history to BigQuery so that we can analyse it. This morning we had two dataproc clusters running and the screenshot below shows a subset of the billing history for those two clusters. I have filtered on labels.key = "goog-dataproc-cluster-uuid" or labels.key = "goog-dataproc-cluster-name" or labels.key = "goog-dataproc-location"
. Here is a subset of the results
I've drawn boxes around the costs for two kinds of sku. Lets's take a look at the Standard Intel N1 16 VCPU running in EMEA items.
I only have two clusters yet for each of those two clusters there are three lines. The reason is that there are three labels applied to each dataproc cluster, hence the costs 1.271852 & 3.815556 appear three times each.
My simple question then is...how do I get the total cost of my dataproc clusters? Do I add up all of these numbers (thus implying that the total cost is split equally over all of the labels) or do I take just one of the values (implying that the cost is repeated for each label)?
Here's another way of phrasing my question. Does this query give the total cost of running cluster data-dev-dataplatform-dataproc
for one day:
SELECT sum(cost)
FROM [dh-billing-179310:billing.gcp_billing_export_XXXXXXXX]
WHERE labels.key = "goog-dataproc-cluster-name"
and labels.value = "data-dev-dataplatform-dataproc"
and usage_start_time >= "2018-07-05 00:00:00"
and usage_end_time <= "2018-07-06 00:00:00"
or do I need to include other labels in order to get the total cost?
回答1:
In that flattened view of billing export data, the cost is repeated for each label; you should pick a single label value for any particular calculation. If you're trying to calculate the Dataproc total, it's probably most convenient to use one of the Dataproc-inserted "goog-dataproc-*" labels.
The idea here is that you can use different sets of labels to easily organize your total Dataproc-related costs attributed to any given subproject, so that you can then filter your billing queries along different dimensions.
来源:https://stackoverflow.com/questions/51213085/understanding-gcp-dataproc-billing-and-how-it-is-affected-by-labels