CloudWatch does not aggregate across dimensions for your custom metrics

心在旅途 2021-01-01 20:47

Reading the docs, I saw this statement:

CloudWatch does not aggregate across dimensions for your custom metrics

That seems like

1 answer
  • 2021-01-01 21:21

    The docs are correct: CloudWatch won't aggregate across dimensions for your custom metrics (it does so for some metrics published by other services, such as EC2).

    This feature may seem useful and clear for your use case, but it's not clear how such aggregation should behave in the general case. CloudWatch allows up to 10 dimensions per metric (at the time of writing), so aggregating over every combination of them could produce a lot of useless metrics, all of which you would be billed for. People may also use dimensions to split their metrics between, for example, Test and Prod stacks, which are completely separate; aggregating across those would not make sense.
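To put a rough number on "a lot of useless metrics", here is a small sketch that counts the non-empty subsets of dimension names for a single metric published with 10 dimensions. The names `Dim1` to `Dim10` and the helper `count_dimension_subsets` are made up for illustration. This only counts subsets of dimension *names*; each subset would fan out further into one metric per distinct combination of values.

```python
from itertools import combinations


def count_dimension_subsets(dimension_names):
    """Count every non-empty subset of the given dimension names."""
    n = len(dimension_names)
    return sum(
        1
        for size in range(1, n + 1)
        for _ in combinations(dimension_names, size)
    )


# Hypothetical dimension names, 10 of them to match the limit quoted above.
dims = [f"Dim{i}" for i in range(1, 11)]
print(count_dimension_subsets(dims))  # 1023, i.e. 2**10 - 1
```

One of those 1023 subsets is the full set you actually published, so automatic aggregation would add up to 1022 extra groupings for that one metric name.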

    CloudWatch treats a metric name plus its full set of dimensions (within a namespace) as a unique metric identifier. In your case, this means you need to publish each observation separately to every metric you want it to contribute to.

    Let's say you have a metric named Latency, and you're putting a hostname in a dimension called Server. If you have three servers this will create three metrics:

    • Latency, Server=server1
    • Latency, Server=server2
    • Latency, Server=server3

    So the approach you mentioned in your question will work. If you also want a metric showing the data across all servers, each server needs to publish its observations to a second, shared metric as well. The best way to do that is to use a new common value for the Server dimension, something like AllServers. This leaves you with 4 metrics:

    • Latency, Server=server1 <- only server1 data
    • Latency, Server=server2 <- only server2 data
    • Latency, Server=server3 <- only server3 data
    • Latency, Server=AllServers <- data from all 3 servers
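A minimal sketch of how a server could publish to both its own metric and the shared one, assuming Python with boto3. The helper names `build_latency_datums` and `publish_latency` are hypothetical, and the namespace `SomeNamespace` is borrowed from the graph source later in this answer. The payload builder is kept separate so it can be inspected without AWS credentials.

```python
def build_latency_datums(server, latency_ms):
    """Return two datums: one for the per-server metric, one for AllServers."""
    return [
        {
            "MetricName": "Latency",
            "Dimensions": [{"Name": "Server", "Value": value}],
            "Value": latency_ms,
            "Unit": "Milliseconds",
        }
        for value in (server, "AllServers")
    ]


def publish_latency(server, latency_ms, namespace="SomeNamespace"):
    """Send both datums in one PutMetricData call (needs boto3 and credentials)."""
    import boto3  # imported lazily so the builder above works without it

    cloudwatch = boto3.client("cloudwatch")
    cloudwatch.put_metric_data(
        Namespace=namespace,
        MetricData=build_latency_datums(server, latency_ms),
    )
```

Calling `publish_latency("server1", 120.0)` on server1 would add one datapoint to `Latency, Server=server1` and one to `Latency, Server=AllServers`.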

    Update 2019-12-17

    Using metric math SEARCH function: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/using-metric-math.html

    This gives you per-server latency and the latency across all servers without publishing a separate AllServers metric. If a new server shows up, the expression picks it up automatically.

    Graph source:

    {
        "metrics": [
            [ { "expression": "SEARCH('{SomeNamespace,Server} MetricName=\"Latency\"', 'Average', 60)", "id": "e1", "region": "eu-west-1" } ],
            [ { "expression": "AVG(e1)", "id": "e2", "region": "eu-west-1", "label": "All servers", "yAxis": "right" } ]
        ],
        "view": "timeSeries",
        "stacked": false,
        "region": "eu-west-1"
    }
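If you would rather generate that graph source than hand-edit it, something like the following sketch could work. The function `build_graph_source` and its parameters are hypothetical; the structure and the SEARCH expression simply mirror the JSON above.

```python
import json


def build_graph_source(namespace, dimension, metric_name,
                       stat="Average", period=60, region="eu-west-1"):
    """Build a graph source dict: a SEARCH over one dimension plus an overall AVG."""
    search = (
        f"SEARCH('{{{namespace},{dimension}}} "
        f"MetricName=\"{metric_name}\"', '{stat}', {period})"
    )
    return {
        "metrics": [
            [{"expression": search, "id": "e1", "region": region}],
            [{"expression": "AVG(e1)", "id": "e2", "region": region,
              "label": "All servers", "yAxis": "right"}],
        ],
        "view": "timeSeries",
        "stacked": False,
        "region": region,
    }


source = build_graph_source("SomeNamespace", "Server", "Latency")
print(json.dumps(source, indent=4))
```

The printed JSON can be pasted into the Source tab of a CloudWatch graph.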
    

    The result is a graph with one latency line per server and an "All servers" average plotted on the right axis.

    Downsides of this approach:

    • Expressions are limited to 100 metrics.
    • Overall aggregation is limited to available metric math functions, which means percentiles are not available as of 2019-12-17.

    Using Contributor Insights (open preview as of 2019-12-17): https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/ContributorInsights.html

    If you publish your logs to CloudWatch Logs in JSON or Common Log Format (CLF), you can create rules that keep track of top contributors. For example, a rule that keeps track of servers with latencies over 400 ms would look something like this:

    {
        "Schema": {
            "Name": "CloudWatchLogRule",
            "Version": 1
        },
        "AggregateOn": "Count",
        "Contribution": {
            "Filters": [
                {
                    "Match": "$.Latency",
                    "GreaterThan": 400
                }
            ],
            "Keys": [
                "$.Server"
            ],
            "ValueOf": "$.Latency"
        },
        "LogFormat": "JSON",
        "LogGroupNames": [
            "/aws/lambda/emf-test"
        ]
    }
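To make the rule's intent concrete, here is a local approximation in Python of what it computes: count the log events whose `Latency` exceeds 400, grouped by `Server`. This is only an illustration of the filter-and-count logic, not how Contributor Insights is implemented; the function `top_contributors` and the sample log lines are made up.

```python
import json
from collections import Counter


def top_contributors(log_lines, threshold_ms=400, limit=10):
    """Count events over the threshold per server and return the top entries."""
    counts = Counter()
    for line in log_lines:
        event = json.loads(line)
        server = event.get("Server")
        latency = event.get("Latency")
        # Skip events missing the key or with a non-numeric latency.
        if server is None or not isinstance(latency, (int, float)):
            continue
        if latency > threshold_ms:
            counts[server] += 1
    return counts.most_common(limit)


# Hypothetical JSON log events, one per line.
sample = [
    '{"Server": "server1", "Latency": 420}',
    '{"Server": "server2", "Latency": 150}',
    '{"Server": "server1", "Latency": 510}',
    '{"Server": "server3", "Latency": 405}',
]
print(top_contributors(sample))  # [('server1', 2), ('server3', 1)]
```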
    

    The result is a list of the servers with the most datapoints over 400 ms.

    Bringing it all together with CloudWatch Embedded Metric Format: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch_Embedded_Metric_Format.html

    If you publish your data in CloudWatch Embedded Metric Format you can:

    • Easily configure dimensions, so you can have per-server metrics and an overall metric if you want.
    • Use CloudWatch Logs Insights to query and visualise your logs.
    • Use Contributor Insights to get top contributors.
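As a rough sketch of what such a log record could look like, the Python below builds one embedded-metric-format event with two dimension sets: `["Server"]` for the per-server metric and an empty set for an overall metric. The function name `emf_latency_record` is hypothetical, the namespace is borrowed from the graph source above, and the use of an empty dimension set reflects my reading of the format specification, so check it against the linked documentation before relying on it.

```python
import json
import time


def emf_latency_record(server, latency_ms, namespace="SomeNamespace",
                       timestamp_ms=None):
    """Build one embedded-metric-format log event for a latency observation."""
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)  # milliseconds since epoch
    return {
        "_aws": {
            "Timestamp": timestamp_ms,
            "CloudWatchMetrics": [
                {
                    "Namespace": namespace,
                    # One metric per server, plus one with no dimensions at all.
                    "Dimensions": [["Server"], []],
                    "Metrics": [{"Name": "Latency", "Unit": "Milliseconds"}],
                }
            ],
        },
        # Root-level members referenced by the directive above.
        "Server": server,
        "Latency": latency_ms,
    }


# Emit the record as a single JSON log line.
print(json.dumps(emf_latency_record("server1", 123)))
```

In an environment whose stdout is shipped to CloudWatch Logs (a Lambda function, for example), printing the line is typically enough; the same event also carries the `$.Server` and `$.Latency` fields used by the Contributor Insights rule.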