Question
I am trying to create a custom dashboard with some specific data for an AKS cluster. I would like to assemble a dashboard with graphs of RAM and CPU usage per selected controller and node, and, if possible, the number of restarts per pod. How can I create custom graphs of the controllers' average resource usage?
Answer 1:
In the Azure Portal, open your AKS cluster blade and click the "Logs" link on the left. Make sure Insights is enabled first by clicking "Insights": if it is, you'll see charts close to what you want; otherwise, you'll see onboarding instructions.
Use the following query to chart CPU utilization (95th percentile) for all containers in a given controller:
let endDateTime = now();
let startDateTime = ago(14d);
let trendBinSize = 1d;
let capacityCounterName = 'cpuLimitNanoCores';
let usageCounterName = 'cpuUsageNanoCores';
let clusterName = 'coin-test-i';
let controllerName = 'kube-svc-redirect';
// Containers that belong to the selected controller
KubePodInventory
| where TimeGenerated < endDateTime
| where TimeGenerated >= startDateTime
| where ClusterName == clusterName
| where ControllerName == controllerName
| extend InstanceName = strcat(ClusterId, '/', ContainerName),
         ContainerName = strcat(controllerName, '/', tostring(split(ContainerName, '/')[1]))
| distinct Computer, InstanceName, ContainerName
// CPU limit per container, one value per time bin
| join hint.strategy=shuffle (
    Perf
    | where TimeGenerated < endDateTime
    | where TimeGenerated >= startDateTime
    | where ObjectName == 'K8SContainer'
    | where CounterName == capacityCounterName
    | summarize LimitValue = max(CounterValue) by Computer, InstanceName, bin(TimeGenerated, trendBinSize)
    | project Computer, InstanceName, LimitStartTime = TimeGenerated, LimitEndTime = TimeGenerated + trendBinSize, LimitValue
) on Computer, InstanceName
// Raw CPU usage samples for the same containers
| join kind=inner hint.strategy=shuffle (
    Perf
    | where TimeGenerated < endDateTime + trendBinSize
    | where TimeGenerated >= startDateTime - trendBinSize
    | where ObjectName == 'K8SContainer'
    | where CounterName == usageCounterName
    | project Computer, InstanceName, UsageValue = CounterValue, TimeGenerated
) on Computer, InstanceName
// Keep only usage samples that fall inside the matching limit bin
| where TimeGenerated >= LimitStartTime and TimeGenerated < LimitEndTime
| project Computer, ContainerName, TimeGenerated, UsagePercent = UsageValue * 100.0 / LimitValue
| summarize P95 = percentile(UsagePercent, 95) by bin(TimeGenerated, trendBinSize), ContainerName
| render timechart
Replace the cluster name and controller name with the ones you want. You can also adjust the start/end time parameters and the bin size, or use max/min/avg in place of the 95th percentile.
For memory metrics, replace the counter names with:
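To illustrate swapping the aggregate, here is a minimal self-contained sketch of the final summarize step. It does not read from your workspace: the datatable rows, the container names ('my-controller/app', 'my-controller/sidecar'), and the 1d bin are made-up placeholders standing in for the UsagePercent values the query above produces.

```kusto
// Hypothetical sample rows; in the real query these come from the joins above
datatable(TimeGenerated: datetime, ContainerName: string, UsagePercent: real) [
    datetime(2019-02-01 01:00), 'my-controller/app',     12.5,
    datetime(2019-02-01 09:00), 'my-controller/app',     40.0,
    datetime(2019-02-01 17:00), 'my-controller/app',     22.0,
    datetime(2019-02-02 03:00), 'my-controller/app',     18.0,
    datetime(2019-02-02 11:00), 'my-controller/app',     55.5,
    datetime(2019-02-01 02:00), 'my-controller/sidecar',  3.0,
    datetime(2019-02-02 04:00), 'my-controller/sidecar',  7.5
]
// Several aggregates side by side; keep only the ones you want to chart
| summarize Avg = avg(UsagePercent),
            Max = max(UsagePercent),
            P95 = percentile(UsagePercent, 95)
    by bin(TimeGenerated, 1d), ContainerName
| render timechart
```

In the full query, the same change amounts to editing only the last summarize line, for example using avg(UsagePercent) instead of percentile(UsagePercent, 95).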
let capacityCounterName = 'memoryLimitBytes';
let usageCounterName = 'memoryRssBytes';
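The question also asks about restarts per pod, which the queries above do not cover. The following is only a sketch, under the assumption that your workspace's KubePodInventory table has the Name (pod name) and PodRestartCount columns from the Container insights schema; check the table schema in the Logs blade before relying on it. The 1d lookback is an arbitrary choice.

```kusto
let clusterName = 'coin-test-i';
let controllerName = 'kube-svc-redirect';
KubePodInventory
| where TimeGenerated >= ago(1d)            // arbitrary lookback window
| where ClusterName == clusterName
| where ControllerName == controllerName
// Assumption: PodRestartCount is a running counter, so take its latest (max) value per pod
| summarize Restarts = max(PodRestartCount) by Name
| order by Restarts desc
```

As with the CPU query, replace the cluster and controller names with your own.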
Source: https://stackoverflow.com/questions/54569778/azure-aks-monitoring-custom-dashboard-resources