Part of running a production-quality Kubernetes cluster is being able to monitor it. One of the nice things about having your Kubernetes cluster in Azure is that many things are easy to setup and work with right out of the box.
Metrics are one part of an effective monitoring strategy (read here for info about Logging to Azure from an AKS Cluster), but a very critical one. Metrics allow you to have insight into how your software and systems are running, from alerting to dashboards that look like this:
That dashboard looks great, but it’s just the visual representation of a few KQL queries against the logged custom/app metrics data!
This blog post will show you how to go from an AKS cluster to custom metrics.
How AKS does custom metrics
AKS utilizes Azure Monitor for containers. The part of Azure Monitor for containers that handles custom metrics is designed to consume Prometheus metrics from containers. The custom metrics themselves will be available in the Log Analytics workspace that you created (or is automatically created) for you when you create an AKS cluster with monitoring enabled.
The layer that exposes Prometheus metrics to Azure Monitor for containers is a Prometheus exporter. I wrote extensively about how to create a quick and easy Prometheus exporter. Once you have your pod exporting Prometheus metrics, you can now wire up the custom metrics.
Enable custom metrics
To start collecting metrics, there are a few steps that you need to take. First, you either need to create an AKS cluster with monitoring:
1 $ az aks create --resource-group rg --name aks --enable-addons monitoring
Or update your existing AKS cluster:
1 $ az aks enable-addons --resource-group rg --name aks --addons monitoring
Now you need to apply configuration for Azure Monitor for containers to start collecting metrics. The first step is to retrieve the configuration template, which is nothing more than a Kubernetes
1 $ curl -o monitoring-config.yaml https://aka.ms/container-azm-ms-agentconfig
If you look at this manifest, it is a little long but it is commented really well, so hopefully it can explain itself and how each option applies:
The part that we want to focus on is setting
true. This will tell Azure Monitor for containers to start scraping metrics. Once the configuration is modified locally, you just need to create the
ConfigMap in your AKS cluster:
1 $ kubectl apply -f ./monitoring-config.yaml
And that’s it! Now Azure Monitor for containers is scraping pods for Prometheus metrics. But… how does it know which pods to scrape? The answer is in the comments of the configuration (pasted below for reference). You configure your pods with annotations that include the following:
- prometheus.io/scrape: Enable scraping for this pod
- prometheus.io/scheme: If the metrics endpoint is secured then you will need to set this to
https& most likely set the tls config.
- prometheus.io/path: If the metrics path is not /metrics, define it with this annotation.
- prometheus.io/port: If port is not 9102 use this annotation
Those four annotations tell Azure Monitor for containers what and how to scrape metrics. So I’m going to modify my deployment manifest to reflect this:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 kind: Deployment apiVersion: apps/v1 metadata: name: webapp spec: replicas: 8 selector: matchLabels: app: webapp template: metadata: labels: app: webapp annotations: prometheus.io/scrape: "true" prometheus.io/port: "9877" spec: containers: - name: webapp image: mycontainerregistry/webapp:latest imagePullPolicy: Always ports: - containerPort: 5000 name: http - name: exporter image: mycontainerregistry/webappexporter:latest imagePullPolicy: Always env: - name: POLLING_INTERVAL_SECONDS value: "5" - name: APP_PORT value: "5000" - name: EXPORTER_PORT value: "9877" ports: - containerPort: 9877 name: http
In my pod template, I added
prometheus.io/scrape: "true" and set the non-default port to scrape from with
Querying custom metrics
Now that we are collecting custom metrics in our Log Analytics workspace, it’s time to work with those metrics: Querying, dashboards, alerts, etc.
To query your metrics, you can use the Azure Data Explorer Web UI, and add your Log Analytics workspace with the following URL format:
Start writing your queries!! Here are a few from the dashboard above that I created:
All current and pending requests for the past hour
1 2 3 4 5 6 InsightsMetrics | where TimeGenerated > ago(1h) | where Namespace == "prometheus" | where Name in ("app_requests_current", "app_requests_pending") | summarize Total = sum(Val) by TimeGenerated, Name | sort by TimeGenerated asc, Name asc
Count of app health signals for the past hour
1 2 3 4 5 6 InsightsMetrics | where TimeGenerated between (_startTime .. _endTime) | where Namespace == "prometheus" | where Name == "app_health" | where Val == 0 | summarize Count = count()
1 2 3 4 5 6 InsightsMetrics | where Namespace == "prometheus" | where Name == "app_uptime" | summarize MinUptime = min(Val) by TimeGenerated | top 1 by TimeGenerated desc | project MinUptime
1 2 3 4 5 6 InsightsMetrics | where TimeGenerated > ago(1h) | where Namespace == "prometheus" | where Name == "process_virtual_memory_bytes" | summarize MaxMem = max(Val) by TimeGenerated | sort by TimeGenerated asc
Note: These above queries work for my custom metrics that I defined, so you would have to create your queries for your particular custom metrics.
Metrics are a key component of production software, and AKS makes it an easy task to collect, analyze, and visualize your metrics in a meaningful way.