02.08.2024 - Mikel Jason Münnekhoff, Sherief Ahmed - 8 min read Part 3: Enabling annotation-based scraping Transitioning from Prometheus to OpenTelemetry - A Journey of a Cluster's Metrics Evolution

In the previous posts of this series, we introduced OpenTelemetry in general and configured a first version of an OpenTelemetry collector to scrape metrics via auto-discovery based on ServiceMonitor and PodMonitor custom resource definitions (CRDs). If you haven’t already, we recommend checking out these blog posts first:

While ServiceMonitor and PodMonitor CRDs should cover the vast majority of use cases and are the recommended way to expose metrics for your own applications, there are other approaches to scraping metrics. In this blog post, we take a look at the widely used approach of annotation-based scraping.

What is annotation-based scraping?

Instead of using CRDs, you can use pod annotations to mark a pod as a scrape target offering metrics. Annotation-based scraping as a first-class feature was requested for the Prometheus operator in 2018. One of the Prometheus team members immediately objected because of severe limitations compared to the already implemented auto-discovery feature. The enhancement proposal remained open until 2021, with engineers discussing the pros and cons of this approach. While the feature never made it into Prometheus itself, and OpenTelemetry’s target allocator does not support it out of the box either, there are common ways to implement it via scrape configuration. For example, the Prometheus community has implemented annotation-based scraping as part of their Prometheus Helm chart (see its README and the actual implementation).

While prometheus.io annotations are not a core part of Prometheus, they have become the de facto standard to configure scraping.

The discussion of the initial enhancement proposal already contained what is now used in many open-source projects as a de facto standard. The idea is to annotate pods with the information necessary to identify a scrape target, without requiring additional resources. These annotations are:

  • prometheus.io/scrape: "true": This feature toggle is set to enable scraping.
  • prometheus.io/port: 8080: This annotation sets the port from which the metrics can be scraped.
  • prometheus.io/path: /metrics: This value defines the HTTP path of the metrics endpoint. This annotation is optional; its default value is /metrics.
  • prometheus.io/scheme: https: This toggle can be used to advertise the metrics endpoint as TLS-encrypted.
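
For illustration, a pod annotated this way could look like the following minimal sketch (name, image, and port are placeholders):

apiVersion: v1
kind: Pod
metadata:
  name: example-app # placeholder name
  annotations:
    prometheus.io/scrape: "true"   # opt in to annotation-based scraping
    prometheus.io/port: "8080"     # metrics are served on this port
    prometheus.io/path: "/metrics" # optional, /metrics is the default
spec:
  containers:
    - name: app
      image: registry.example.com/example-app:1.0.0 # placeholder image
      ports:
        - containerPort: 8080
          name: metrics

In practice, you would usually set these annotations on the pod template of a Deployment or via a Helm chart’s values rather than on a standalone pod.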

How to add annotation-based scraping to the OTelCol prometheus receiver?

While the OTelCol Prometheus receiver states that it lacks some features needed for full parity with a real Prometheus server, it supports all scrape configuration options. This allows us to implement annotation-based scraping just as the community has done in their Prometheus chart. Let’s dive into this entry point, scrape_config. If you want to follow along with the official documentation, we will start here.

Let’s set up the scraping configuration. To do so, we use pod metadata and transform it into a valid metrics endpoint description. As explained in the Prometheus documentation, we focus on three labels with special meaning:

  • __address__ must contain the host and port of the scrape target. It is populated with the discovered address of the pod and the port given by the annotation.
  • __metrics_path__ must contain the URL path of the metrics endpoint. Its value comes directly from the corresponding annotation.
  • __scheme__ defines whether the metrics endpoint is expected to be TLS-encrypted. Its value comes directly from the corresponding annotation.

To get access to the pod’s annotations, we use the kubernetes_sd_configs configuration in combination with role: pod. This allows us to read a lot of pod metadata (see docs), including labels and annotations. For these, unsupported characters like dots and slashes are translated to underscores. With the defined prefix, the pod annotation prometheus.io/scrape can be read as __meta_kubernetes_pod_annotation_prometheus_io_scrape. At the end of all relabelings, all labels prefixed with __ will be disregarded, so none of the metadata values will be attached to the metrics unless explicitly configured.

receivers:
  prometheus:
    config:
      scrape_configs:
        [...]
        - job_name: 'annotation-discovery'
          kubernetes_sd_configs:
          - role: pod
          relabel_configs:
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
            action: keep
            regex: "true" # keep only with fixed value "true"
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+) # extract the endpoint's path
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scheme]
            action: replace
            regex: (https?) # use https if annotated
            target_label: __scheme__
          - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
            action: replace
            regex: ([^:]+)(?::\d+)?;(\d+)
            replacement: $$1:$$2 # adjust the address to contain both in-cluster domain and port
            target_label: __address__
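
To make the relabeling concrete, consider a hypothetical pod discovered at IP 10.0.0.5 with a declared container port 3000 and annotated with prometheus.io/scrape: "true" and prometheus.io/port: "8080" (all values are made up for illustration):

# discovered metadata (excerpt, hypothetical values)
__address__                                           = "10.0.0.5:3000"
__meta_kubernetes_pod_annotation_prometheus_io_scrape = "true"   # keep rule matches
__meta_kubernetes_pod_annotation_prometheus_io_port   = "8080"

# after applying the relabel_configs above
__address__      = "10.0.0.5:8080" # ([^:]+)(?::\d+)?;(\d+) swaps the declared port for the annotated one
__metrics_path__ = "/metrics"      # scrape default kept, as the missing path annotation does not match (.+)
__scheme__       = "http"          # scrape default kept, as the missing scheme annotation does not match (https?)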

To know where the collected metrics come from, we want to add some metadata to them. We can easily do this by using the same approach and translating the pod metadata to (this time non-special) metrics labels. Here, we enrich our metrics with the pod name and namespace as metadata.

[...]
  relabel_configs:
    [...]
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: namespace
  - source_labels: [__meta_kubernetes_pod_name]
    action: replace
    target_label: pod

As explained in part 2, we use the Prometheus exporter to expose all metrics collected by an OpenTelemetryCollector so that we can validate our implementation. To have something to scrape, we use a small trick: annotations of the OpenTelemetryCollector resource are propagated to various subsequent resources (as of opentelemetry-operator v0.90.0). Hence, we can annotate the custom resource to set our annotations on the created pods and make the collector discover itself. For demo purposes, we add one other tweak: we take what we have learnt about annotation-based scraping and isolate the demo setup by using prometheus.io/scrape: "demo" instead of prometheus.io/scrape: "true". This way, we don’t interfere with other resources that might be present in a demo cluster in either direction: we don’t scrape anything other than what we set up, and nothing already present scrapes this demo.

apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: o11y-metrics
  namespace: o11y-metrics
  annotations: # these are propagated to the collector pods
    prometheus.io/scrape: "demo" # this is for demo only, usually "true"
    prometheus.io/port: "8888" # this is the collector's own default metrics port, not the exporter's port
    # prometheus.io/path is not needed, /metrics is assumed if not present
spec:
  mode: statefulset # technically not required here, but needed when combining this with the CRD-based scraping from part 2
  config: |
    receivers:
      prometheus:
        config:
          scrape_configs:
            - job_name: annotation-discovery
              kubernetes_sd_configs:
              - role: pod
              relabel_configs:
              - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
                action: keep
                regex: "demo" # this is for demo only, usually "true"
              - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
                action: replace
                target_label: __metrics_path__
                regex: (.+)
              - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scheme]
                action: replace
                regex: (https?)
                target_label: __scheme__
              - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
                action: replace
                regex: ([^:]+)(?::\d+)?;(\d+)
                replacement: $$1:$$2
                target_label: __address__
              - source_labels: [__meta_kubernetes_namespace]
                action: replace
                target_label: namespace
              - source_labels: [__meta_kubernetes_pod_name]
                action: replace
                target_label: pod

    exporters:
      prometheus:
        endpoint: 0.0.0.0:8080

    service:
      pipelines:
        metrics:
          receivers: [prometheus]
          exporters: [prometheus]    

For the collector to identify matching pods, it needs permission to discover them. Therefore, we reuse the RBAC permission resources from the previous blog post. Naturally, the OpenTelemetry Operator is also required.
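
As a reference, the following is a minimal sketch of the pod-discovery permissions, assuming the collector pods run with the service account o11y-metrics-collector created by the operator (the names are assumptions; see the previous post for the exact manifests we reuse):

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: o11y-metrics-pod-discovery # assumed name
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "watch"] # required by kubernetes_sd_configs with role: pod
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: o11y-metrics-pod-discovery # assumed name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: o11y-metrics-pod-discovery
subjects:
  - kind: ServiceAccount
    name: o11y-metrics-collector # service account of the collector pods (assumed name)
    namespace: o11y-metrics

With these in place, we can verify our results. First, we port-forward the collector’s exporter port: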

$ kubectl -n o11y-metrics port-forward o11y-metrics-collector-0 8080:8080

Then, we can see what the collector identified and received:

$ curl http://localhost:8080/metrics 2>/dev/null | grep -E '^up' -B4 -A4
# TYPE target_info gauge
target_info{http_scheme="http",instance="100.124.168.27:8888",job="annotation-discovery",k8s_container_name="otc-container",k8s_namespace_name="o11y-metrics",k8s_node_name="ip-100-124-104-131.eu-central-1.compute.internal",k8s_pod_name="o11y-metrics-collector-0",k8s_pod_uid="b6b50e04-2d07-43de-9a97-a493716666ac",k8s_statefulset_name="o11y-metrics-collector",net_host_name="100.124.168.27",net_host_port="8888"} 1
# HELP up The scraping was successful
# TYPE up gauge
up{instance="100.124.168.27:8888",job="annotation-discovery",namespace="o11y-metrics",pod="o11y-metrics-collector-0"} 1

We search for the up metric. This metric is defined by the Prometheus project and is generated for each identified target, indicating whether it is healthy/reachable/scrapable. We see that our pod o11y-metrics-collector-0 has been successfully discovered by our new job annotation-discovery. You can read about up and similar metrics and labels in the Prometheus docs. Additionally, our grep shows one other metric that was collected from the metrics endpoint, as indicated by the instance label. Note that it comes from port 8888, which we annotated the pod with, while we port-forwarded the exporter on port 8080.

With this verification of scraped metrics, we can conclude that annotation-based metrics discovery and scraping has been successfully implemented. We have shown how to make annotating pods a valid way to integrate workloads into a cluster’s observability stack. Beyond pods, this approach can be extended to other Kubernetes resources: Prometheus’ kubernetes_sd_config provides options for node, service, pod, endpoint, endpointslice, and ingress resources. You can use the same approach to transform these resources into scrape targets, for example to make annotating a Service a supported mechanism for defining scrape targets in your metrics stack, as sketched below.
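
A minimal sketch of such an additional scrape job, this time discovering annotated Services (assuming the same prometheus.io/* annotations, now read from the Service objects; the job name is made up):

- job_name: 'service-annotation-discovery' # hypothetical job name
  kubernetes_sd_configs:
  - role: service
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
    action: keep
    regex: "true"
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
    action: replace
    target_label: __metrics_path__
    regex: (.+)
  - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
    action: replace
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $$1:$$2 # $$ escapes $ in the OTelCol configuration
    target_label: __address__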

By supporting annotation-based scraping, your observability stack is set up to support both of these well-known and widely used approaches to monitoring applications. When leveraging open-source software and Kubernetes manifests, this allows for seamless integration and reduces the need for adjustments such as creating additional custom resources. For example, if you search for such annotations in Bitnami’s chart repository on GitHub, you get more than 80 hits, including charts for cilium, node exporter, etcd, metallb, consul, vault, sonarqube, fluentd and many more. Another advantage of using annotations is that they don’t block rollouts when you don’t have an observability stack in place, e.g. when developing in a local cluster. While ServiceMonitors and PodMonitors require the respective CRDs to be installed for a Helm chart not to fail, annotations can simply exist in any cluster. Yet, using a ServiceMonitor offers other advantages, such as adding additional relabeling rules, or simply scraping two endpoints from the same pod (you cannot have two values for one annotation, but imagine running a sidecar). Using the knowledge you have gained, you will have to decide on a case-by-case basis which approach makes more sense in your specific situation.

Now that we have two ways to identify scrape targets, we will take a look at optimizing traffic in the next part of this series. See you next time!

Credits

Title image by nuchao on iStock

Mikel Jason Münnekhoff

Senior Technical Consultant

Sherief Ahmed

Lead Consultant