
Track your traces with Tempo

·10 mins
observability tempo alloy promtail grafana container kubernetes cncf helm prometheus logs traefik
Romain Boulanger
Infra/Cloud Architect with DevSecOps mindset

Traces, a story of observability
#

The word observability is something you’re beginning to know by heart now, having read so many of my blog posts about it.

In the previous post, I talked about logs with Loki and Alloy, so the next logical step is to continue down this path and talk about traces!

Yes, but what is a trace?

Well, this is what allows us to visualise and understand the execution flow of a request, particularly in microservices architectures where a single user action can trigger dozens of calls between different services.

Each trace is made up of several segments called spans. These segments represent the different operations between components, providing a complete picture of the relationships between them and an insight into the different dependencies.

Trace and spans

Differences between traces and spans
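
To make this more concrete, here is a simplified, purely illustrative way of picturing a trace and its spans. This is not the exact OTLP wire format, just a mental model; the identifiers and service names are made up:

# One trace = one request, made up of several spans linked by parent/child relationships.
trace_id: 4bf92f3577b34da6a3ce929d0e0e4736
spans:
  - span_id: 00f067aa0ba902b7
    parent_span_id: null              # root span: the incoming HTTP request
    name: "GET /checkout"
    service: frontend
    duration_ms: 182
  - span_id: a1b2c3d4e5f60718
    parent_span_id: 00f067aa0ba902b7  # child span: a downstream call
    name: "POST /payments"
    service: payment-api
    duration_ms: 95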

The growing discussion around traces today is largely due to the adoption of standards like OpenTracing and, later, OpenTelemetry, which have popularised application instrumentation for trace collection and made the practice far more accessible.

Leaving traces in your Kubernetes cluster can help you!
#

Kubernetes has become the environment most often mentioned when talking about Cloud Native applications, particularly when they are split into microservices. Quite simply because this orchestrator’s strengths lie in its ability to scale each component rapidly, thereby improving resilience and availability.

Here are a few points that, in my opinion, underline the essential role of having traces in a containerised world:

  • Identify bottlenecks: Traces help pinpoint the services or calls that consume the most time in a complete transaction;

  • Understand and identify dependencies: Traces reveal the relationships between the different services and components, helping you understand the overall architecture of an application;

  • Performance optimisation: When people talk about performance in relation to traces, they mainly mean latency. Traces show which component takes the longest to respond to a request overall, so that you can take action to optimise it.

Moreover, within the Kubernetes ecosystem, additional components like Ingress Controllers, Gateway API, Service Mesh and Network Policy also play an important role. Traces are crucial for understanding the path a call takes through an application, but also through the orchestrator’s various mechanisms!

Tempo
#

Grafana Tempo is an open source tool featured in the Cloud Native Computing Foundation (CNCF) observability landscape, integrating perfectly with the Prometheus and Grafana world. It provides a single storage backend dedicated to traces.

CNCF with Tempo

Observability landscape within the CNCF in March 2025.

Tempo supports connections with object storage such as S3, Google Cloud Storage, Azure Blob Storage and even Minio for storing raw traces.

What’s more, it offers a wide range of formats: Jaeger, Zipkin and OpenTelemetry, under different protocols, making it easy to adopt in existing environments.

As mentioned above, it works very well with Grafana, making it easy to match metrics and logs to traces using trace identifiers (traceId).

The duration for which traces are retained can be customized through flexible retention policies, allowing organizations to adapt them to their specific needs.
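
As an illustration, with the tempo-distributed Helm chart used later in this post, retention is driven by the Compactor. A minimal sketch, assuming the chart exposes the compactor configuration under compactor.config (double-check the key names against the chart’s values.yaml):

compactor:
  config:
    compaction:
      # Keep trace blocks for 14 days before deleting them (the duration is an example).
      block_retention: 336h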

Finally, thanks to its distributed architecture, Tempo scales with your needs: it can absorb massive volumes of trace data while maintaining ingestion and query performance.

Architecture
#

Tempo’s architecture

Tempo architecture from official documentation

Tempo’s architecture is made up of several components:

  • The Distributor receives traces in different formats and routes them to the ingesters;
  • The Ingester batches traces into blocks, creates the associated filters and indexes, and then writes them to storage;
  • The Query Frontend exposes the API for retrieving traces; it is this component that Grafana queries;
  • The Querier sits behind the Query Frontend and fetches traces from storage or from the Ingester;
  • The Compactor reduces the storage footprint by compacting blocks in the backend;
  • The Metrics generator is an optional component that derives metrics from traces to populate Prometheus.
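
Since the Metrics generator is optional, it has to be enabled explicitly. As a rough sketch with the tempo-distributed Helm chart used later in this post, it could look like the following; the key names and the Prometheus remote-write URL are assumptions to verify against the chart’s values.yaml and your own Prometheus deployment:

metricsGenerator:
  enabled: true
  config:
    storage:
      remote_write:
        # Hypothetical Prometheus remote-write endpoint, adjust to your setup.
        - url: http://prometheus-server.observability.svc.cluster.local/api/v1/write
global_overrides:
  # Processors that turn spans into metrics (service graphs and RED-style span metrics).
  metrics_generator_processors:
    - service-graphs
    - span-metrics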

Installation modes
#

Tempo comes with two installation modes:

  • Monolithic (Single Binary): Quick to set up for testing purposes, or for small volumes of data;

  • Distributed: For large-scale deployments, Tempo is broken down into microservices (ingester, compactor, distributor, etc.), enabling much more granular scaling than is possible in the first method.

For enthusiasts of this type of mechanism, an operator also exists: the Tempo Operator, which is partly based on the microservices mode described above.

After theory comes practice!
#

To follow the steps, you can retrieve the configurations via this code repository:

axinorm/traces-with-tempo

Set up and configure Tempo with Alloy and Grafana to look for traces in your Kubernetes cluster.


Feel free to adjust the values as needed to better suit your configuration.

These different values files will be used to configure Tempo and the various associated tools.

Let’s get started!

Alloy
#

Alloy goes much further than a simple log collector. It can also be used to retrieve traces via a dedicated configuration.

To do this, it is essential to inject the right configuration. For this use case, the http and grpc endpoints in OpenTelemetry format are required.

This configuration is entirely flexible and depends on the formats your applications or tools are able to emit for their traces.

alloy:
  configMap:
    # -- Create a new ConfigMap for the config file.
    create: true
    # -- Content to assign to the new ConfigMap.  This is passed into `tpl` allowing for templating from values.
    content: |-
      logging {
        level = "info"
      }

      otelcol.receiver.otlp "default" {
        http {}
        grpc {}

        output {
          traces = [otelcol.processor.batch.default.input]
        }
      }

      otelcol.processor.batch "default" {
        output {
          traces = [otelcol.exporter.otlp.tempo.input]
        }
      }

      otelcol.exporter.otlp "tempo" {
        client {
          endpoint = "tempo-distributor.observability.svc.cluster.local:4317"
          tls {
            insecure = true
          }
        }
      }

The traces are then batched before being forwarded and exported to the Tempo service, which will be deployed immediately afterwards.

By default, the Alloy collector exposes a service that needs to be overridden so that the ports associated with the OpenTelemetry protocol (OTLP) are available over both HTTP and gRPC.

  # -- Extra ports to expose on the Alloy container.
  extraPorts:
  - name: otlp-http
    port: 4318
    targetPort: 4318
    protocol: TCP
  - name: otlp-grpc
    port: 4317
    targetPort: 4317
    protocol: TCP
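
With these ports exposed, any OpenTelemetry-instrumented application in the cluster can point its exporter at Alloy. For example, using the standard OTel SDK environment variables in a container spec (the service name is just an illustration):

env:
  - name: OTEL_EXPORTER_OTLP_ENDPOINT
    value: http://alloy.observability.svc.cluster.local:4318
  - name: OTEL_EXPORTER_OTLP_PROTOCOL
    value: http/protobuf
  - name: OTEL_SERVICE_NAME
    value: my-app  # hypothetical application name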

The configuration seems correct, so the next step is to deploy it.

Before you start, don’t forget to configure the source to retrieve the Helm charts from Grafana:

helm repo add grafana https://grafana.github.io/helm-charts
helm repo update

Then proceed with the installation:

helm install alloy grafana/alloy --version 0.12.5 --namespace observability --create-namespace --values ./values-alloy.yaml

Alloy is now ready to take in your traces!

Tempo
#

To install Tempo with Helm, you can choose between two configurations:

  • A tempo chart in monolithic mode, a single binary to deploy;

  • Another chart decomposed into microservices: tempo-distributed, ideal for managing the load granularly on each component.

To test Tempo’s operation in conditions close to reality, I’ve chosen the tempo-distributed chart. Feel free to choose the setup that suits you best.

In this configuration mode, object storage is strongly recommended. Personally, I use Minio, which is directly included in the form of a Helm sub-chart, with a bucket to store the traces:

# Minio
minio:
  enabled: true
  mode: standalone
  rootUser: grafana-tempo
  rootPassword: supersecret
  buckets:
    # Default Tempo storage bucket.
    - name: tempo-traces
      policy: none
      purge: false

Next, the OTLP protocols must be enabled for Tempo to be able to receive traces from Alloy:

traces:
  otlp:
    http:
      # -- Enable Tempo to ingest Open Telemetry HTTP traces
      enabled: true
    grpc:
      # -- Enable Tempo to ingest Open Telemetry GRPC traces
      enabled: true

Lastly, we complete the setup by configuring storage in S3 mode using the parameters of the deployed Minio.

# To configure a different storage backend instead of local storage:
storage:
  trace:
    # -- The supported storage backends are gcs, s3 and azure, as specified in https://grafana.com/docs/tempo/latest/configuration/#storage
    backend: s3
    wal:
      path: /tmp/tempo/wal
    s3:
      bucket: tempo-traces
      endpoint: tempo-minio:9000
      access_key: grafana-tempo
      secret_key: supersecret
      insecure: true
      tls_insecure_skip_verify: true

To deploy everything, use the helm install command that you know (almost) by heart:

helm install tempo grafana/tempo-distributed --version 1.32.7 --namespace observability --create-namespace --values ./values-tempo.yaml

Grafana
#

Grafana, by far the best visualisation tool, benefits from a streamlined configuration, requiring only a datasource to be initialised in order to manipulate Tempo data:

datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
    # Tempo DataSource
    - name: Tempo
      uid: tempo
      type: tempo
      url: http://tempo-query-frontend:3100/
      access: proxy
      orgId: 1
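
If you followed the previous post and also run Loki, you can optionally link traces to logs straight from this datasource. A minimal sketch, assuming a Loki datasource with the uid loki already exists and that the chart passes jsonData through unchanged:

    - name: Tempo
      uid: tempo
      type: tempo
      url: http://tempo-query-frontend:3100/
      access: proxy
      orgId: 1
      jsonData:
        tracesToLogsV2:
          datasourceUid: loki   # uid of the Loki datasource (assumption)
          filterByTraceID: true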

Here too, the same rules apply for deployment:

helm install grafana grafana/grafana --version 8.10.4 --namespace observability --create-namespace --values ./values-grafana.yaml

Traefik for testing
#

All that’s left is to plug in an application whose traces you want to view. I use Traefik, a tool that can generate OTLP-format traces and push them to a collector like Alloy.

Why Traefik?

Traefik is an Ingress Controller, in other words, the gateway for exposing my services to the outside world. I find it useful to be able to understand the routes and middleware called for each request. This helps me understand whether my configuration is correct or not.

For this, Traefik’s configuration is lightweight: the aim is simply to be able to read the basic traces it provides.

To send the traces in OTLP format, here is an extract of the settings to adopt:

## Tracing
# -- https://doc.traefik.io/traefik/observability/tracing/overview/
tracing:  # @schema additionalProperties: false
  # -- Enables tracing for internal resources. Default: false.
  addInternals: true
  otlp:
    # -- See https://doc.traefik.io/traefik/v3.0/observability/tracing/opentelemetry/
    enabled: true
    http:
      # -- Set to true in order to send metrics to the OpenTelemetry Collector using HTTP.
      enabled: true
      # -- Format: <scheme>://<host>:<port><path>. Default: http://localhost:4318/v1/metrics
      endpoint: "http://alloy.observability:4318/v1/traces"

Unsurprisingly, we fill in the HTTP endpoint exposed by Alloy.

In addition, the addInternals: true setting enables tracing of all Traefik’s internal layers, which is very useful when configuring a set of middleware.

Finally, in order to generate traffic and have traces, we can use a service in NodePort mode to simplify the use case:

service:
  enabled: true
  ## -- Single service is using `MixedProtocolLBService` feature gate.
  ## -- When set to false, it will create two Service, one for TCP and one for UDP.
  type: NodePort
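
One small note before deploying: the Traefik chart lives in its own Helm repository, not in Grafana’s. If it isn’t configured yet, add it first (this is the official chart source at the time of writing):

helm repo add traefik https://traefik.github.io/charts
helm repo update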

Once again, Helm is involved in deploying the Traefik chart:

helm install traefik traefik/traefik --version 34.4.1 --namespace ingress --create-namespace --values ./values-traefik.yaml

The tools are now correctly deployed and, above all, operational! As you can see:

$ kubectl -n observability get po
NAME                                   READY   STATUS    RESTARTS      AGE
alloy-d7649                            2/2     Running   0             72s
grafana-5b5dd98f75-dpdlm               1/1     Running   0             67s
tempo-compactor-5856cfc4b6-vf9sm       1/1     Running   3 (94s ago)   2m1s
tempo-distributor-776dd495cc-k2vdl     1/1     Running   3 (97s ago)   2m1s
tempo-ingester-0                       1/1     Running   3 (86s ago)   2m1s
tempo-ingester-1                       1/1     Running   3 (88s ago)   2m1s
tempo-ingester-2                       1/1     Running   3 (89s ago)   2m1s
tempo-memcached-0                      1/1     Running   0             2m1s
tempo-minio-568d558987-rvvjp           1/1     Running   0             2m1s
tempo-querier-6b7fb8f848-2ztqs         1/1     Running   3 (88s ago)   2m1s
tempo-query-frontend-685fcd8fb-fl8bg   1/1     Running   3 (97s ago)   2m1s
$ kubectl -n ingress get po
NAME                       READY   STATUS    RESTARTS   AGE
traefik-59c8dbcb57-9gmpt   1/1     Running   0          78s

View traces
#

Go to Grafana with a port-forward:

kubectl -n observability port-forward svc/grafana 8080:80
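
To log in, retrieve the admin password generated by the chart; this assumes you kept the default behaviour where the credentials are stored in a Secret named after the release (grafana here):

kubectl -n observability get secret grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo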

Generate a few traces by accessing the Traefik service, then, without further ado, go to the Explore section of Grafana and select Tempo as the DataSource.
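
If you prefer typing queries over browsing the search form, the Tempo datasource also accepts TraceQL. A minimal example, assuming Traefik reports its spans under the default service name traefik:

{ resource.service.name = "traefik" }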

Here’s an example from my personal setup; obviously, to get this result I’ve configured a few IngressRoutes and Middleware within Traefik (a minimal example follows the screenshot):

Visualising traces in Grafana
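
For reference, here is a minimal, hypothetical IngressRoute and Middleware pair that would produce this kind of span hierarchy; names, hostnames and the target service are placeholders to adapt to your own cluster:

apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: add-prefix
  namespace: ingress
spec:
  addPrefix:
    prefix: /api
---
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: demo
  namespace: ingress
spec:
  entryPoints:
    - web
  routes:
    - match: Host(`demo.example.com`)
      kind: Rule
      middlewares:
        - name: add-prefix
      services:
        - name: demo-service
          port: 80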

A few final words
#

Tempo is the ideal tool for collecting traces of your applications. Thanks to its microservices-based architecture, it can take advantage of the scalability offered by Kubernetes while integrating seamlessly into the Grafana ecosystem.

The different formats and protocols enable Tempo to store a very wide range of data without having to use other third-party solutions.

Last but not least, it’s fairly quick to set up, thanks to ready-to-use documentation and Helm charts.
