Observability

Observability in Distributed Systems: From Monitoring to Understanding

The Exitless Maze: A New Analogy for Observability
Imagine you are a brilliant architect responsible for a huge, intricate building full of complex systems: heating, ventilation, lighting, security, elevators. You have installed sensors everywhere: every temperature, every pressure, every watt of energy consumed is recorded. Your control dashboards overflow with charts and data; every parameter is monitored to perfection, and every line is green and reassuring. You know exactly what is happening in every single corner of the building.

Tuesday, July 29, 2025 | 21 minutes Read

OpenTelemetry: Anatomy of Observability in Distributed Systems

The Pre-OpenTelemetry Fragmentation Problem
Before the advent of OpenTelemetry, the observability ecosystem was a maze of protocols, APIs, and proprietary formats. Each vendor had developed its own “dialect”:
- Jaeger used its own span format and ingestion protocol.
- Zipkin had a different data model and specific REST APIs.
- Prometheus required metrics in a specific format with rigid naming conventions.
- AWS X-Ray, Google Cloud Trace, and Azure Monitor each came with proprietary SDKs.
This fragmentation created systemic problems:

Tuesday, July 29, 2025 | 10 minutes Read

The LGTM Stack and OpenTelemetry: Complete Observability for Your Distributed Systems

We have explored the principles of observability and the fundamental role of OpenTelemetry as a unifying standard for telemetry. OpenTelemetry gives us the tools to generate and collect high-quality data (metrics, logs, and traces) in a vendor-agnostic and consistent format. But once these valuable signals have been collected, where are they stored, queried and, most importantly, displayed in a meaningful way? This is where the LGTM stack comes into play: a combination of open source tools, developed and primarily supported by Grafana Labs, that forms a complete and integrated observability solution.

Tuesday, July 29, 2025 | 9 minutes Read

Introduction to performance analysis: from theory to practice

Introduction
What, how, why
Performance testing is a fundamental activity in the software development lifecycle, but it is often underestimated or performed suboptimally. In this guide, we will explore the theoretical and practical foundations needed to approach performance analysis effectively, from definition and objectives to the most effective measurement methods.
Definition
Performance testing is the process of determining the responsiveness, throughput, reliability, and scalability of a system under a given workload. Note that the “system” refers to the interaction of different components, not to a single isolated part. Sometimes a performance issue can be resolved simply by moving the problematic block to another subsystem.

Saturday, July 26, 2025 | 10 minutes Read