Prometheus intro

In Prometheus — Titan in Systems and Services Monitoring, Vevek Pandian provides a nice high-level intro.

Service mesh o11y

Observing gRPC-based Microservices on Amazon EKS running Istio by my colleague Gary A. Stafford is a really well done deep dive into the topic.

The human factor

An interesting podcast and article around the Human Factor in observability via TNS.

Otel incubates

The OpenTelemetry project reached incubation status at the CNCF. If you want to keep track of how SDKs, APIs, and protocols are doing across the different signal types and programming languages, bookmark the new project status page.

And if you want to learn Otel, check out An Introduction to OpenTelemetry or peruse the full (free) course called How to Use OpenTelemetry to Understand Software Performance by Beau Carnes.

New Chaos activity

The CNCF has kicked off a new activity around Chaos Engineering. Check out and flag your interest in the Cloud Native Chaos Engineering WG - Charter.


PostgreSQL monitoring

Nilesh Jayanandana put together a very useful list around PostgreSQL Monitoring: The Best Tools and Key Metrics to Help Improve Database Performance.

Grafana at scale

An interesting article on the Grafana blog: How Salesforce manages service health at scale with Grafana and Prometheus.

AppScope updates

Latest AppScope Updates: version 0.7 adds the ability to attach to a running process, TLS support, and Alpine Linux support.

Open Distro to OpenSearch

Sarat Vemulapalli and Andrew Hopp show how to upgrade from Open Distro to OpenSearch.

Systems Observability

Systems Observability is an interesting contribution, motivating and introducing o11y.

That was it for this edition of the o11y newsletter. Feel free to share news items with me via Twitter; my DMs are open if you'd like to share something in private. Stay safe and hope to see you around next week!