In Prometheus — Titan in Systems and Services Monitoring, Vevek Pandian provides a nice high-level intro.
Service mesh o11y¶
Observing gRPC-based Microservices on Amazon EKS running Istio by my colleague Gary A. Stafford is a really well-done deep dive into the topic.
The human factor¶
An interesting podcast and article around the Human Factor in observability via TNS.
The OpenTelemetry project reached incubation status at CNCF, and if you want to keep track of how SDKs, APIs, and protocols are doing across the different signal types and programming languages, bookmark the new project status page.
And if you want to learn OTel, check out An Introduction to OpenTelemetry or peruse the full (free) course called How to Use OpenTelemetry to Understand Software Performance by Beau Carnes.
New Chaos activity¶
The CNCF has kicked off a new activity around Chaos Engineering. Check out and flag your interest in the Cloud Native Chaos Engineering WG - Charter.
Nilesh Jayanandana put together a very useful list around PostgreSQL Monitoring: The Best Tools and Key Metrics to Help Improve Database Performance.
Grafana at scale¶
An interesting article on the Grafana blog: How Salesforce manages service health at scale with Grafana and Prometheus.
Latest AppScope Updates: version 0.7 adds the ability to attach to a running process, TLS support, and Alpine Linux support.
Open Distro to OpenSearch¶
Sarat Vemulapalli and Andrew Hopp show how to upgrade from Open Distro to OpenSearch.
Systems Observability is an interesting contribution, motivating and introducing o11y.
That was it for this edition of the o11y newsletter. Feel free to share news items with me via Twitter; DMs are open if you'd like to share something in private. Stay safe and hope to see you around next week!