Prometheus security

Don’t let Prometheus Steal your Fire by Andrey Polkovnychenko and Shachar Menash highlights relevant considerations and good practices for operating Prometheus securely.

Agentless monitoring

Austin Parker explains why the future of monitoring is agentless. Interesting read, check it out.

SRECon 21 talk on o11y

Take me Down to the Paradise City Where the Metric is Green and Traces are Pretty by Ricardo Ferreira. It's not just the title that rocks here ;)

Managed Grafana cross-account

My colleagues Elamaran Shanmugam and Munish Dabra provide a deep dive on how to set up an Amazon Managed Grafana cross-account data source using customer managed IAM roles.

Avoid lock-in with Otel

Vera Reynolds shows Vendor Switching With OpenTelemetry (OTel). Great argument for telemetry increasingly becoming table stakes.

Kubernetes troubleshooting

Kubernetes workload troubleshooting with metrics, logs, and traces is a nice article by Andreas Grabner showing Dynatrace in action.

Saiyam Pathak on observability trends 2021. Thanks for sharing!

Pressure Stall Information

Julien Pivotto tweets and we take note:

Load average is a confusing metric that is misleading and can have many causes. If you want to monitor resources bottlenecks on your systems, there are nowadays better tools than it in the kernel: Pressure Stall Information (PSI).

Note that PSI is available via the Prometheus node_exporter.
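
If you want to peek at the raw numbers before wiring up any exporter, here is a minimal sketch of parsing the kernel's PSI files under /proc/pressure (Linux 4.20 or later with PSI enabled). The function names and the sample values are made up for illustration; only the file format (some/full lines with avg10, avg60, avg300 percentages and a total stall time in microseconds) comes from the kernel documentation.

```python
from pathlib import Path


def parse_psi(text):
    """Parse the contents of a /proc/pressure/<resource> file.

    Returns a dict keyed by line kind ("some" or "full"), each holding
    avg10/avg60/avg300 (percent of time stalled) and total (microseconds).
    """
    result = {}
    for line in text.strip().splitlines():
        kind, *fields = line.split()
        values = {}
        for field in fields:
            key, _, raw = field.partition("=")
            # "total" is an integer counter; the averages are percentages.
            values[key] = int(raw) if key == "total" else float(raw)
        result[kind] = values
    return result


def read_psi(resource="cpu"):
    """Hypothetical helper: read PSI for "cpu", "memory" or "io" on this host."""
    return parse_psi(Path("/proc/pressure", resource).read_text())


# Illustrative sample in the kernel's format (the values are invented).
SAMPLE = (
    "some avg10=1.50 avg60=0.75 avg300=0.10 total=123456\n"
    "full avg10=0.00 avg60=0.00 avg300=0.00 total=0\n"
)

if __name__ == "__main__":
    psi = parse_psi(SAMPLE)
    print(psi["some"]["avg10"], psi["full"]["total"])
```

On a real host you would call read_psi("memory") or read_psi("io") instead of parsing the sample string. A sustained non-zero avg10 on the "some" line means at least one task was stalled waiting for that resource.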

Downsampling metrics

In A different and (often) better way to downsample your Prometheus metrics, Matvey Arye and Ante Krešić walk us through downsampling Prometheus metrics with Promscale.

That was it for this edition of the o11y newsletter. Feel free to share news items with me via Twitter. My DMs are open if you'd like to share something in private. Stay safe and hope to see you around next week!