Library · 43 resources · open access

The methodology, fully open.

Tools, guides, and field notes — everything we'd want a client to read before the first call. Nothing gated, no email required.

Section · 02 · Guides

The standard library.

6 guides
Fundamentals · 8 min read

The Golden Signals: a practical primer.

Latency, Traffic, Errors, Saturation: the four signals that characterise the health of any service. What they mean, how to measure them, and the pitfalls that catch most teams.

Read guide

Alerting · 9 min read

Burn-rate alerting, properly.

Threshold alerts fire late. Burn-rate alerts fire when the budget is being consumed faster than the SLO allows. The two-window pattern, the math, and a working PromQL implementation.
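A minimal sketch of the idea, not the guide's implementation. Burn rate is the observed error ratio divided by the error budget (1 minus the SLO target), and the two-window pattern pages only when a long and a short window both exceed the threshold. The function names are hypothetical, and the 14.4 default is an assumption taken from the commonly cited case of 2% of a 30-day budget consumed in one hour (0.02 × 720 h).

```python
# Illustrative only: names and the 14.4 threshold are assumptions,
# not values taken from the guide.

def burn_rate(error_ratio: float, slo_target: float) -> float:
    """Return how many times faster than 'exactly on budget' errors accrue.

    A burn rate of 1.0 spends the whole error budget over the SLO period;
    a burn rate of 14.4 spends 2% of a 30-day budget in one hour.
    """
    budget = 1.0 - slo_target  # e.g. 0.001 for a 99.9% SLO
    if budget <= 0:
        raise ValueError("slo_target must be below 1.0")
    return error_ratio / budget


def should_page(long_window_error_ratio: float,
                short_window_error_ratio: float,
                slo_target: float,
                threshold: float = 14.4) -> bool:
    """Two-window check: page only if both windows burn above the threshold.

    The long window (say 1h) shows the burn is significant; the short
    window (say 5m) shows it is still happening right now.
    """
    return (burn_rate(long_window_error_ratio, slo_target) >= threshold
            and burn_rate(short_window_error_ratio, slo_target) >= threshold)


if __name__ == "__main__":
    # 2% errors against a 99.9% SLO is roughly a 20x burn in both windows.
    print(should_page(0.02, 0.02, slo_target=0.999))
    # The long window is still hot but the short window has recovered.
    print(should_page(0.02, 0.001, slo_target=0.999))
```

The short window is what lets the alert stop quickly once the incident is over; the long window alone would keep firing until the bad hour ages out.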

Read guide

Governance · 10 min read

Error-budget policy that survives the first P1.

An SLO without a policy is just a dashboard. The five budget states, the decision-owner per state, the ship-freeze mechanics that work in practice, plus a copy-paste template.

Read guide

Architecture · 8 min read

OTel Collector vs vendor agents.

Where each fits, the lock-in cost of getting it wrong, and the migration path away from vendor-only instrumentation. The pragmatic recommendation.

Read guide

SLIs · 7 min read

A starter SLI catalogue.

Availability, latency, throughput, saturation, quality. The indicators we deploy first on every engagement, organised by Golden Signal, with cardinality discipline and recording rules.

Read guide

SLO design · 6 min read

Stop applying 99.95% to everything.

One-size-fits-all SLOs cause toil. The four-tier model, the SLO-vs-SLA rule that's easy to get wrong, and the tier-assignment workshop that gets sign-off.

Read guide

Section · 03 · Field notes & essays

What we keep finding.

32 pieces
Engagement.start()

Want this experience applied to your stack?

The library is the methodology, fully open. The Diagnostic engagement is what tells you which parts apply to you, in what order, and on what timeline.