Method · 12 weeks · 4 phases

Diagnose. Define. Instrument. Operate.

A repeatable engineering protocol — not a deck. We treat observability like the production system it actually is: governed, versioned, and tested under load before handover.

01 Operating principles

Four constraints we apply to every engagement.

Most observability disasters trace back to the same handful of architectural shortcuts. We refuse them.

filter_alt

Signal before tooling.

Every probe must answer a question someone is paid to ask. We delete dashboards more often than we add them.

schema

Topology before metrics.

If you don't know your service graph, no SLO will save you. We map the boundary before we instrument the box.

savings

Cardinality is a budget.

Six-figure observability bills are a design failure, not a vendor problem. We treat $/series as a first-class metric.

engineering

Operate, don't just deploy.

The handover only counts if your team can run the system through a real P1 — so we run two before we leave.

02 Protocol

The four phases of an engagement.

Linear, fixed-fee, with named deliverables at each gate. You can fire us at the end of any phase.

image

Asset pending

12-week engagement timeline showing the four phases — Diagnose, Define, Instrument, Operate — across weeks 1 to 12, with deliverable milestones and exit gates marked at week 4 and week 8.

Horizontal Gantt schematic on paper-white #f7f9fb. Four phase bars stacked across a 12-week timeline: 'DIAGNOSE' (wk 1–2), 'DEFINE' (wk 3–4), 'INSTRUMENT' (wk 5–8), 'OPERATE' (wk 9–12). Phase bars filled obsidian #191c1e with phase name in JetBrains Mono caps inside, white text. Above: tick marks at each week with mono 'WK 01'–'WK 12' labels in #727687. Below each phase: 3–4 deliverable bullets ('MATURITY PROFILE', 'OTEL CONFIG', 'RUNBOOKS') as small obsidian text with electric-blue #0066FF square markers. Two vertical 1px electric-blue lines at week 4 and week 8 with 'GATE · EXIT POINT' labels. 16:9, generous whitespace, 1px borders, no fills other than the obsidian phase bars and the electric-blue accents.

/img/approach/engagement-timeline.png

Week 1

Diagnose

We baseline your telemetry hierarchy, ingest pipeline, alert load, and SLO discipline. No tools deployed yet — we're listening for the structural problem under the symptoms.

description Maturity profile

description Cardinality audit

description Alert noise heatmap

description Stakeholder interview log

Week 2–3

Define

We rewrite the data hierarchy to mirror your business logic. Services map to user journeys; metrics map to outcomes. SLOs are defined against revenue-bearing flows, not infra vanity.

description Telemetry schema (v1)

description SLO catalogue

description Trace coverage plan

description Cost target sheet

Week 4–8

Instrument

We deploy OpenTelemetry where it scales, eBPF where it matters, and turn off what's not earning its keep. Your engineers pair on every change — there is no black-box rollout.

description OTel collector config

description eBPF probe set

description Dashboard inventory

description On-call runbooks

Week 9–12

Operate

We run two real incidents with your team. We tune. We hand over the keys. Your team owns the system — we stay on retainer for the questions that come up at 3am.

description Incident retros (×2)

description 90-day roadmap

description Capacity review

description Handover doc set

03 Boundaries

What we won't do.

The cleanest way to be useful is to be honest about scope. These are the things you should hire someone else for.

block Long-term staff augmentation — we leave after handover.
block Vendor-led implementations where the tool is pre-decided.
block Compliance theatre — checkbox observability for audit only.
block Greenfield platform builds with no production traffic to learn from.

Engagement.start()

Want to see the protocol on your stack?

A 30-minute call. We'll tell you whether you need us — half the time the answer is no.

Book a diagnostic Compare engagements