MetricsMetrics

Metrics

How to think about metrics on Heartbeat — the contract, the file shape, the lifecycle.

A Heartbeat metric is one number per question that matters, computed on a schedule, surfaced as a tile on the cockpit, and judged against a contract every consumer surface speaks (cockpit, alerts, MCP). This section is the entry point for metric work.

The two artefacts

Each metric is one file today:

  • db/metrics/<metric_id>.sql — config (name, period, value_sql, how_to_read, methodology, sources, provisional norm_value / alert_value).

There is no per-metric Python module. Compute is centralised in api/metrics_compute.py, which executes the metric's value_sql against marts-db on every refresh tick. Per-metric Python is reserved for the day a metric needs source-API access or logic that cannot be expressed as value_sql — none exist today. See architecture for the pipeline context.

The data contract

Every metric reads from the registry / history with the same shape — six components covering observation, period, trend benchmark, and three thresholds (norm/alert/target). The full contract and the status priority live in architecture › Metric data contract.

The thing to remember when adding or tuning a metric:

  • norm_value is amber — owner commitment.
  • alert_value is red — SLA, regulatory cap, capacity ceiling. Never auto-derived.
  • target_value is the OKR stretch — display-only, paired with target_due_date.
  • typical (μ ± σ) is informational, not a threshold.

Lifecycle — the social half of a metric

A metric's life is not "ship it and forget". It moves through ten named stages, from metrics-discovery through provisional thresholds, the unverified handover state, owner assignment, and verification. Most tiles will live in unverified indefinitely — that's the expected normal state, not a quality gate. The push to verified happens naturally the first time the number crosses red and an owner needs to act on it.

See metric lifecycle for the full stage breakdown.

Naming

Short noun phrase. Window/severity qualifiers go in period or description, not in name. The 4-char code chip on the cockpit (e.g. APRC, KYCP) is the operator's day-to-day handle — it must be unique, memorable, and stay in lockstep with name. See the methodology header in db/metrics/_schema.sql for the full convention.

Tuning thresholds

norm_value (amber) and alert_value (red) must come from outside the dashboard — owner commitment, SLA, regulatory cap. Never auto-derive from data. A threshold that drifts with the data hides the very signal the threshold exists to surface.

Operators can change norm_value / alert_value / target_value / target_due_date / benchmark_mean / benchmark_stddev / verification_state / owner from the cockpit UI without a redeploy. A DB-level trigger (heartbeat.metric_registry_guard) blocks every other write path.

What's in the rest of this section

  • Lifecycle — the 10 stages.
  • Discovery — the metrics-discovery skill, the only sanctioned way to add a metric.
  • Validate — the /metric-validate skill for deep audit of one metric.