# Validate (/docs/metrics/validate)



The [`/metric-validate`](#) skill is the deep-audit counterpart to
[`metrics-discovery`](/docs/metrics/discovery). Where discovery asks
"should this metric exist?", validate asks **"does this metric's
current number deserve to be trusted, and is the metric measuring
something the team can act on?"**

Use it whenever:

* a number on the cockpit feels suspiciously good or bad,
* you're about to verify a metric and need an independent
  cross-check,
* an alert fired and you need to confirm whether the red is real
  before paging an owner,
* the underlying source contract changed (new dlt schema, renamed
  column) and the metric needs a fresh smell test.

The skill operates on **one `metric_id` at a time** and writes a
dated audit file under `docs/metric-audits/<date>-<metric_id>.md`.

## What the skill checks [#what-the-skill-checks]

Five layers, in order:

### 1. Methodology [#1-methodology]

Read the metric's `methodology`, `how_to_read`, `sources`, `period`,
and `value_sql` out of the registry. Does the SQL actually express
what the methodology comment claims? Does the methodology match the
JTBD that landed it? Discrepancies between "what the comment says"
and "what the SQL does" are the most common source of distrusted
numbers.

### 2. Live source freshness [#2-live-source-freshness]

Check `<schema>._dlt_loads.inserted_at` for the source schema. A
fresh `computed_at` on the cockpit can paper over a stale
`source_as_of` — the staleness signal exists for a reason, and
validate cross-checks it directly against the dlt loads table rather
than trusting the cached `source_as_of` on the history row.
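The cross-check logic can be sketched in a few lines. This is a minimal illustration, not the skill's actual implementation: the function name, the 24-hour limit, and the three verdict labels are all assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

# Assumed staleness threshold; a real audit would tune this per source.
STALENESS_LIMIT = timedelta(hours=24)

def freshness_verdict(dlt_inserted_at: datetime,
                      cached_source_as_of: datetime,
                      now: datetime) -> str:
    """Compare the live dlt-loads timestamp against the cached source_as_of.

    Returns "fresh", "stale", or "cache-drift" (the cached row claims a
    newer source than the dlt loads table actually shows).
    """
    if cached_source_as_of > dlt_inserted_at:
        return "cache-drift"
    if now - dlt_inserted_at > STALENESS_LIMIT:
        return "stale"
    return "fresh"

now = datetime(2025, 1, 15, 12, 0, tzinfo=timezone.utc)
print(freshness_verdict(now - timedelta(hours=2), now - timedelta(hours=3), now))    # → fresh
print(freshness_verdict(now - timedelta(hours=30), now - timedelta(hours=30), now))  # → stale
```

The "cache-drift" branch is the interesting one: it catches exactly the case the paragraph above describes, a history row whose cached `source_as_of` no longer agrees with the loads table.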

### 3. Independent cross-check rebuild [#3-independent-cross-check-rebuild]

Re-derive the same number in a *different* way — typically:

* a hand count against the team's source-of-truth tables (webbank
  `core.client_profile`, `client_profile_state_log`, etc.),
* a query that reaches the same answer through a different join
  path,
* a Jasper-rendered report covering the same concept.

If the rebuild disagrees with `metric_history`, the audit file
documents the gap and points at the SQL line responsible. If the
rebuild agrees, the audit file documents the match and the
methodology earns the right to claim "verified by cross-check".
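The agree/disagree decision reduces to a tolerance comparison. A minimal sketch, assuming a relative tolerance of 0.5%; a real audit would pick the tolerance per metric, and a pure count metric usually wants an exact match:

```python
def cross_check(history_value: float, rebuild_value: float,
                rel_tol: float = 0.005) -> dict:
    """Compare the cached metric_history number against an independent rebuild."""
    gap = rebuild_value - history_value
    denom = max(abs(history_value), 1e-9)  # avoid dividing by a zero baseline
    return {"agrees": abs(gap) / denom <= rel_tol, "gap": gap}

# A hand count of 1203 clients against a cached 1200 is a 0.25% gap:
print(cross_check(1200, 1203))  # → {'agrees': True, 'gap': 3}
```

When `agrees` is false, the `gap` value goes straight into the audit file next to the SQL line suspected of producing it.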

### 4. Business semantics of red [#4-business-semantics-of-red]

The most important layer, and the one that distinguishes validate
from "the SQL runs":

* What does **red** on this metric actually mean in business terms?
  ("4 minutes of payment processing time" — meaningful? "22%
  approval rate" — wrong-low or right-low?)
* Is the *direction* right? An amber on a "passing rate" metric
  should mean "fewer pass than we want", not "more". Sign mistakes
  in `norm_value` / `alert_value` look like correct numbers until
  somebody asks why nothing ever fires.
* Does the period match the action horizon? A `30d` window on a
  metric whose owner can only act on a `24h` signal is a misdesign.
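The direction bullet is easy to check mechanically. A sketch under one assumption: that the ordering of `norm_value` and `alert_value` encodes the metric's direction (thresholds below the norm mean lower is worse, as on a passing rate; thresholds above it mean higher is worse, as on a processing time). The zone names are illustrative.

```python
def zone(value: float, norm_value: float, alert_value: float) -> str:
    """Classify a reading, inferring direction from the threshold order."""
    lower_is_worse = alert_value < norm_value
    if lower_is_worse:
        if value <= alert_value:
            return "red"
        return "amber" if value < norm_value else "green"
    if value >= alert_value:
        return "red"
    return "amber" if value > norm_value else "green"

# Passing rate: norm 90%, alert 70%.
print(zone(0.95, 0.9, 0.7))  # → green
print(zone(0.60, 0.9, 0.7))  # → red
```

The sign mistake the bullet describes is visible here: swap the thresholds to `norm_value=0.7, alert_value=0.9` and a 60% passing rate classifies as green, which is exactly the "nothing ever fires" failure mode.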

### 5. Action surface [#5-action-surface]

If the metric went red right now, what would the owner do? If the
answer is "open a ticket", the metric is closer to a vanity counter
than to a cockpit tile. The audit file calls this out — not as a
reason to delete the metric, but as a reason to revisit the JTBD or
fix the routing.

## Outputs [#outputs]

The skill writes one file:

```
docs/metric-audits/<YYYY-MM-DD>-<metric_id>.md
```

Format:

* **Verdict.** Pass / pass-with-caveat / fail.
* **What the metric claims to measure.**
* **What it actually measures** (after reading the SQL).
* **Live freshness.** `source_as_of`, `computed_at`, dlt-loads
  cross-check.
* **Cross-check rebuild.** SQL or hand-count, plus the number.
* **Red zone semantics.** What red means in EUR / hours / clients.
* **Action.** What the owner does on red.
* **Recommendation.** Verify / fix SQL / fix thresholds / retire.

The file is committed to the repo as a permanent record. Even if the
metric is later retired, its audit trail stays.
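Put together, an audit file following that format might look like the sketch below. The metric name and every number in it are illustrative, not real audit results.

```
# Audit 2025-01-15: payment_processing_time

* **Verdict.** Pass-with-caveat.
* **Claims to measure.** p95 end-to-end payment processing time over 7d.
* **Actually measures.** p95 over completed payments only; `value_sql`
  filters out abandoned attempts.
* **Live freshness.** `source_as_of` 06:02, `computed_at` 06:10,
  dlt-loads cross-check consistent.
* **Cross-check rebuild.** Hand count via a different join path;
  matches within 0.3%.
* **Red zone semantics.** Red means p95 above 4 minutes.
* **Action.** Owner checks the processing queue backlog.
* **Recommendation.** Fix SQL to include abandoned attempts, then verify.
```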

## When validate is the right hammer [#when-validate-is-the-right-hammer]

Validate is for **one metric at a time**, deeply. It is *not* the
right tool for:

* Daily triage of the cockpit — that's
  [`heartbeat-today`](#).
* Deciding whether to add a new metric — that's
  [`metrics-discovery`](/docs/metrics/discovery).
* Routine threshold tuning — that's the inline editor on the cockpit.
* Bulk audit of every metric — open a candidates file and run
  validate on each one in sequence; there is no batch mode.

## See also [#see-also]

* [Metric lifecycle › stage 8](/docs/metrics/lifecycle#8-owner-review)
  — validate is the mechanical version of the owner review step.
* [Sources › webbank first](/docs/sources#source-priority-—-webbank-first)
  — the rule the cross-check rebuild defers to.
