Going for Audit

nuMetrix Audit Readiness Assessment

Where we stand today, what's missing,
and how to get to 100% auditability.

February 2026

The Question

Can an external auditor trace any finding
back to its source data, understand why it was flagged,
and verify the rules haven't changed since?

What auditors need

  • Complete data lineage (source to finding)
  • Deterministic, reproducible results
  • Versioned rules with change history
  • Temporal tracking (when did this appear?)
  • Exception trail (who reviewed what?)

What nuMetrix provides

  • Source file + row number on every record
  • MD5 deterministic finding IDs
  • Probe version strings on findings
  • No temporal tracking yet
  • No exception management yet

Overall Readiness

3.4 / 5

Strong analytical foundations.
Weak operational audit trail.

7 of 12 capabilities scored 4 or 5 — the data pipeline is solid.
5 capabilities scored 1 or 2 — the operational layer needs work.

Maturity Matrix

Capability                 Score  Evidence
Source lineage               5    source_file + row_number on all 9 Bronze models
Validation trail             5    is_valid + invalid_reason per Silver row; 34 rules documented
Validation rule registry     5    lineage_validation_rules + lineage_validation_counts queryable
Finding determinism          5    MD5(probe_id | tenant_id | entity_id | time_bucket)
Evidence chain               4    Probes → Hypotheses → Diagnoses fully linked
Pipeline metrics             4    Row counts per layer; no timestamps
Audit reports (PDF)          4    4 report types × 3 languages, stored in DuckDB
Finding lifecycle            2    Rebuilt each run; no created_at or state tracking
Rule version history         2    Version string exists; no changelog
Column-level lineage         2    Implicit in macros; not queryable
Execution metadata           1    No run_id, no executed_at on findings
Exception management         1    No false-positive marking, no review trail

What's Already Solid

Data Pipeline

  • No silent filtering — invalid rows flagged, not dropped, until Gold
  • source_file + row_number on every Bronze row — full CSV traceability
  • Surrogate keys via dbt_utils — deterministic, reproducible
  • 34 validation rules documented in lineage_validation_rules
  • Pipeline metrics — row counts at every layer, loss percentages

Diagnostic System

  • Deterministic finding IDs — MD5 hash, idempotent per run
  • Evidence chain — Probes → Hypotheses → Diagnoses linked
  • Weighted verdicts — primary/supporting/context/counter roles
  • 4 contracts — Gold, Silver, Findings, Diagnosis (all versioned)
  • PDF audit reports — data quality, financial risk, entity health, analytics readiness
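The deterministic finding IDs above follow the pattern named in the maturity matrix: MD5 over probe_id | tenant_id | entity_id | time_bucket. A minimal sketch of that idea (field order and the "|" separator are taken from the matrix row; anything beyond that is an assumption):

```python
import hashlib

def finding_id(probe_id: str, tenant_id: str, entity_id: str, time_bucket: str) -> str:
    # Join the four key fields and hash. The field order and "|" separator
    # follow the maturity-matrix description; the real compiler may differ.
    key = "|".join([probe_id, tenant_id, entity_id, time_bucket])
    return hashlib.md5(key.encode("utf-8")).hexdigest()

# Re-running the same probe on the same entity and time bucket yields the
# same ID, which is what makes findings idempotent per run.
a = finding_id("probe_revenue_leakage", "t-001", "mat-0042", "2026-02")
b = finding_id("probe_revenue_leakage", "t-001", "mat-0042", "2026-02")
```

Because the ID is a pure function of its inputs, two runs over the same data produce identical finding IDs, and an auditor can recompute any ID from the record itself.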

The Audit Trail Today

Every layer is traceable to the one below it. The chain is complete — but only as a snapshot.

Diagnosis
↓ gates on confirmed hypothesis
Hypothesis Verdicts
↓ aggregates evidence from probes
Probe Findings
↓ queries Gold entities
Gold (valid rows only)
↓ filters is_valid = true
Silver (all rows + is_valid + invalid_reason)
↓ validates + surrogate keys
Bronze (source_file + row_number)
↓ reads CSV
Source CSVs (tenants/{id}/{system}/csv/)

Gap Analysis

Tier 1: Must-have for audit

  • Execution metadata — No run_id or timestamp on findings. Can't answer "when was this finding generated?"
  • Finding lifecycle — Findings rebuilt each run. Can't track when a finding first appeared or was resolved.
  • Score transparency — Hypothesis verdict shows 0.65 but not which probes contributed what weight.

Tier 2: Should-have for external audit

  • Rule change history — Probe YAML changes not tracked in queryable form.
  • Exception management — No way to mark findings as false-positive or reviewed.
  • Evidence schema — Evidence JSON unstructured; varies by probe type.

Tier 3: Nice-to-have

  • Column-level lineage — Source-to-target column mapping not queryable (implicit in macros).
  • Finding → source rows — Can't trace a finding back to specific Bronze rows.
  • Cross-entity impact — A bad material record affects billing, usage, and cases — no unified view.

Key insight: The analytical layer (what happened, why) is strong. The operational layer (when, by whom, status changes) is almost absent.

Phase 1: Execution Context

Low effort

Add run_id + timestamp to every finding

Modify the probe compiler (probecompile.py) to inject two columns into every generated SQL model:
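A minimal sketch of the injection, assuming the compiler can wrap the generated SELECT (the wrapper approach, and leaving type casts to the findings contract, are assumptions about how probecompile.py works):

```python
import uuid
from datetime import datetime, timezone

def inject_execution_context(model_sql: str, run_id: str, executed_at: str) -> str:
    # Wrap the compiled probe SELECT so every emitted finding row also
    # carries run_id and executed_at. Wrapping the query (rather than
    # editing its SELECT list in place) is an assumption for this sketch.
    return (
        "select f.*,\n"
        f"       '{run_id}' as run_id,\n"
        f"       '{executed_at}' as executed_at\n"
        f"from (\n{model_sql}\n) as f"
    )

run_id = str(uuid.uuid4())
executed_at = datetime.now(timezone.utc).isoformat()
sql = inject_execution_context("select * from gold_entities", run_id, executed_at)
```

One run_id per pipeline invocation, stamped on every generated model, is what lets a later query group all findings produced together.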

Same change to hypothesiscompile.py and diagnosiscompile.py.

What this unlocks: every finding can answer "when was this generated, and by which run?"

Files: scripts/probecompile.py, scripts/hypothesiscompile.py, scripts/diagnosiscompile.py, contracts/findings_contract.v1.json

Phase 2: Score Transparency

Low effort

Show the math behind every verdict

Add an evidence_breakdown JSON column to hypothesis_verdicts:

[
  {"probe_id": "probe_revenue_leakage",  "role": "primary",    "weight": 3, "findings": 42, "signal": 1.0,  "contribution": 0.43},
  {"probe_id": "probe_orphan_billing",   "role": "supporting", "weight": 2, "findings": 18, "signal": 0.89, "contribution": 0.25},
  {"probe_id": "probe_duplicate_billing", "role": "context",   "weight": 1, "findings": 0,  "signal": 0.0,  "contribution": 0.0}
]
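One way the contribution column could be derived, assuming contribution = weight × signal / total weight. The actual nuMetrix normalizer is not documented here (the example JSON above implies a different total), so treat this purely as a sketch of the shape of the computation:

```python
def evidence_breakdown(probes):
    # Assumed scoring: contribution = weight * signal / total_weight.
    # The real weighting scheme may normalize differently.
    total_weight = sum(p["weight"] for p in probes) or 1
    return [
        {**p, "contribution": round(p["weight"] * p["signal"] / total_weight, 2)}
        for p in probes
    ]

rows = evidence_breakdown([
    {"probe_id": "probe_revenue_leakage",   "role": "primary",    "weight": 3, "signal": 1.0},
    {"probe_id": "probe_orphan_billing",    "role": "supporting", "weight": 2, "signal": 0.89},
    {"probe_id": "probe_duplicate_billing", "role": "context",    "weight": 1, "signal": 0.0},
])
verdict_score = round(sum(r["contribution"] for r in rows), 2)
```

The point is that the verdict score becomes a sum of visible per-probe terms, so an auditor can recompute it from the breakdown alone.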

Same for diagnosis_verdicts: add confidence_breakdown showing base + each conditional boost.

Explorer hypothesis detail page renders the breakdown as a table.

Phase 3: Finding Lifecycle

Medium effort

Track when findings appear, persist, and resolve

New dbt incremental model: finding_snapshots
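The merge behavior the incremental model would need, sketched in Python rather than dbt SQL. The state names ("new" / "active" / "resolved") and the first_seen/last_seen columns are assumptions; the real model would express this as a merge on finding_id:

```python
from datetime import date

def merge_snapshots(snapshots, current_finding_ids, run_date):
    # Findings present in this run: bump last_seen, keep first_seen.
    for fid in current_finding_ids:
        if fid in snapshots:
            snapshots[fid]["last_seen"] = run_date
            snapshots[fid]["state"] = "active"
        else:
            snapshots[fid] = {"first_seen": run_date, "last_seen": run_date, "state": "new"}
    # Findings previously seen but absent from this run: mark resolved.
    for fid, row in snapshots.items():
        if fid not in current_finding_ids:
            row["state"] = "resolved"
    return snapshots

snaps = merge_snapshots({}, {"f1", "f2"}, date(2026, 2, 1))
snaps = merge_snapshots(snaps, {"f2"}, date(2026, 2, 2))
```

Because finding IDs are deterministic, the same underlying issue maps to the same row across runs, which is what makes this merge meaningful.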

What this unlocks: first_seen, last_seen, and state per finding, so a finding's appearance, persistence, and resolution become queryable.

Requires: dbt incremental materialization (merge strategy on finding_id). New pattern for the project.

Phase 4: Exception Management

Medium effort

Let humans annotate findings

New table: finding_exceptions
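An illustrative shape for the table. The column names are assumptions derived from the exception-trail requirement (who reviewed what, when, and why), and sqlite3 stands in for DuckDB so the sketch is self-contained; all values are invented examples:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    create table finding_exceptions (
        finding_id  text not null,
        status      text not null,   -- e.g. 'false_positive', 'reviewed'
        reviewed_by text not null,
        reviewed_at text not null,
        reason      text
    )
""")
conn.execute(
    "insert into finding_exceptions values (?, ?, ?, ?, ?)",
    ("4f2a9c...", "false_positive", "auditor@example.com",
     "2026-02-10T09:00:00Z", "Test tenant, not production billing"),
)
rows = conn.execute(
    "select finding_id, status, reviewed_by from finding_exceptions"
).fetchall()
```

Keeping exceptions in their own table leaves the findings themselves immutable, so the original analytical output and the human review trail stay separately auditable.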

Explorer changes: let reviewers mark a finding as false-positive or reviewed, recording who, when, and why.

This is the bridge from "analytics tool" to "audit workflow."

Roadmap to 5/5

Phase 1  Execution context      3.4 → 3.8
Phase 2  Score transparency     3.8 → 4.2
Phase 3  Finding lifecycle      4.2 → 4.6
Phase 4  Exception management   4.6 → 5.0

Phases 1+2

Compiler-only changes

No new dbt models. No Explorer changes.

Phases 3+4

New dbt models + Explorer write

Incremental materialization. API endpoints.

Summary

Today (3.4 / 5)

  • Full source-to-Gold lineage
  • No silent filtering anywhere
  • Deterministic finding IDs
  • Evidence chain: Probes → Hypotheses → Diagnoses
  • PDF audit reports in 3 languages
  • Calibration harness with recall metrics

After 4 phases (5.0 / 5)

  • Every finding timestamped with run_id
  • Full score transparency on verdicts
  • Finding lifecycle: first_seen, age, state
  • Exception trail: who reviewed, when, why
  • Auditor can reconstruct any finding's full history

The data tells the truth.
Now we make the trail visible.