nuMetrix for
Data Engineers

Inside the dbt project: medallion layers, dispatch macros,
YAML-compiled probes, and multi-tenant DuckDB

March 2026

What You’ll See Today

  • Act I — The four-layer pipeline: Bronze → Silver → Gold → Platform
  • Act II — Source-system dispatch and schema-per-tenant isolation
  • Act III — The probe compiler: YAML DSL → dbt SQL → findings tables
  • Act IV — Orchestration, rebuild scripts, and quality testing
  • Act V — Design decisions and the road ahead
This is the “how” deck. Real code snippets. Real YAML. Real SQL. If you want the “what and why”, see An Introduction to nuMetrix.

122 Models, One dbt Project

  • 14 Bronze tables
  • 14 Silver tables
  • 22 Gold views + tables
  • 29 Platform views
  • Probe tables
  • 9 Metrics + Debug + Lineage
Full rebuild (5 tenants, 68K–150K cases each): ~2 minutes. Single-threaded DuckDB. No Spark. No warehouse.

The Pipeline

Four layers. Strict responsibilities. No leaking.

Bronze / Silver / Gold / Platform

CSV → Bronze (structure) → Silver (validation) → Gold (contract) → Platform (union)
  • Bronze — Column mapping only. One macro call per entity. Materialized as TABLE.
  • Silver — Surrogate keys, type casting, is_valid + invalid_reason. TABLE.
  • Gold — WHERE is_valid = true. Versioned contract. VIEW.
  • Platform — UNION ALL across tenants with tenant_id. VIEW.

Bronze: One Macro Call

models/bronze/bronze_cases.sql

{{ config(materialized='table') }}
{{ bronze_cases_query() }}
  • Every Bronze model is a two-line file. All logic lives in the dispatch macro.
  • The macro reads CSVs with DuckDB’s read_csv() — no COPY, no staging tables.
  • Adds source_file and row_number for full lineage traceability.
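The lineage idea can be sketched in plain Python: every ingested row gains a source_file and row_number column, mirroring what the dispatch macro adds via DuckDB's read_csv. The helper name read_with_lineage and the sample data are illustrative, not part of the project.

```python
import csv
import io

def read_with_lineage(file_name: str, text: str) -> list[dict]:
    """Parse CSV text and attach lineage columns to every row."""
    rows = []
    for i, row in enumerate(csv.DictReader(io.StringIO(text)), start=1):
        row["source_file"] = file_name  # which file the row came from
        row["row_number"] = i           # 1-based position within that file
        rows.append(row)
    return rows

sample = "case_token,case_type\nA-1,inpatient\nA-2,outpatient\n"
bronze = read_with_lineage("cases.csv", sample)
```

With these two columns on every Bronze row, any downstream finding can be traced back to an exact line in an exact CSV.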

Silver: Validate Everything

models/silver/silver_cases.sql (abridged)

with source as (
    select * from {{ ref('bronze_cases') }}
),
validated as (
    select
        {{ dbt_utils.generate_surrogate_key(['source_file', 'row_number']) }} as case_key,
        trim(case_token) as case_token,
        cast(start_time as timestamp) as start_time,
        trim(lower(case_type)) as case_type,
        -- ... trimming, casting, nullif ...
        case
            when trim(case_token) is null then false
            when trim(case_type) not in (select code from {{ ref('enum_case_type') }}) then false
            else true
        end as is_valid
        -- mirror logic for invalid_reason
    from source
)
select * from validated
Every Silver model follows the same pattern: surrogate key → trim/cast → is_valid → invalid_reason. Enum validation against dbt seeds. No silent drops.
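The same pattern, sketched as a tiny Python function for illustration (the real logic lives in dbt SQL; the "|" separator in the surrogate key and the validate_case name are assumptions here):

```python
import hashlib

# Mirrors the enum_case_type dbt seed.
CASE_TYPES = {"inpatient", "outpatient", "day-case"}

def validate_case(row: dict) -> dict:
    """Surrogate key -> trim/cast -> is_valid -> invalid_reason."""
    key = hashlib.md5(f"{row['source_file']}|{row['row_number']}".encode()).hexdigest()
    token = (row.get("case_token") or "").strip() or None
    ctype = (row.get("case_type") or "").strip().lower() or None
    if token is None:
        valid, reason = False, "missing case_token"
    elif ctype not in CASE_TYPES:
        valid, reason = False, "unknown case_type"
    else:
        valid, reason = True, None
    return {"case_key": key, "case_token": token, "case_type": ctype,
            "is_valid": valid, "invalid_reason": reason}

good = validate_case({"source_file": "cases.csv", "row_number": 1,
                      "case_token": " A-1 ", "case_type": "Inpatient"})
bad = validate_case({"source_file": "cases.csv", "row_number": 2,
                     "case_token": "  ", "case_type": "inpatient"})
```

Note that the invalid row is returned, not dropped: it survives into Silver with a reason attached.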

Gold: The Product Contract

models/gold/gold_cases.sql

select
    case_key,
    case_token,
    start_time,
    end_time,
    cost_center_id,
    case_type,
    drg_code,
    drg_description
from {{ ref('silver_cases') }}
where is_valid = true
  • Gold is the API surface. Every downstream consumer (probes, Explorer, Evidence) references Gold only.
  • Explicit column list — no SELECT *. If a column isn’t in Gold, it doesn’t exist downstream.
  • Layer responsibilities: Bronze=structure, Silver=domain truth, Gold=consumption.

Data Contracts as JSON

contracts/gold_contract.v1.json (abridged)

{
  "version": "gold.v1",
  "entities": {
    "Case": { "dbt_model": "gold_cases", "primary_key": "case_key" },
    "Material": { "dbt_model": "gold_materials", "primary_key": "material_key" },
    "BillingEvent": { "dbt_model": "gold_billing_events", "primary_key": "billing_key" },
    "CaseMaterialUsage": { "dbt_model": "gold_case_material_usage", ... }
    // ... 20 entities total
  }
}
  • Probes reference Case, not gold_cases. The compiler resolves contract names to dbt models.
  • 20 entities form the Gold contract: 7 core + 3 taxonomy + 10 extended (suppliers, classifications, etc.).
  • Contract is versioned. Breaking changes require a new version.
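Resolution from contract names to dbt models can be sketched as a dict lookup that fails loudly on unknown entities (the resolve helper is hypothetical; the entity map is abridged from the contract above):

```python
import json

# Abridged copy of gold_contract.v1.json.
CONTRACT = json.loads("""{
  "version": "gold.v1",
  "entities": {
    "Case": {"dbt_model": "gold_cases", "primary_key": "case_key"},
    "BillingEvent": {"dbt_model": "gold_billing_events", "primary_key": "billing_key"}
  }
}""")

def resolve(entity: str) -> str:
    """Map a contract entity name to its dbt model, or fail loudly."""
    try:
        return CONTRACT["entities"][entity]["dbt_model"]
    except KeyError:
        raise ValueError(f"entity {entity!r} not in contract {CONTRACT['version']}")

model = resolve("Case")
```

Failing at compile time on an unknown entity is what makes the contract a CI gate rather than a convention.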

Quality Is Queryable

models/metrics/silver_quality_metrics.sql (abridged)

with entity_counts as (
    select
        'cases' as entity,
        count(*) as total_rows,
        count(*) filter (where not is_valid) as invalid_rows,
        count(*) filter (where is_valid) as valid_rows
    from {{ ref('silver_cases') }}
    union all
    select 'billing_events', count(*), ... from {{ ref('silver_billing_events') }}
    -- ... 9 entities total
)
select
    entity,
    total_rows,
    invalid_rows,
    valid_rows,
    round(100.0 * invalid_rows / nullif(total_rows, 0), 2) as invalid_pct
from entity_counts
Data quality is a dbt model, not a dashboard annotation. Silver probes can query what Gold filtered out. Every invalid row has a reason.

Dispatch & Multi-Tenancy

Three ERPs, five hospitals, one codebase

Three ERPs, One Schema

OPALE
Swiss hospital ERP
cases.csv → case_token
SAP MM
German standard
aufk.csv → AUFNR
Navision
Microsoft Dynamics NAV
service_order.csv → No_
  • Each source system has its own column names, file names, and enum conventions.
  • Bronze dispatch macros translate once at the boundary. Silver and Gold never know which ERP the data came from.

Source-System Dispatch

macros/source_system_columns.sql (cases, abridged)

{% macro bronze_cases_query() %}
{% set source_system = resolve_source_system() %}
{% set path = '../../tenants/' ~ tenant ~ '/' ~ source_system ~ '/csv/' %}
{% if source_system == 'opale' %}
select
    case_token,
    start_time,
    cost_center_id,
    case_type,
    ...
from read_csv('{{ path }}cases.csv', header=true, all_varchar=true)
{% elif source_system == 'sap' %}
select
    AUFNR as case_token,
    ERDAT as start_time,
    KOSTL as cost_center_id,
    case when AUART = 'ZOP1' then 'inpatient' ... end as case_type
from read_csv('{{ path }}aufk.csv', header=true)
{% elif source_system == 'navision' %}
select
    "No_" as case_token,
    Starting_Date_Time as start_time,
    ...
from read_csv('{{ path }}service_order.csv', header=true)
{% endif %}
{% endmacro %}
One macro per entity × one if/elif branch per source system. Adding a 4th ERP = adding an elif block. Zero changes to Silver/Gold.
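The translate-once-at-the-boundary idea, reduced to a Python analogue (the column maps are abridged and the to_canonical helper is illustrative; the real translation happens in the Jinja macro above):

```python
# Per-source column maps: canonical name -> source-specific column.
COLUMN_MAPS = {
    "opale":    {"case_token": "case_token", "start_time": "start_time"},
    "sap":      {"case_token": "AUFNR",      "start_time": "ERDAT"},
    "navision": {"case_token": "No_",        "start_time": "Starting_Date_Time"},
}

def to_canonical(source_system: str, raw_row: dict) -> dict:
    """Rename source-specific columns to the canonical Bronze schema."""
    mapping = COLUMN_MAPS[source_system]
    return {canonical: raw_row[source_col]
            for canonical, source_col in mapping.items()}

row = to_canonical("sap", {"AUFNR": "4711", "ERDAT": "2026-01-15"})
```

Everything downstream of the map sees only canonical names, which is why Silver and Gold never branch on the ERP.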

Schema-per-Tenant Isolation

dbt_project.yml (model routing)

vars:
  tenant_id: "hospital_alpha"

on-run-start:
  - "CREATE SCHEMA IF NOT EXISTS {{ var('tenant_id') }}"
  - "CREATE SCHEMA IF NOT EXISTS platform"
  - "CREATE SCHEMA IF NOT EXISTS reference"

models:
  numetrix:
    bronze:
      +schema: "{{ var('tenant_id') }}"
    silver:
      +schema: "{{ var('tenant_id') }}"
    gold:
      +schema: "{{ var('tenant_id') }}"
    platform:
      +schema: "platform"
generate_schema_name override routes to {tenant_id}.*. Each tenant is a DuckDB schema — fully resettable, independently rebuildable.

Five Tenants

Tenant         | Source   | Cases | Billing | Profile
hospital_alpha | OPALE    | 2K    | ~7K     | Swiss mid-size, balanced
hospital_beta  | SAP      | 1.5K  | ~6K     | German, cardiac-heavy
hospital_gamma | Navision | 1.8K  | ~6K     | French-Swiss, general surgery
hospital_delta | OPALE    | 150K  | ~1.68M  | Large university, advanced defects
hospital_zeta  | OPALE    | 68K   | ~2.4M   | Real data — SZO (Brig/Visp)
4 synthetic tenants (deterministic, seeded PRNG) + 1 real hospital. Same pipeline, same probes, different source systems.
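Determinism is the property the synthetic tenants depend on: the same seed always produces the same rows. A minimal sketch of that guarantee with Python's stdlib PRNG (per-tenant seeding and the sample_case_costs name are assumptions, not the generator's actual code):

```python
import random

def sample_case_costs(tenant_seed: int, n: int) -> list[int]:
    """Draw n case costs from an independent, reproducible stream."""
    rng = random.Random(tenant_seed)  # one isolated PRNG per tenant
    return [rng.randint(100, 10_000) for _ in range(n)]

run1 = sample_case_costs(42, 5)
run2 = sample_case_costs(42, 5)  # same seed -> identical output
```

Because rebuilds are bit-identical, a probe finding can be reproduced on any machine from the seed alone.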

Platform: UNION ALL with Jinja

models/platform/platform_gold_cases.sql

{{ config(materialized='view') }}
-- depends_on: {{ ref('gold_cases') }}
{% set tenants = var('tenants', [var('tenant_id', 'hospital_alpha')]) %}
{% for t in tenants %}
select '{{ t }}' as tenant_id, * from {{ t }}.gold_cases
{% if not loop.last %} union all {% endif %}
{% endfor %}
  • 29 platform views unioning Gold across all tenants. Evidence and cross-hospital dashboards query this layer.
  • The depends_on comment ensures dbt builds Gold before Platform even though there’s no direct ref().
  • Tenant list passed via --vars: '{"tenants": ["alpha", "beta", ...]}'
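What the Jinja loop renders to can be sketched as a plain string builder (render_platform_union is a hypothetical helper; the tenant names are the deck's example values):

```python
def render_platform_union(tenants: list[str], model: str = "gold_cases") -> str:
    """Build the UNION ALL the Jinja template would render."""
    selects = [f"select '{t}' as tenant_id, * from {t}.{model}" for t in tenants]
    return "\nunion all\n".join(selects)

sql = render_platform_union(["hospital_alpha", "hospital_beta"])
```

One select per tenant schema, stitched with union all, with the tenant_id literal injected so cross-tenant queries can always group by hospital.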

Reference Data as dbt Seeds

enum_case_type
inpatient, outpatient, day-case
enum_billing_status
POSTED, PENDING, REVERSED
enum_movement_type
Stock transfer types
enum_material_group
Material categories
enum_procedure_type
Clinical procedure types
enum_cost_center_type
OU classification
drg_lookup
DRG code reference
mandatory_implants
Procedure → implant rules
All seeds route to reference.* schema. Silver validates enums against these tables. Probes use mandatory_implants for compliance checks.

The Probe Compiler

YAML in, dbt SQL out, findings everywhere

From YAML to Findings

probes/*.yaml → probecompile.py → probe_findings__*.sql → dbt build
probes/*.yaml → proberegistry.py → probe_registry.sql
  • probecompile.py — Validates YAML against contracts, resolves entity names, generates SQL.
  • proberegistry.py — Generates tri-lingual metadata (EN/DE/FR) for the Explorer UI.
  • probecheck.py — Contract validation only (CI gate). No SQL output.
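A toy version of what the compile step might do for a balance probe: take the parsed YAML (shown here as a dict) and emit the two aggregate CTEs. This is a sketch under assumptions — the real probecompile.py also handles tolerance, time buckets, contract resolution, and evidence columns.

```python
# Parsed YAML for a balance probe (abridged, dict form for self-containment).
PROBE = {
    "probe_id": "probe_revenue_leakage",
    "type": "balance",
    "scope": {"group_by": ["case_token"]},
    "left":  {"model": "gold_case_material_usage",
              "expression": "quantity * unit_price"},
    "right": {"model": "gold_billing_events",
              "expression": "billed_amount",
              "where": "billing_status = 'POSTED'"},
}

def compile_balance(probe: dict) -> str:
    """Emit the left/right aggregate CTEs for a balance probe."""
    key = ", ".join(probe["scope"]["group_by"])
    left, right = probe["left"], probe["right"]
    where = f"where {right['where']} " if right.get("where") else ""
    return (
        f"with left_agg as (select {key}, sum({left['expression']}) as left_value "
        f"from {left['model']} group by {key}), "
        f"right_agg as (select {key}, sum({right['expression']}) as right_value "
        f"from {right['model']} {where}group by {key}) "
        f"select * from left_agg full join right_agg using ({key})"
    )

sql = compile_balance(PROBE)
```

Each probe type is one such template; the YAML supplies only the business-specific expressions.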

Anatomy of a Probe

probes/probe_revenue_leakage.yaml

probe_id: probe_revenue_leakage
version: "1.0.0"
contract: "gold.v1"
type: balance
severity: high
scope:
  entity_type: Case
  group_by: [case_token]
time:
  entity: CaseMaterialUsage
  field: usage_timestamp
  bucket: week
left:
  entity: CaseMaterialUsage
  expression: "quantity * unit_price"
  aggregate: sum
right:
  entity: BillingEvent
  expression: "billed_amount"
  aggregate: sum
  where:
    billing_status: POSTED
tolerance_pct: 2.0
money_at_risk: "left_value - right_value"

What the Compiler Produces

models/probes/probe_findings__probe_revenue_leakage.sql (abridged)

-- Probe: probe_revenue_leakage (balance)
{{ config(materialized='table', tags=['probe']) }}

with left_agg as (
    select case_token, sum(quantity * unit_price) as left_value
    from {{ ref('gold_case_material_usage') }}
    group by case_token
),
right_agg as (
    select case_token, sum(billed_amount) as right_value
    from {{ ref('gold_billing_events') }}
    where billing_status = 'POSTED'
    group by case_token
),
compared as (
    -- tolerance filter, pct_diff calc
)
select
    md5(...) as finding_id,  -- deterministic
    '{{ var("tenant_id") }}' as tenant_id,
    'probe_revenue_leakage' as probe_id,
    severity,
    entity_type,
    entity_id,
    time_bucket,
    money_at_risk,
    evidence  -- JSON blob
from compared
Every probe emits the same findings contract: finding_id, tenant_id, probe_id, severity, entity_type, entity_id, time_bucket, money_at_risk, evidence.
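The deterministic finding_id can be sketched as a hash over the identifying columns, so re-running a probe on unchanged data yields identical ids (which exact columns the real compiler hashes is an assumption here):

```python
import hashlib

def finding_id(tenant_id: str, probe_id: str,
               entity_id: str, time_bucket: str) -> str:
    """Stable id: same inputs always hash to the same finding_id."""
    raw = "|".join([tenant_id, probe_id, entity_id, time_bucket])
    return hashlib.md5(raw.encode()).hexdigest()

a = finding_id("hospital_alpha", "probe_revenue_leakage", "A-1", "2026-W01")
b = finding_id("hospital_alpha", "probe_revenue_leakage", "A-1", "2026-W01")
```

Stable ids let dashboards and diagnoses track a finding across rebuilds instead of treating every run as new.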

12 Probe Types

Core
  • balance — Compare two aggregates (usage vs billing)
  • duplicate — Find duplicate records by field combo
  • ratio — Numerator/denominator against expected
  • mandatory_item — Required items per entity
  • distribution_outlier — Z-score anomalies
  • temporal_sequence — Event ordering validation
Extended
  • trend — Rolling metric regression
  • silver_audit — Silver-layer DQ checks
  • assessment — Aggregate across probes
  • invoice_integrity — Line-item reconciliation
  • reimbursement_gap — Tariff vs actual
  • cross_reference — Cross-entity validation
29 YAML files total: 24 probes + 5 assessments. Each type has a compiler template. Adding a new type = adding a Python template.

Assessments, Hypotheses, Diagnoses

Diagnoses
“Why is it happening?” — 8 root-cause rules
Hypotheses
“Is this a problem?” — 9 business questions
Assessments
Aggregate findings per entity — 5 health scores
Probes
Row-level findings — 24 diagnostic queries
Gold Contract
20 validated entities across all tenants
Each layer has its own YAML DSL, compiler, contract, and dbt model output. All compile to the same findings schema. dbt resolves the dependency graph automatically.

Operations & Quality

Rebuild, orchestrate, calibrate, test

Dagster Asset Graph

tenant_dbt_build → platform_gold_build → evidence_site

orchestration/src/numetrix_dagster/assets/dbt.py (abridged)

@asset(group_name="dbt")
def tenant_dbt_build(context):
    """Build all dbt models for every tenant sequentially."""
    for tenant_id, cfg in TENANT_CONFIG.items():
        dbt_vars = json.dumps({"tenant_id": tenant_id})
        result = subprocess.run(
            [DBT_EXECUTABLE, "build", "--vars", dbt_vars],
            cwd=DBT_PROJECT_DIR,
            capture_output=True,
        )
        if result.returncode != 0:
            raise RuntimeError(f"dbt build failed for {tenant_id}")
Sequential tenant builds avoid DuckDB file-lock contention. Each tenant is a fresh subprocess.run(). Tenant config loaded from tenants/*/tenant.yml.

The Rebuild Script

scripts/rebuild.sh

#!/bin/bash
# rebuild.sh — Regenerate data, build all tenants, rebuild platform.
#
# Usage:
#   ./scripts/rebuild.sh                 # all tenants
#   ./scripts/rebuild.sh alpha gamma     # specific tenants only
#   ./scripts/rebuild.sh --skip-gen      # skip CSV generation

set -eu

# Per tenant: generate CSVs → dbt build (3 phases) → next tenant
# Then: platform layer → audit reports → calibration
  • 5 Tenants
  • 3 Build phases per tenant
  • ~2 min Full rebuild
  • 0 External services
Three phases per tenant: (1) core models (bronze → silver → gold → metrics), (2) probes + hypotheses + diagnoses, (3) lineage. Then platform. All deterministic.

Synthetic Data with Known Defects

Standard defects (all tenants)
Revenue leakage, missing implants, cost center mismatches, stale pricing.

Every tenant has baseline defects at controlled rates.
Advanced defects (hospital_delta)
5K orphan billing rows, 2% duplicates, 1% phantom usage, 4% cross-CC billing, 8% stale pricing, 3% timing anomalies.

Probe calibration targets.
generate_synthetic_data.py — seeded PRNG per tenant. Same seed = same output. Known defect rates let us verify probe recall and precision.
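This is what known defect rates buy: with the injected defect ids as ground truth, probe findings can be scored for precision and recall. A minimal sketch (the precision_recall helper and the example sets are illustrative, not the project's calibration code):

```python
def precision_recall(flagged: set, injected: set) -> tuple[float, float]:
    """Score probe findings against the seeded ground truth."""
    true_pos = len(flagged & injected)
    precision = true_pos / len(flagged) if flagged else 1.0
    recall = true_pos / len(injected) if injected else 1.0
    return precision, recall

injected = {"c1", "c2", "c3", "c4"}   # defects seeded by the generator
flagged = {"c1", "c2", "c3", "c9"}    # findings emitted by a probe
p, r = precision_recall(flagged, injected)
```

Here the probe catches 3 of 4 seeded defects and raises 1 false alarm, so both precision and recall are 0.75 — exactly the kind of calibration target hospital_delta exists for.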

Testing: is_valid vs dbt test

is_valid (Silver layer)
Row-level flags. Every row assessed individually.
Invalid rows survive into Silver for auditability.

is_valid = false + invalid_reason = 'missing case_token'
dbt test (CI gate)
Schema-level assertions. Build fails if violated.
14 dbt tests: unique keys, not-null on Gold, accepted values.

Guards the contract, not the data.
Two complementary systems: is_valid handles data quality (expected to fail sometimes). dbt tests guard pipeline correctness (should never fail).

Design Decisions & Road Ahead

Why these tools, why this shape

Why DuckDB, Why YAML Probes

Why DuckDB
In-process. No server, no auth, no network. One file: dev.duckdb.

CSV-native. read_csv() directly in SQL. No staging, no COPY.

Fast enough. 150K cases + 1.68M billing in seconds. Full rebuild < 2 min.

Portable. Copy the .duckdb file to any laptop. Query with any tool.
Why YAML Probes
Declarative. Business logic in YAML, plumbing in the compiler.

Validated. Contract checks before any SQL is generated.

Uniform output. Every probe emits the same findings schema. No bespoke SQL.

Tri-lingual. Registry generates EN/DE/FR metadata from YAML + glossary.

What’s Built, What’s Next

Built & Working
  • 122 dbt models across 4 layers
  • 3 source systems (OPALE, SAP, Navision)
  • 29 probe YAMLs (24 probes + 5 assessments) + 9 hypotheses + 8 diagnoses
  • Schema-per-tenant isolation
  • Deterministic synthetic data generation
  • Explorer SvelteKit app (DE/FR locales)
  • Evidence.dev static BI site
  • Dagster orchestration
Road Ahead
  • Additional source systems (HL7 FHIR, SAP S/4)
  • Incremental builds (append-only Bronze)
  • Probe precision/recall reporting
  • Multi-period trend analysis
  • Cloud deployment (object storage + DuckDB)

One Pipeline,
Every Hospital

122 models. 5 tenants. 3 source systems. 29 probes.
One dbt build. One DuckDB file. Zero infrastructure.

nuMetrix