The Knowledge-Driven Workflow

From conversation to computation in four steps.
How domain expertise enters the analytical engine.

Data tells you what is happening.
People tell you why.

Probes detect anomalies. Hypotheses frame questions. Diagnoses propose root causes.
But the insight that “if a material has purchases, the flag is wrong, not the date”
— that comes from a person, not a query.

Four steps, one cycle

1. Capture: SMEbit YAML
2. Analyze: consequences
3. Implement: instruments
4. Rebuild & review

Each cycle adds knowledge to the system. The SMEbit is the handshake format
between human expertise and the analytical engine.

Capture

Domain knowledge enters the system as an SMEbit —
attributed, scoped, and tri-lingual.

Who tells us why?

👩‍⚗️ Material Managers: business rules, catalog decisions, classification logic
🩺 Clinical Staff: process workarounds, tracking gaps, manual steps
💻 IT Leads: system behaviors, interface quirks, midnight splits
💰 Billing Specialists: pricing logic, reimbursement rules, invoice patterns

Structured knowledge, not free text

smebit_id: smebit_zeta_validity_activity_classification
provider:
  name: Ronnie
  role: Material Management
date: 2026-03-03
category: business_rule
content:
  en: |
    When is_active contradicts validity dates, purchasing
    activity tells us which field is wrong.
why:
  en: |
    The flag and dates are maintained by different processes.
    Activity is the ground truth for which reflects reality.
anchors:
  - probe_id: probe_active_date_expired
  - probe_id: probe_inactive_date_valid
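A structure like this can be checked mechanically before it enters the engine. Below is a minimal sketch of the kind of contract a tool like smebitcheck might enforce; the required-field list is inferred from the example above, not the actual contract:

```python
# Fields inferred from the example SMEbit; the real contract may differ.
REQUIRED_FIELDS = {"smebit_id", "provider", "date", "category", "content", "anchors"}

def check_smebit(smebit: dict) -> list[str]:
    """Return contract violations; an empty list means the SMEbit passes."""
    errors = [f"missing field: {f}" for f in sorted(REQUIRED_FIELDS - smebit.keys())]
    # Attribution: every SMEbit names a person and a role.
    if not {"name", "role"} <= set(smebit.get("provider", {})):
        errors.append("provider needs name and role")
    # Scoping: every anchor points at a concrete probe.
    if not all("probe_id" in a for a in smebit.get("anchors", [])):
        errors.append("every anchor needs a probe_id")
    return errors
```

Checks like this are plain data validation, in keeping with the principle that no LLM sits in the data path.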

Consequence Analysis

The critical step. Once knowledge is captured,
we ask: what does this change?

Five questions for every SMEbit

Question                                   | Outcome
Does this sharpen an existing probe?       | Add enrichment joins, new evidence fields
Does this warrant a new probe?             | Write new YAML or hand-written SQL
Does this confirm or refine a hypothesis?  | Update evidence chain, adjust thresholds
Does this explain a confirmed hypothesis?  | Create or refine a diagnosis
Does this belong with other SMEbits?       | Create or extend a BitBundle

Ronnie’s activity classification rule

“When is_active contradicts the validity dates,
check if the material has purchases.”

Probe                  | Has Purchases | Likely Wrong Field
Active + date expired  | Yes           | Date (article genuinely in use)
Active + date expired  | No            | Flag (article genuinely expired)
Inactive + date valid  | Yes           | Flag (article genuinely in use)
Inactive + date valid  | No            | Date (article genuinely inactive)
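Ronnie's rule is simple enough to state as a decision function. A minimal Python sketch of the classification logic; the function and argument names are illustrative, not the engine's actual SQL:

```python
from typing import Optional

def likely_wrong_field(is_active: bool, date_valid: bool,
                       has_purchases: bool) -> Optional[str]:
    """Ronnie's rule: when the flag and the validity dates contradict,
    purchasing activity decides which field is wrong."""
    if is_active == date_valid:
        return None  # flag and dates agree; nothing to classify
    if has_purchases:
        # Article is genuinely in use: the field claiming otherwise is wrong.
        return "date" if not date_valid else "flag"
    # No purchases: the article is genuinely out of use,
    # so the field claiming it is in use is wrong.
    return "flag" if is_active else "date"
```

The same four-row truth table above falls out of two nested conditions, which is what makes the rule cheap to compile into SQL CASE logic.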

One insight, three changes

Before

  • entity_filter — Material-only, no fact joins
  • Severity: all medium
  • Evidence: material fields only
  • No indication of which field is wrong

After

  • hand_written — joins billing + usage
  • Severity: high when real spend > CHF 1K
  • Evidence: + has_purchases, likely_wrong_field
  • Sub-classification per Ronnie’s rule

Implement

Edit or create instruments. Validate. Compile.
The engine turns YAML into SQL.

From YAML to findings

1. SMEbit YAML + Probe YAML/SQL: human-authored artifacts
2. smebitcheck + probecheck: validate against contracts
3. proberegistry + probecompile + smebitcompile: generate dbt SQL
4. rebuild.sh --skip-gen: 5 tenants, 175 tests each, ~3 minutes

Rebuild & Review

Run the numbers. Does reality match
the expert’s expectation?

Enrichment, not filtering

1,368: active + date expired findings (unchanged)
1,413: inactive + date valid findings (unchanged)
81: escalated to high (real spend > CHF 1K)

Same population, richer evidence. Every finding now says
which field is wrong and how much money is involved.

hospital_zeta: activity tells the story

Probe              | Likely Wrong | Has Purchases | Findings | CHF at Risk
Active + expired   | date         | yes           | 610      | 18,984
Active + expired   | flag         | no            | 758      | 92,706
Inactive + valid   | date         | no            | 1,197    | 109,345
Inactive + valid   | flag         | yes           | 216      | 14,860

826 materials with purchases — activity confirms they’re real.
1,955 without — catalog entries that could safely be cleaned up.
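The 826/1,955 split above falls straight out of the table. A quick Python check, with the rows transcribed from the table:

```python
# Rows transcribed from the hospital_zeta table:
# (probe, likely_wrong, has_purchases, findings, chf_at_risk)
rows = [
    ("Active + expired", "date", True,   610,  18_984),
    ("Active + expired", "flag", False,  758,  92_706),
    ("Inactive + valid", "date", False, 1197, 109_345),
    ("Inactive + valid", "flag", True,   216,  14_860),
]

with_purchases = sum(f for _, _, has, f, _ in rows if has)        # activity confirms real
without_purchases = sum(f for _, _, has, f, _ in rows if not has)  # cleanup candidates
print(with_purchases, without_purchases)  # 826 1955
```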

Knowledge Accumulates

Individual SMEbits are atoms.
BitBundles are the stories they tell together.

From atoms to narrative

  • SMEbit: Sentinel date convention (9999-12-31)
  • SMEbit: Activity-based validity classification
  • SMEbit: Material 638876A wrong description
→ BitBundle: Catalog Data Quality at SZO
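A BitBundle can reuse the same YAML conventions as the SMEbits it collects. A hypothetical sketch; the schema fields and two of the three SMEbit ids are assumptions, with only the bundle title and the validity-classification id taken from the text:

```yaml
bitbundle_id: bitbundle_szo_catalog_quality   # hypothetical id
title: Catalog Data Quality at SZO
narrative:
  en: The material master data at SZO has systemic quality issues.
smebits:
  - smebit_zeta_sentinel_date_convention          # assumed id
  - smebit_zeta_validity_activity_classification  # from the example above
  - smebit_zeta_material_638876a_description      # assumed id
```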

Three separate conversations, three separate experts, one coherent story:
“The material master data at SZO has systemic quality issues.”

Why this workflow works

Principle                      | How It Manifests
P10 Human-in-the-loop          | The SME is the source, not the system
P11 AI as SME                  | AI formalizes, connects, compiles; it doesn't invent knowledge
P12 AI as colleague            | Consequence analysis is a conversation, not an instruction
P13 No AI in data path         | SMEbit content is human-authored; checks are SQL, not LLM
P16 Pragmatic generalization   | Wait for the second probe before abstracting a new DSL type

The difference between data and knowledge
is the why.

Capture → Analyze → Implement → Review

Every cycle makes the engine smarter —
not by adding AI, but by adding human insight.