The Knowledge-Driven Workflow
From conversation to computation in four steps.
How domain expertise enters the analytical engine.
People tell you why.
Probes detect anomalies. Hypotheses frame questions. Diagnoses propose root causes.
But the insight that “if a material has purchases, the flag is wrong, not the date”
— that comes from a person, not a query.
Four steps, one cycle
1. Capture: SMEbit YAML
2. Analyze: consequences
3. Implement: instruments
4. Rebuild & review
Each cycle adds knowledge to the system. The SMEbit is the handshake format
between human expertise and the analytical engine.
Capture
Domain knowledge enters the system as an SMEbit —
attributed, scoped, and tri-lingual.
Who tells us why?
Structured knowledge, not free text
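A captured SMEbit might look like the small YAML record below. This is an illustrative sketch only: the field names (`id`, `author`, `scope`, `insight`, and the three-form reading of "tri-lingual") are assumptions, not the project's actual schema.

```yaml
# Hypothetical SMEbit record -- field names are illustrative, not the real schema.
id: smebit-0042
author: Ronnie              # attributed: who told us why
scope: material_master      # scoped: which entity/domain the rule applies to
insight: >
  When is_active contradicts the validity dates,
  check if the material has purchases.
# One possible reading of "tri-lingual": the same knowledge in three forms.
forms:
  prose: plain-language statement of the rule
  logic: SQL or pseudocode fragment encoding the rule
  config: machine-readable instrument settings
```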
Consequence Analysis
The critical step. Once knowledge is captured,
we ask: what does this change?
Five questions for every SMEbit
| Question | Outcome |
|---|---|
| Does this sharpen an existing probe? | Add enrichment joins, new evidence fields |
| Does this warrant a new probe? | Write new YAML or hand-written SQL |
| Does this confirm or refine a hypothesis? | Update evidence chain, adjust thresholds |
| Does this explain a confirmed hypothesis? | Create or refine a diagnosis |
| Does this belong with other SMEbits? | Create or extend a BitBundle |
Ronnie’s activity classification rule
“When is_active contradicts the validity dates,
check if the material has purchases.”
| Probe | Has Purchases | Likely Wrong Field |
|---|---|---|
| Active + date expired | Yes | Date — article genuinely in use |
| Active + date expired | No | Flag — article genuinely expired |
| Inactive + date valid | Yes | Flag — article genuinely in use |
| Inactive + date valid | No | Date — article genuinely inactive |
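Ronnie's decision table can be sketched as a small classifier. The name `likely_wrong_field` mirrors the evidence field discussed below; the function itself is an illustrative sketch, not the engine's actual implementation (which is compiled SQL, not Python):

```python
def likely_wrong_field(is_active: bool, date_valid: bool, has_purchases: bool):
    """Apply Ronnie's rule: when the activity flag and the validity dates
    contradict each other, purchase activity tells us which field to distrust."""
    if is_active == date_valid:
        return None  # no contradiction, nothing to classify
    if has_purchases:
        # Purchases say the article is genuinely in use:
        #   active + expired date  -> the date is wrong
        #   inactive + valid date  -> the flag is wrong
        return "date" if is_active else "flag"
    # No purchases say the article is genuinely dormant:
    #   active + expired date  -> the flag is wrong
    #   inactive + valid date  -> the date is wrong
    return "flag" if is_active else "date"
```

Each branch corresponds to one row of the table above.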
One insight, three changes
Before
- `entity_filter` — material-only, no fact joins
- Severity: all medium
- Evidence: material fields only
- No indication of which field is wrong
After
- `hand_written` — joins billing + usage
- Severity: high when real spend > CHF 1K
- Evidence: adds `has_purchases`, `likely_wrong_field`
- Sub-classification per Ronnie’s rule
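The "after" probe could look roughly like the hand-written SQL below. This is a sketch under assumed names: the tables (`material_master`, `billing`), columns (`net_amount`, `valid_to`), and the CHF 1K threshold expressed as `> 1000` are all illustrative, not the real schema.

```sql
-- Illustrative sketch: table and column names are assumed, not the real schema.
with spend as (
    select material_id, sum(net_amount) as real_spend
    from billing
    group by material_id
)
select
    m.material_id,
    m.is_active,
    m.valid_to,
    coalesce(s.real_spend, 0) > 0 as has_purchases,
    -- Ronnie's rule: purchase activity decides which field to distrust
    case
        when coalesce(s.real_spend, 0) > 0 and m.is_active     then 'date'
        when coalesce(s.real_spend, 0) > 0 and not m.is_active then 'flag'
        when m.is_active                                       then 'flag'
        else                                                        'date'
    end as likely_wrong_field,
    case when coalesce(s.real_spend, 0) > 1000
         then 'high' else 'medium' end as severity
from material_master m
left join spend s using (material_id)
where m.is_active != (m.valid_to >= current_date)  -- flag contradicts the dates
```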
Implement
Edit or create instruments. Validate. Compile.
The engine turns YAML into SQL.
From YAML to findings
human-authored artifacts → validate against contracts → generate dbt SQL
5 tenants, 175 tests each, ~3 minutes
Rebuild & Review
Run the numbers. Does reality match
the expert’s expectation?
Enrichment, not filtering
Finding count unchanged before and after; severity escalates to high where real spend > CHF 1K.
Same population, richer evidence. Every finding now says
which field is wrong and how much money is involved.
hospital_zeta: activity tells the story
| Probe | Likely Wrong | Has Purchases | Findings | CHF at Risk |
|---|---|---|---|---|
| Active + expired | date | yes | 610 | 18,984 |
| Active + expired | flag | no | 758 | 92,706 |
| Inactive + valid | date | no | 1,197 | 109,345 |
| Inactive + valid | flag | yes | 216 | 14,860 |
826 materials with purchases — activity confirms they’re real.
1,955 without — catalog entries that could safely be cleaned up.
Knowledge Accumulates
Individual SMEbits are atoms.
BitBundles are the stories they tell together.
From atoms to narrative
Three separate conversations, three separate experts, one coherent story:
“The material master data at SZO has systemic quality issues.”
Why this workflow works
| Principle | How It Manifests |
|---|---|
| P10 Human-in-the-loop | The SME is the source, not the system |
| P11 AI as SME | AI formalizes, connects, compiles — doesn’t invent knowledge |
| P12 AI as colleague | Consequence analysis is a conversation, not an instruction |
| P13 No AI in data path | SMEbit content is human-authored; checks are SQL, not LLM |
| P16 Pragmatic generalization | Wait for the second probe before abstracting a new DSL type |
The difference between data and knowledge
is the why.
Capture → Analyze → Implement → Review
Every cycle makes the engine smarter —
not by adding AI, but by adding human insight.