Terminology — The Canonical Glossary

This is the single authoritative reference for all terms used in jinflow and its domain packs. If a term is not defined here, it is not part of the shared vocabulary.

This file is generated. Edit terminology/terminology.yaml, then run python3 scripts/terminologycompile.py to regenerate.

Foundations

The Diagnostic Metaphor

The guiding metaphor is medical:

The organization is the patient
The lakehouse is the diagnostic environment
Analytics are signals
Interventions are treatments
Measurement closes the loop

The goal is shared understanding across domain experts, operations, data, IT, and management.

DDD Building Blocks

The system uses domain-driven design primitives:

Building block	Definition	Examples
Entity	A domain object with identity over time	Case, Material, CostCenter
Value Object	An immutable, identity-less concept	TimeWindow, Threshold, ConfidenceScore
Aggregate	A consistency boundary enforcing invariants	ProbeDefinition, TreatmentPlan, ImpactStudy
Domain Event	Something that happened and matters to the domain	ProbeExecuted, FindingDetected, TreatmentApplied
Projection	A read model built for a specific purpose (Gold views)	Capacity views, quality KPIs
Domain Service	Stateless operation on domain objects	Signal execution, perspective scoring

Governing Principle

Diagnostics and interventions must share the same fact foundation. Without this, treatments cannot be evaluated credibly and learning breaks down.

A. Product Terms

These are the jinflow vocabulary — what the system does and what it produces.

Analytics Pyramid

The five-layer analytical stack that transforms raw detection into actionable insight. Each layer builds on the one below. Signals produce findings; perspectives aggregate findings; theses weigh evidence; verdicts explain confirmed theses.

Is: the conceptual stack that organizes all analytical output. Is not: a physical pipeline — each layer is compiled independently from declarations to SQL.

Signal

A deliberately designed diagnostic query that examines the tenant’s validated data (Gold layer) to reveal a specific pattern. Signals are question-driven, limited in scope, repeatable, and comparable over time. Each signal produces standardized findings. Signals serve as evidence for theses.

Is: a repeatable, question-driven analytical intervention. Is not: an ad-hoc query or a report. Signals are defined declaratively, compiled to SQL, and produce contract-compliant output.

Finding

A single flagged anomaly produced by a signal. Every finding carries a severity (high / medium / low), the affected entity, the time period, estimated money at risk, and machine-readable evidence. Findings are the atomic unit of diagnostic output.

Is: a specific, attributable detection event. Is not: a “result” or “alert.” Use “finding” consistently — never “signal result”.

Perspective

A structured interpretation that aggregates findings from multiple signals into an entity-level health score. An perspective answers a broader question — e.g., “how healthy is this material’s billing lifecycle?” — by combining evidence from several narrower signals. Guidance, not truth.

Is: a structured, multi-signal aggregation with severity rules. Is not: a subjective interpretation. Perspectives are deterministic given their source signals.

Thesis

A natural-language business question that can be systematically evaluated using signals as evidence — e.g., “The organization is losing revenue because consumed items are not being invoiced.” Defined declaratively, compiled to dbt SQL. Each thesis links to signals via an evidence chain with weighted roles (primary, supporting, context, counter). The same thesis is evaluated per-tenant — it may be “confirmed” for one tenant and “not observed” for another.

Is: a testable business claim with evidence scoring. Is not: a guess or assumption. Every thesis is grounded in signal findings.

Verdict

A rule-based root cause explanation for a confirmed thesis. Verdicts answer “why is this happening?” by identifying the root cause category (process failure, system failure, data quality, behavioral, structural, or external) and providing actionable recommendations. Each verdict carries a confidence score (base + evidence boosts) and tri-lingual explanations.

Is: a causal explanation with confidence scoring. Is not: a treatment or action — verdicts explain, they don’t prescribe next steps (that’s the Treatment layer).

SMEbit

An atomic, attributed piece of subject matter expert knowledge — a single observation, insight, or known exception contributed by a named domain expert. SMEbits can optionally carry executable checks (SQL that validates the claim against data) and prescriptions (modeling directives for the pipeline). They can also serve as weighted evidence in theses, letting human knowledge and machine detection meet in the same scoring mechanism. Categories include data quality, mapping, business rule, process, system, seasonal, historical, and structural.

Is: structured, first-class expert knowledge with optional data validation. Is not: a comment, annotation, or wiki page. SMEbits are versioned artifacts with identity, lifecycle, and attribution.

BitBundle

A human-curated narrative grouping of related SMEbits. Where an SMEbit is atomic knowledge, a BitBundle is a story — it connects multiple insights into a coherent whole (e.g., “Implant tracking from OR to invoice”). Maintained by a named curator. Not executable — the narrative organizes, it does not evaluate. The display label is configurable per tenant (e.g., “Use Case”, “Dossier”).

Is: a narrative wrapper that connects related SMEbits. Is not: an perspective or automated grouping. Bundles are hand-curated by a named curator.

Verdict

The per-tenant evaluation of a thesis. Computed from signal findings using a weighted evidence score: confirmed (strong evidence supports it), plausible (some evidence, not conclusive), not observed (signals ran cleanly, no findings — good news), or insufficient (not enough data to evaluate). A primary-role signal must have findings for a thesis to reach “confirmed.”

Is: a deterministic evaluation result. Is not: a finding (findings come from signals; verdicts come from theses, verdicts, and SMEbit checks).

Evidence

In the thesis system: a weighted reference to a signal or SMEbit that contributes to a thesis verdict. Each evidence link has a role (primary, supporting, context, counter) and a numeric weight. In findings: machine-readable data attached to each finding that explains why it was flagged — typically the compared values, the deviation, and the threshold. Evidence makes findings auditable and challengeable.

Is: a scored link between a thesis and its supporting/contradicting data. Is not: the raw data itself. Evidence is the connection, not the finding.

Evidence Score

A numeric value (0.0–1.0) that summarizes the total weighted evidence for a thesis. Compared against thresholds to determine verdict status (confirmed > plausible > not_observed).

Health Score

An entity-level numeric score (0–100) produced by perspectives. Aggregates severity and count of findings across multiple signals for a single entity.

Severity

The urgency level of a finding: high (large financial exposure or compliance risk), medium (material but not critical), or low (minor deviation, worth monitoring). Severity thresholds are defined per signal.

Money at Risk

The estimated financial exposure of a finding — what the organization could recover or is losing. Calculated per finding (e.g., usage value minus billed amount) and aggregated across signals. This is an estimate, not an invoice.

Qty at Risk

The estimated quantity exposure of a finding — how many units are affected. Complements money at risk for signals where volume matters (e.g., unbilled items, missing inventory). Aggregated alongside money at risk in theses and reports.

Risk Tier

A tiered financial risk classification assigned to each signal. Used to prioritize findings in reports and the Explorer. Tiers indicate the magnitude of potential financial impact, from operational to strategic.

Entity

A domain object with identity — the thing being examined. The seven core entities are: Case, Procedure, Material, Cost Center, Material Movement, Case Material Usage, and Billing Event. Signals and findings always reference an entity type and entity ID.

Interpretation

A human-readable explanation of a finding — what happened, why it matters, and what to look at next. Generated from signal templates and finding evidence. Designed to be understood by domain stakeholders without technical background.

Calibration

The process of measuring how accurately signals detect known defects. Synthetic tenants inject defects at known rates and record every affected entity in a defect manifest. After a pipeline rebuild, calibrate.py compares signal findings against the manifest to compute recall (what fraction of injected defects the signal found), precision (what fraction of findings are true positives), and F1 (harmonic mean). Only available for synthetic tenants.

Instrument

Generic term for any declarative analytical artifact: signal, perspective, thesis, verdict, SMEbit, or BitBundle. All instruments share a common lifecycle: declare → validate (check script) → compile (compiler) → build (dbt).

Is: the umbrella term for the six artifact types. Is not: used in user-facing UI. The Explorer shows signals, theses, etc. by their specific names.

Registry

A dbt model (SQL table) that stores metadata for all instruments of a given type. Generated by the compiler from instrument definitions. Contains display text (tri-lingual), categories, tags, and configuration — but no analytical logic. Examples: signal_registry, thesis_registry, smebit_registry, bitbundle_registry.

Contract

A JSON schema that defines the required fields, types, and constraints for a layer’s output. Contracts enforce interface stability between layers. Includes gold_contract.v1.json, silver_contract.v1.json, findings_contract.v1.json, diagnosis_contract.v1.json, smebit_contract.v1.json, and bitbundle_contract.v1.json.

Root Cause

The underlying reason why a problem exists. Structured into six categories: process_failure (broken workflow), system_failure (IT integration gap), data_quality (stale or inconsistent master data), behavioral (human workarounds), structural (organizational misalignment), external (supplier or regulatory factors).

Treatment

A deliberate action intended to change the organization’s state in response to findings — process changes, staffing adjustments, policy updates. The diagnostic metaphor: signals diagnose, treatments intervene. (Not yet implemented in the system, but part of the domain model.)

Presentation

A fullscreen, executive-ready view of a thesis verdict. Includes an Executive Brief narrative, trend charts, calendar overlay with severity markers, and evidence chain visualization. Designed for stakeholder meetings and board-level reporting.

Report

A structured document generated from pipeline data. Four types: data quality, financial risk, analytics readiness, and health. Available as in-app PDF with download capability, in multiple languages.

B. Architecture Terms

These describe how the system is built and how data flows through it.

Medallion Architecture

The data architecture pattern: Bronze (raw intake, no interpretation), Silver (validated facts, every row flagged valid or invalid), Gold (certified data, only valid rows, the product contract). Each layer has a clear responsibility. No silent filtering before Gold.

Bronze

The raw intake layer of the medallion architecture. Maps source-system columns to canonical names. Materialized as TABLE. Adds source_file and row_number lineage. No interpretation, no filtering — traceability over correctness.

Silver

The validated facts layer. Normalized, type-cast, with is_valid flag and invalid_reason on every row. Invalid rows are flagged, not dropped. Contains facts, not decisions. Trimming, null normalization, enum and FK validation.

Gold

The consumption layer. Source-system-agnostic views filtering Silver to is_valid = true. Opinionated, question-driven, contextual. This is the product contract — signals operate exclusively on Gold.

Source-System Dispatch

The macro-based mechanism that maps source-system-specific column names to the canonical pack schema. Dispatch happens only in Bronze — from Silver onward, all models are source-system-agnostic. Each pack defines its own dispatch macros for its supported source systems.

Compilation vs Build

Two distinct phases in the instrument lifecycle. Compilation (signalcompile.py, hypothesiscompile.py, etc.) transforms YAML definitions into SQL models. Building (dbt build) executes those SQL models against DuckDB for each tenant. You compile once; you build per tenant.

Tenant

An organization (or organizational unit) with its own isolated data schema. Tenants share the same pipeline, models, and signals, but their data never touches. The platform layer unions across tenants for cross-tenant comparison.

Domain Pack

A self-contained analytical framework for a specific industry. Each pack bundles signals, perspectives, theses, verdicts, SMEbits, dbt models, contracts, and source-system adapters. Packs are starter kits — jinflow init --pack copies one into a tenant instance.

jinflow

The three-layer software architecture. jinflow.core provides the analytical engine (medallion architecture, contract system, compilation framework, instrument lifecycle). jinflow.erp adds ERP-specific structure (source-system dispatch, entity model, CSV ingestion, taxonomy framework). Domain packs add vertical expertise (specific signals, theses, verdicts, domain knowledge, Explorer UI).

Rebuildable

A table or view that can be fully recreated from source files (CSVs) and YAML definitions at any time. All Bronze, Silver, Gold, signal, thesis, verdict, and SMEbit models are rebuildable. Contrast with accumulating tables (snapshots, finding history) that grow over multiple builds.

Ephemeral Database

The DuckDB file (dev.duckdb) is disposable. rm dev.duckdb && ./scripts/rebuild.sh recreates everything from CSVs. This is by design — the database is a derived artifact, not a primary store.

Pipeline Graph

An interactive visualization of the entire dbt DAG — every model, source, and dependency rendered as a navigable graph. Supports layer toggles, upstream/downstream tracing, source code inspection, search with match navigation, and export as standalone HTML. Available at System → Pipeline.

C. Domain Terms (Pack-Specific)

These terms are defined by domain packs and vary by industry. Each pack defines its own domain vocabulary — the examples below show common patterns across packs.

Supply Chain / Material Flow

The end-to-end movement of physical materials or goods through an organization: procurement, receipt, storage, issue, usage, billing. The pipeline tracks and reconciles every step. Domain packs define their own entity models for this flow.

Item / Material

A supply item, product, or consumable tracked in an inventory or catalog system. Items have prices, groups, and validity periods. Each item has a unique identifier and belongs to classification groups. The specific term varies by pack (material, article, product).

Revenue Leakage

Items consumed in an operational case but never billed. A common financial risk that jinflow detects. Measured as money at risk (quantity x unit price of unbilled items).

Is: a gap between usage and billing records. Is not: necessarily an error — some unbilled usage is intentional (floor stock, standard supply). Signals detect the gap; theses determine if it’s a real problem.

I/O Coefficient

Input/Output ratio — the relationship between input cost and output revenue for a cost center or item group. Used to identify areas with disproportionate spend.

Cost Center

An organizational unit that accumulates costs — typically a department or operational area. Signals group findings by cost center, and many defect patterns (cross-CC billing, CC mismatch) are detected at this level. DE: Kostenstelle, FR: centre de couts.

Taxonomy

A hierarchical classification tree that organizes dimension members. Examples: organizational unit structure, product classification, operational groupings. Used for drill-down and aggregation.

D. Term Topology

Signal → Fact → Finding → Verdict → Action

The data flow through the system, from raw intake to business decision:

Signal (Bronze)         Raw measurement or event from a source system.
   ↓                    Ingested as-is, no interpretation.
Fact (Silver)           Validated statement about reality. Trimmed,
   ↓                    typed, flagged with is_valid.
Finding (Signal)         Detected pattern instance. Severity, entity,
   ↓                    money at risk, evidence JSON.
Verdict (Thesis)    Evidence-weighted business judgment.
   ↓                    confirmed / plausible / not_observed.
Action (Treatment)      Deliberate intervention to change the organization's
                        state. Process change, policy, staffing.

The Analytical Pyramid — Layer Ownership

┌──────────────────────────────────────────────────────────────────┐
│                    Domain Packs                                  │
│  Domain-specific signals, theses, verdicts, SMEbits         │
│  Healthcare, winemaking, logistics, legal, etc.                 │
├──────────────────────────────────────────────────────────────────┤
│                    jinflow.erp                                   │
│  Source-system dispatch, entity model, taxonomy framework        │
│  Source-system column mappings (per pack)                        │
├──────────────────────────────────────────────────────────────────┤
│                    jinflow.core                                  │
│  Medallion architecture, contract system, instrument lifecycle   │
│  Bronze → Silver → Gold → Platform                              │
│  Signal → Perspective → Thesis → Verdict → Treatment        │
└──────────────────────────────────────────────────────────────────┘

Instrument Relationships

Signal ──produces──▶ Finding
  │                    │
  │               aggregated by
  │                    ▼
  └──referenced──▶ Perspective ──produces──▶ Health Score
                       │
                  evidence for
                       ▼
                  Thesis ──produces──▶ Verdict
                       │
                  explained by
                       ▼
                  Verdict ──produces──▶ Root Cause + Recommendation

SMEbit ──────────────────────────▶ Verdict (Level 1 checks)
  │                                   │
  │                              evidence for
  │                                   ▼
  └──anchored to──▶ Signal        Thesis
                    Thesis
                    Entity

BitBundle ──groups──▶ SMEbit (narrative wrapper, no computation)

Treatments (Future)

Term	Definition	DDD Mapping
Treatment	A deliberate action to change the organization’s state (process change, staffing adjustment, policy update)	Command
Treatment Plan	Structured plan: objectives, scope, mechanism, timeframe, owners, constraints	Aggregate
Treatment Execution Event	Recorded fact that a treatment was applied	Domain Event
Impact Measure	Predefined metric to evaluate treatment effects (defined before execution)	Value Object
Outcome	Observed change after treatment, with metrics and confidence	Value Object
Feedback Loop	System capability to learn from outcomes and adapt future signals and treatments	—

Evolution

This glossary is designed to evolve. Terms may deepen in meaning, but should not drift. The strength of the system lies in a shared language that enables verdict, action, and learning — without blame or illusion of certainty.

jinflow is a jazzisnow product

v0.45.1 · built 2026-04-17 08:14 UTC