
SMEbit YAML Reference

Field-level specification for SMEbit YAML definitions (smebits/smebit_*.yaml).

  • Location: smebits/smebit_*.yaml
  • smebit_id must match the filename stem (e.g. smebit_foo.yaml requires smebit_id: smebit_foo)
  • Validator: python3 scripts/smebitcheck.py
  • Compiler: python3 scripts/smebitcompile.py
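
The filename rule above can be checked mechanically. The following is a minimal sketch, not the logic of scripts/smebitcheck.py; the function name id_matches_filename is hypothetical and introduced only for illustration.

```python
from pathlib import Path


def id_matches_filename(path: str, smebit_id: str) -> bool:
    """Return True when smebit_id equals the stem of a smebit_*.yaml file.

    Hypothetical helper: the real validator may apply further rules.
    """
    p = Path(path)
    # Path.stem drops the final suffix: "smebit_foo.yaml" -> "smebit_foo".
    return (
        p.suffix == ".yaml"
        and p.stem.startswith("smebit_")
        and p.stem == smebit_id
    )
```

For example, id_matches_filename("smebits/smebit_foo.yaml", "smebit_foo") holds, while the same path with smebit_id smebit_bar does not.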

| Level | Name | Has check block? | Produces dbt model? | Description |
|---|---|---|---|---|
| 0 | Observation | No | Registry only | Standalone knowledge: no SQL, no verdict |
| 1 | Check | Yes | Verdict model | Testable assertion compiled to SQL; produces confirmed/violated/no_data |

Level is implicit: a definition that contains a check block is Level 1; one without a check block is Level 0.
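
As a sketch of the rule, level inference reduces to a key lookup on the parsed definition. The function name infer_level is hypothetical, and treating an explicit check: null as absent is an assumption that the reference does not confirm.

```python
def infer_level(smebit: dict) -> int:
    """Return 1 when a check block is present, otherwise 0.

    Assumption: a `check` key whose value is null counts as absent.
    """
    return 1 if smebit.get("check") is not None else 0
```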

| Field | Type | Required | Description |
|---|---|---|---|
| smebit_id | string | Yes | Must match filename stem |
| version | string | Yes | Semver (e.g. "1.0.0") |
| created_at | string | Warning | ISO date ("2026-03-17"). Validator warns if missing |
| modified_at | string | No | ISO date. Updated on change |
| provider | mapping | Yes | See Provider |
| scope | mapping | Yes | See Scope |
| category | string | Yes | See Valid Categories |
| tags | list of strings | No | Free-form tags for discoverability |
| subject | i18n mapping | Yes | One-line summary (en/de/fr required) |
| content | i18n mapping | Yes | Full explanation (en/de/fr required) |
| why | i18n mapping | Warning | Reason behind the observation (en/de/fr). Validator warns if missing but does not block |
| status | string | Yes | See Valid Statuses |
| superseded_by | string | Conditional | Required when status: superseded. Must reference an existing smebit_id |
| anchors | list | No | See Anchors |
| check | mapping | No | Presence makes this Level 1. See Check Block |
| prescription | mapping | No | See Prescription Block |

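All i18n fields are mappings with the keys en, de, and fr. All three keys are required for subject and content; the description fields of the check and prescription blocks require at least en.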
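
A minimal sketch of an i18n check follows. The helper missing_languages is hypothetical, and treating an empty or whitespace-only string as missing is an assumption rather than documented validator behavior.

```python
REQUIRED_LANGS = ("en", "de", "fr")


def missing_languages(value: object, required: tuple = REQUIRED_LANGS) -> list[str]:
    """Return the required language keys that are absent or empty."""
    if not isinstance(value, dict):
        # Not a mapping at all: every required language is missing.
        return list(required)
    return [
        lang for lang in required
        if not str(value.get(lang) or "").strip()
    ]
```

Passing required=("en",) covers fields that need only the English text.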

Provider

| Field | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Person or team name |
| role | string | Yes | Role or title |
| date | date | Yes | Date of contribution (YYYY-MM-DD) |

Scope

| Field | Type | Required | Description |
|---|---|---|---|
| tenant_id | string or null | No | Tenant this applies to. null = cross-tenant |
| source_system | string | No | Source system constraint |
| time_range | mapping | No | from and until (dates or null) |

Valid Categories

| Category | What it covers |
|---|---|
| data_quality | Incorrect or inconsistent values in source data |
| mapping | Wrong or outdated relationships between entities |
| business_rule | Known exceptions to standard patterns |
| process | How the organization actually works (workarounds, manual steps) |
| system | Known IT system limitations or behaviors |
| seasonal | Time-based patterns the SME knows about |
| historical | Past events that explain current data patterns |
| structural | Schema or modeling insights |

Valid Statuses

| Status | Description |
|---|---|
| active | Current, in effect |
| superseded | Replaced by another SMEbit (requires superseded_by) |
| contested | Under dispute or review |
| archived | No longer relevant |
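
The status rules, including the conditional superseded_by requirement, could be checked along these lines. This is an illustrative sketch only; status_errors and its known_ids parameter (the set of smebit_id values found under smebits/) are hypothetical names.

```python
VALID_STATUSES = {"active", "superseded", "contested", "archived"}


def status_errors(smebit: dict, known_ids: set[str]) -> list[str]:
    """Collect status-related problems for one parsed SMEbit definition."""
    errors = []
    status = smebit.get("status")
    if status not in VALID_STATUSES:
        errors.append(f"invalid status: {status!r}")
    if status == "superseded":
        target = smebit.get("superseded_by")
        if not target:
            errors.append("superseded_by is required when status is superseded")
        elif target not in known_ids:
            errors.append(f"superseded_by references unknown smebit_id: {target}")
    return errors
```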

Anchors

Optional list of cross-references. Each entry must have at least one non-null field.

| Field | Type | Required | Description |
|---|---|---|---|
| probe_id | string | No | Must reference an existing probes/*.yaml |
| hypothesis_id | string | No | Must reference an existing hypotheses/*.yaml |
| entity_type | string | No | Gold contract entity type |
| note | string | No | Free-text explanation of the link |
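
The "at least one non-null field" rule for anchor entries can be sketched as follows. The helper anchor_is_valid is hypothetical, and it does not verify that the referenced probe or hypothesis files exist.

```python
ANCHOR_FIELDS = ("probe_id", "hypothesis_id", "entity_type", "note")


def anchor_is_valid(anchor: dict) -> bool:
    """Return True when at least one known anchor field is non-null."""
    return any(anchor.get(field) is not None for field in ANCHOR_FIELDS)
```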

Check Block

Present only on Level 1 SMEbits. Defines a testable SQL assertion.

| Field | Type | Required | Description |
|---|---|---|---|
| description | i18n mapping | Yes | What the check verifies (at least en required) |
| entity_type | string | Yes | Gold contract entity type |
| query | string | Yes | SQL query using {{ ref('...') }} to reference dbt models |
| expect | string | Yes | confirmed or violated |

The query can reference any layer (Bronze, Silver, Gold, signals) through {{ ref() }}. If the query returns rows, the verdict equals the expect value: confirmed when expect: confirmed, violated when expect: violated. If the query returns no rows, the verdict is no_data.
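
The verdict rule can be restated as a small function. This sketch only mirrors the rule as described here; the compiler expresses it in SQL in the generated dbt model, and the name verdict is hypothetical.

```python
def verdict(row_count: int, expect: str) -> str:
    """Map a query's row count and the expect value to a verdict."""
    if expect not in ("confirmed", "violated"):
        raise ValueError(f"expect must be 'confirmed' or 'violated', got {expect!r}")
    # Any returned row yields the expected verdict; no rows means no_data.
    return expect if row_count > 0 else "no_data"
```

Note that neither expect value can produce the opposite verdict: a check written with expect: confirmed yields either confirmed or no_data.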

Prescription Block

Optional. Recommends a corrective action.

| Field | Type | Required | Description |
|---|---|---|---|
| type | string | Yes | See Prescription Types |
| target_entity | string | Yes | Entity to modify |
| target_field | string | Yes | Field to modify |
| description | i18n mapping | Yes | What should be done (at least en required) |
| status | string | Yes | proposed, accepted, or implemented |

Prescription Types

normalize_format, decompose_key, add_dimension, add_validation, reclassify, correct_mapping, redefine_grain, fix_ingestion
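
A prescription block could be validated against these types and the status values listed for the block as sketched below. The function prescription_errors is hypothetical and is not taken from scripts/smebitcheck.py.

```python
PRESCRIPTION_TYPES = {
    "normalize_format", "decompose_key", "add_dimension", "add_validation",
    "reclassify", "correct_mapping", "redefine_grain", "fix_ingestion",
}
PRESCRIPTION_STATUSES = {"proposed", "accepted", "implemented"}


def prescription_errors(prescription: dict) -> list[str]:
    """Collect problems in one parsed prescription block."""
    errors = []
    if prescription.get("type") not in PRESCRIPTION_TYPES:
        errors.append(f"invalid type: {prescription.get('type')!r}")
    for field in ("target_entity", "target_field"):
        if not prescription.get(field):
            errors.append(f"{field} is required")
    description = prescription.get("description")
    # The description is an i18n mapping; only `en` is mandatory here.
    if not isinstance(description, dict) or not description.get("en"):
        errors.append("description.en is required")
    if prescription.get("status") not in PRESCRIPTION_STATUSES:
        errors.append(f"invalid status: {prescription.get('status')!r}")
    return errors
```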

Level 0 — Observation (no check, registry only)

smebit_id: smebit_shanghai_customs_48h
version: "1.0.0"
created_at: "2026-03-17"
modified_at: "2026-03-17"
provider:
  name: Trade Compliance Team
  role: Customs Specialist
  date: 2026-03-17
scope:
  tenant_id: pacific_trade
  time_range:
    from: null
    until: null
category: business_rule
tags: [customs, shanghai]
subject:
  en: "Shanghai port holds containers for 48h minimum inspection for certain HS categories"
  de: "Der Hafen Shanghai haelt Container fuer bestimmte HS-Kategorien mindestens 48h zurueck"
  fr: "Le port de Shanghai retient les conteneurs pour une inspection minimale de 48h"
why:
  en: "Chinese customs regulation for high-value and controlled goods categories."
  de: "Chinesische Zollvorschrift fuer hochwertige und kontrollierte Warenkategorien."
  fr: "Reglementation douaniere chinoise pour les marchandises de haute valeur."
content:
  en: |
    Electronics (HS 84-85) and pharmaceutical products (HS 30) entering
    through Shanghai Yangshan port are subject to mandatory 48-hour
    inspection holds. This is a regulatory requirement, not a processing
    delay.
  de: |
    Elektronik (HS 84-85) und pharmazeutische Produkte (HS 30), die ueber
    den Hafen Shanghai Yangshan eingefuehrt werden, unterliegen einer
    obligatorischen 48-Stunden-Inspektionssperre.
  fr: |
    Les produits electroniques (SH 84-85) et pharmaceutiques (SH 30)
    entrant par le port de Shanghai Yangshan sont soumis a une retenue
    d'inspection obligatoire de 48 heures.
status: active
anchors:
  - probe_id: probe_dwell_time_anomaly
    note: "48h hold is regulatory, not anomalous"

Level 1 — Check (produces verdict model)

smebit_id: smebit_acme_item_638876a_wrong_description
version: "1.0.0"
created_at: "2026-03-10"
modified_at: "2026-03-10"
provider:
  name: Materials Management
  role: Data Steward
  date: 2026-03-10
scope:
  tenant_id: acme_corp
  time_range:
    from: null
    until: null
category: data_quality
tags: [item, description]
subject:
  en: "Item 638876a has wrong description in master data"
  de: "Artikel 638876a hat falsche Beschreibung in Stammdaten"
  fr: "L'article 638876a a une mauvaise description dans les donnees de base"
why:
  en: "Data entry error during initial catalogue import."
  de: "Dateneingabefehler beim initialen Katalogimport."
  fr: "Erreur de saisie lors de l'import initial du catalogue."
content:
  en: |
    Item 638876a is labelled "Widget A" but actually
    refers to Widget B. The description was swapped with item
    638876b during the 2024 catalogue migration.
  de: |
    Artikel 638876a ist als "Widget A" beschriftet,
    bezieht sich aber tatsaechlich auf Widget B.
  fr: |
    L'article 638876a est libelle "Widget A" mais
    designe en realite Widget B.
status: active
check:
  description:
    en: "Verify item 638876a still has wrong description"
    de: "Pruefen ob Artikel 638876a noch falsche Beschreibung hat"
    fr: "Verifier que l'article 638876a a encore la mauvaise description"
  entity_type: Material
  query: >
    SELECT material_id, description
    FROM {{ ref('gold_materials') }}
    WHERE material_id = '638876a'
    AND description ILIKE '%widget a%'
  expect: confirmed
prescription:
  type: correct_mapping
  target_entity: Material
  target_field: description
  description:
    en: "Update description from 'Widget A' to 'Widget B'"
    de: "Beschreibung von 'Widget A' zu 'Widget B' aendern"
    fr: "Mettre a jour la description de 'Widget A' a 'Widget B'"
  status: proposed
anchors:
  - probe_id: probe_catalogue_accuracy
    note: "This item will appear as a catalogue mismatch"

jinflow is a jazzisnow product
v0.45.1 · built 2026-04-17 08:14 UTC