
Signal YAML Reference

Field-level specification for signal (probe) and perspective (assessment) YAML definitions.

Property           Value
-----------------  ------------------------------------------------------------------------
File location      probes/probe_*.yaml (signals) or probes/assessment_*.yaml (perspectives)
ID rule            probe_id must match filename (without .yaml)
Validator          python3 scripts/signalcheck.py
Compiler           python3 scripts/signalcompile.py
Compiler dry-run   python3 scripts/signalcompile.py --check
Registry compiler  python3 scripts/signalregistry.py

All fields in this table are required on every signal.

Field        Type    Valid values
-----------  ------  --------------------------------------------------------------------------------
probe_id     string  Must match filename; prefix probe_ or assessment_
version      string  Semantic version (e.g. "1.0.0")
contract     string  "gold.v1" (signals), "findings.v1" (perspectives), "silver.v1" (silver_audit)
type         string  One of the 13 types listed in Signal Types
severity     string  high | medium | low
description  string  Free-text explanation of what the signal detects
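
The required-field rules above can be sketched as a small check. This is an illustration only, not the logic of scripts/signalcheck.py; the function name check_required_fields and the error messages are hypothetical.

```python
from pathlib import Path

REQUIRED_FIELDS = ("probe_id", "version", "contract", "type", "severity", "description")
SEVERITIES = {"high", "medium", "low"}
CONTRACTS = {"gold.v1", "findings.v1", "silver.v1"}


def check_required_fields(doc: dict, filename: str) -> list[str]:
    """Return a list of problems; an empty list means these checks passed."""
    errors = [f"missing required field: {name}" for name in REQUIRED_FIELDS if name not in doc]
    probe_id = doc.get("probe_id", "")
    # ID rule: probe_id equals the filename without its .yaml extension.
    if probe_id != Path(filename).stem:
        errors.append(f"probe_id {probe_id!r} does not match filename {filename!r}")
    if not probe_id.startswith(("probe_", "assessment_")):
        errors.append("probe_id must start with probe_ or assessment_")
    if "severity" in doc and doc["severity"] not in SEVERITIES:
        errors.append(f"invalid severity: {doc['severity']!r}")
    if "contract" in doc and doc["contract"] not in CONTRACTS:
        errors.append(f"invalid contract: {doc['contract']!r}")
    return errors
```

Called with a parsed YAML document and its path, the function returns an empty list when the ID rule and the enumerated values hold.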

Optional top-level fields:

Field            Type    Description
---------------  ------  ---------------------------------------------------------
created_at       string  ISO date (YYYY-MM-DD)
modified_at      string  ISO date (YYYY-MM-DD)
evidence_fields  list    Field names included in the findings evidence JSON column

The scope block is required on all signals.

Field              Type    Required  Description
-----------------  ------  --------  --------------------------------------------------------------
scope.entity_type  string  yes       Entity label for findings output (e.g. BillingEvent, Shipment)
scope.group_by     list    yes       Fields that define one finding row (min 1)
scope.time.entity  string  yes       Entity containing the time field
scope.time.field   string  yes       Date/timestamp field name
scope.time.bucket  string  yes       week | month | quarter | raw

The registry block holds trilingual (en, de, fr) metadata for the signal registry. It is auto-generated from the glossary if omitted.

Field                       Type    Valid values
--------------------------  ------  ----------------------------------------------------------------
registry.probe_category     string  Freeform (e.g. inventory_integrity, customs_compliance)
registry.risk_tier          string  direct_financial | compliance_exposure | operational_signal
registry.confidence_weight  float   0.0-1.0
registry.display_name       dict    {en, de, fr}
registry.description        dict    {en, de, fr}
registry.interpretation     dict    {en, de, fr}; supports {field} placeholders from evidence_fields
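
To illustrate how {field} placeholders in registry.interpretation could be filled from a finding's evidence, here is a minimal sketch. It assumes the placeholders behave like Python str.format fields, which this reference does not confirm; the template texts and the helper render_interpretation are hypothetical.

```python
def render_interpretation(registry: dict, evidence: dict, lang: str = "en") -> str:
    """Fill {field} placeholders in the chosen language with evidence values."""
    template = registry["interpretation"][lang]
    # format_map raises KeyError if a placeholder is not present in the evidence.
    return template.format_map(evidence)


# Hypothetical registry block for a balance signal.
registry = {
    "interpretation": {
        "en": "Snapshot and movements differ by {balance_pct}% for item {item_id}.",
        "de": "Bestand und Bewegungen weichen bei Artikel {item_id} um {balance_pct}% ab.",
        "fr": "Le stock et les mouvements diffèrent de {balance_pct}% pour l'article {item_id}.",
    }
}
evidence = {"warehouse_id": "WH-1", "item_id": "A-17", "balance_pct": 4.2}

print(render_interpretation(registry, evidence))
```

Only names listed in evidence_fields would be available for substitution, so a placeholder outside that list fails loudly in this sketch.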

Rules in severity_rules are evaluated top-down; the first match wins. The list must end with a default entry.

severity_rules:
  - above: 10.0
    level: high
  - above: 2.0
    level: medium
  - default: low

Field    Type    Description
-------  ------  ---------------------
above    number  Threshold (exclusive)
level    string  high | medium | low
default  string  Fallback severity
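
The top-down, first-match-wins evaluation with an exclusive threshold can be sketched as follows. The function evaluate_severity is a hypothetical helper, and which metric is passed as value depends on the signal type.

```python
def evaluate_severity(rules: list[dict], value: float) -> str:
    """Return the level of the first matching rule, scanning top-down."""
    for rule in rules:
        if "default" in rule:
            return rule["default"]  # fallback entry, always matches
        if value > rule["above"]:  # threshold is exclusive
            return rule["level"]
    raise ValueError("severity_rules must end with a default entry")


rules = [
    {"above": 10.0, "level": "high"},
    {"above": 2.0, "level": "medium"},
    {"default": "low"},
]

print(evaluate_severity(rules, 12.5))  # high
print(evaluate_severity(rules, 10.0))  # medium, because 10.0 is not above 10.0
print(evaluate_severity(rules, 1.0))   # low
```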

Some types support field to target a specific column:

severity_rules:
  - field: finding_count
    above: 50
    level: high

Multiple conditions within a single rule are AND-joined:

severity_rules:
  - conditions:
      - field: finding_count
        above: 50
      - field: total_risk
        above: 10000
    level: high
  - field: finding_count
    above: 10
    level: medium
  - default: low
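
A sketch of how field-targeted rules and AND-joined conditions might be evaluated against one row of aggregate columns. The helper names are hypothetical, and the rule list mirrors the YAML above.

```python
def rule_matches(rule: dict, row: dict) -> bool:
    """A rule matches when every one of its conditions holds (AND-join)."""
    # A rule without a conditions list is treated as a single condition.
    conditions = rule.get("conditions", [rule])
    return all(row[c["field"]] > c["above"] for c in conditions)


def evaluate_severity(rules: list[dict], row: dict) -> str:
    for rule in rules:
        if "default" in rule:
            return rule["default"]
        if rule_matches(rule, row):
            return rule["level"]
    raise ValueError("severity_rules must end with a default entry")


rules = [
    {"conditions": [{"field": "finding_count", "above": 50},
                    {"field": "total_risk", "above": 10000}],
     "level": "high"},
    {"field": "finding_count", "above": 10, "level": "medium"},
    {"default": "low"},
]

print(evaluate_severity(rules, {"finding_count": 60, "total_risk": 12000}))  # high
print(evaluate_severity(rules, {"finding_count": 60, "total_risk": 5000}))   # medium
```

In the second call the first rule fails because only one of its two conditions holds, so evaluation falls through to the next rule.
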
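
money_at_risk can be given in one of three forms:

Form             Example                          Typical use
---------------  -------------------------------  --------------
Expression       "abs(left_total - right_total)"  balance, ratio
Field reference  total_risk                       perspective
Fixed amount     money_at_risk_fixed: 500         mandatory_item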

All 13 valid types and their required type-specific blocks:

Type                  Contract     Type-specific keys                                                Purpose
--------------------  -----------  ----------------------------------------------------------------  -----------------------------------
balance               gold.v1      left, right, join_key, tolerance_pct                              Compare two aggregates per entity
assessment            findings.v1  source_probes                                                     Aggregate findings across signals
duplicate             gold.v1      duplicate                                                         Find duplicates by field combination
reconciliation        gold.v1      left, right, join, derived, flag                                  Cross-entity reconciliation
temporal_sequence     gold.v1      sequence                                                          Validate event ordering
distribution_outlier  gold.v1      metric, distribution                                              Flag statistical outliers (z-score)
ratio                 gold.v1      numerator, denominator, expected_ratio, tolerance_pct, direction  Ratio against expected value
mandatory_item        gold.v1      qualifying, required                                              Check required items exist
trend                 gold.v1      metric, trend                                                     Detect worsening trends
silver_audit          silver.v1    source, audit-specific fields                                     Audit Silver-layer data quality
entity_filter         gold.v1      filter                                                            Filter entities by condition
enrichment            gold.v1      dimension, fact, derived, flag                                    Enrich entities with computed fields
hand_written          gold.v1      (none; reads probes/{probe_id}.sql)                               Raw SQL escape hatch

The balance type compares two aggregated sides per entity group. The compiler calculates balance_pct.

Left/right side fields:

Field                               Type    Required  Description
----------------------------------  ------  --------  -------------------------------
left.entity / right.entity          string  yes       Contract entity name
left.expression / right.expression  string  yes       SQL expression to aggregate
left.aggregate / right.aggregate    string  yes       sum | count | avg | min | max
left.alias / right.alias            string  yes       Column alias
left.where / right.where            dict    no        Filter conditions ({} for none)

Join and tolerance:

Field          Type            Required  Description
-------------  --------------  --------  ---------------------------------
join_key       string or list  yes       Field(s) to join left and right
tolerance_pct  float           yes       Percentage threshold for flagging
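
This reference does not give the formula for balance_pct. The sketch below assumes it is the absolute difference between the two sides as a percentage of the left side, and that an entity group is flagged when that percentage exceeds tolerance_pct. Both are assumptions for illustration, not the compiler's confirmed behavior.

```python
def balance_pct(left_total: float, right_total: float) -> float:
    """Absolute difference as a percentage of the left side.

    ASSUMPTION: illustrative formula; the compiler's definition may differ.
    """
    if left_total == 0:
        # No baseline to compare against: equal sides balance, otherwise unbounded.
        return 0.0 if right_total == 0 else float("inf")
    return abs(left_total - right_total) * 100 / abs(left_total)


def is_flagged(left_total: float, right_total: float, tolerance_pct: float) -> bool:
    """ASSUMPTION: a group is flagged when balance_pct exceeds the tolerance."""
    return balance_pct(left_total, right_total) > tolerance_pct


print(balance_pct(1000, 970))      # 3.0
print(is_flagged(1000, 970, 2.0))  # True
print(is_flagged(1000, 990, 2.0))  # False
```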

The assessment type aggregates findings from multiple signals. It must use contract: "findings.v1".

Field                     Type     Required  Description
------------------------  -------  --------  --------------------------------------
source_probes             list     yes       Signals to aggregate
source_probes[].probe_id  string   yes       Must reference an existing signal YAML
source_probes[].weight    integer  yes       Relative weight for scoring

Aggregate columns available for severity rules: probe_count, finding_count, total_risk, worst_severity, probes_flagged.
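
As a rough illustration of how these aggregate columns could be derived from a set of findings, consider the sketch below. The column semantics are assumptions: probe_count is taken as the number of configured source probes and probes_flagged as the number of those that produced at least one finding. The weight field is left out because the scoring formula is not described here.

```python
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2}


def aggregate_findings(source_probes: list[dict], findings: list[dict]) -> dict:
    """Summarise findings that belong to the configured source probes."""
    probe_ids = {p["probe_id"] for p in source_probes}
    relevant = [f for f in findings if f["probe_id"] in probe_ids]
    return {
        "probe_count": len(probe_ids),
        "finding_count": len(relevant),
        "total_risk": sum(f["money_at_risk"] for f in relevant),
        # None when no source probe produced a finding.
        "worst_severity": max((f["severity"] for f in relevant),
                              key=SEVERITY_RANK.__getitem__, default=None),
        "probes_flagged": len({f["probe_id"] for f in relevant}),
    }


# Hypothetical probe ids and findings.
source_probes = [{"probe_id": "probe_a", "weight": 2}, {"probe_id": "probe_b", "weight": 1}]
findings = [
    {"probe_id": "probe_a", "severity": "high", "money_at_risk": 100.0},
    {"probe_id": "probe_a", "severity": "low", "money_at_risk": 50.0},
    {"probe_id": "probe_c", "severity": "high", "money_at_risk": 999.0},  # not a source probe
]
print(aggregate_findings(source_probes, findings))
```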

The duplicate type finds records where the match fields are identical but a conflict field differs.

Field                     Type     Required  Description
------------------------  -------  --------  -------------------------------------------------
duplicate.entity          string   yes       Contract entity name
duplicate.match_fields    list     yes       Fields identifying “same” records
duplicate.conflict_field  string   yes       Field that should be consistent
duplicate.min_distinct    integer  no        Min distinct conflict values to flag (default: 2)
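
The duplicate logic can be sketched in plain Python: group records by the match fields, collect the distinct conflict values per group, and keep groups that reach min_distinct. The compiler generates SQL, so this is only an equivalent illustration; the function name and the sample records are hypothetical.

```python
from collections import defaultdict


def find_duplicates(records: list[dict], match_fields: list[str],
                    conflict_field: str, min_distinct: int = 2) -> dict:
    """Map each match key to its conflict values when they disagree."""
    values_by_key = defaultdict(set)
    for record in records:
        key = tuple(record[name] for name in match_fields)
        values_by_key[key].add(record[conflict_field])
    return {key: values for key, values in values_by_key.items()
            if len(values) >= min_distinct}


records = [
    {"invoice_no": "INV-1", "vendor_id": "V1", "amount": 100},
    {"invoice_no": "INV-1", "vendor_id": "V1", "amount": 120},  # same key, different amount
    {"invoice_no": "INV-2", "vendor_id": "V1", "amount": 50},
    {"invoice_no": "INV-2", "vendor_id": "V1", "amount": 50},   # exact repeat, no conflict
]
print(find_duplicates(records, ["invoice_no", "vendor_id"], "amount"))
```

Exact repeats are not flagged here, because the conflict field has only one distinct value within the group.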

The reconciliation type performs cross-entity reconciliation with independent aggregation, join, and derived metrics.

Field                               Type    Required  Description
----------------------------------  ------  --------  --------------------------------------
left.entity / right.entity          string  yes       Contract entity name
left.group_by / right.group_by      list    yes       Group-by fields
left.aggregates / right.aggregates  list    yes       [{aggregate, expression, alias}]
left.where / right.where            dict    no        Filter conditions
join.keys                           list    yes       Join key fields
join.time_key                       string  yes       Time bucket column
join.type                           string  yes       full_outer | left | inner
derived                             list    yes       [{expression, alias}] computed columns
flag                                string  yes       SQL WHERE for flagging
entity_id_field                     string  yes       Field used as entity ID
time_bucket_field                   string  yes       Time bucket field

The temporal_sequence type validates the expected ordering or presence of events in a sequence.

Field                                   Type    Required  Description
--------------------------------------  ------  --------  -------------------------------------
sequence.entity                         string  yes       Entity containing actual events
sequence.order_field                    string  yes       Timestamp/ordering field
sequence.group_by                       string  yes       Group key (e.g. shipment_id)
sequence.expected_steps.source          string  yes       Entity defining expected sequence
sequence.expected_steps.match_field     string  yes       Field matching expected to actual
sequence.expected_steps.step_field      string  yes       Step ordering field
sequence.expected_steps.location_field  string  no        Location field for spatial comparison
sequence.actual_steps.field             string  yes       Actual field to compare

distribution_outlier — Flags statistical outliers. Requires metric (entity, expression, alias) and distribution (method: zscore, threshold, baseline_group).
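
For readers unfamiliar with z-score flagging, here is a minimal sketch over a single baseline group. It assumes the population standard deviation and a two-sided comparison against the threshold; the compiled SQL may differ on both points, and the function name is hypothetical.

```python
from statistics import mean, pstdev


def zscore_outliers(metric_by_entity: dict[str, float], threshold: float) -> dict[str, float]:
    """Return entities whose absolute z-score exceeds the threshold."""
    values = list(metric_by_entity.values())
    mu = mean(values)
    sigma = pstdev(values)  # ASSUMPTION: population standard deviation
    if sigma == 0:
        return {}  # all values identical, nothing can be an outlier
    scores = {entity: (value - mu) / sigma for entity, value in metric_by_entity.items()}
    return {entity: z for entity, z in scores.items() if abs(z) > threshold}


# Hypothetical metric values for five entities in one baseline group.
metric = {"a": 10.0, "b": 11.0, "c": 9.0, "d": 10.0, "e": 40.0}
print(zscore_outliers(metric, threshold=1.5))
```

With these values only entity e is returned; its z-score is just under 2, while the others stay well below 1.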

ratio — Compares numerator/denominator against expected ratio. Requires numerator and denominator (each: entity, expression, aggregate, alias), expected_ratio, tolerance_pct, and optional direction (above | below | both).
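
One plausible reading of the ratio check, as a sketch: compute the deviation of the observed ratio from expected_ratio in percent, then compare it with tolerance_pct according to direction. The interpretation of above, below, and both is an assumption, and ratio_breach is a hypothetical helper.

```python
def ratio_breach(numerator: float, denominator: float, expected_ratio: float,
                 tolerance_pct: float, direction: str = "both") -> bool:
    """True when the observed ratio deviates beyond the tolerance."""
    if denominator == 0:
        raise ValueError("denominator aggregate is zero; ratio is undefined")
    ratio = numerator / denominator
    deviation_pct = (ratio - expected_ratio) * 100 / expected_ratio
    if direction == "above":
        return deviation_pct > tolerance_pct
    if direction == "below":
        return deviation_pct < -tolerance_pct
    if direction == "both":
        return abs(deviation_pct) > tolerance_pct
    raise ValueError(f"unknown direction: {direction!r}")


# Observed ratio 0.92 against an expected 1.0 with a 5% tolerance.
print(ratio_breach(92, 100, expected_ratio=1.0, tolerance_pct=5.0))                     # True
print(ratio_breach(92, 100, expected_ratio=1.0, tolerance_pct=5.0, direction="above"))  # False
```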

mandatory_item — Checks qualifying entities have required items. Requires qualifying (entity, join_key, where) and required (entity, join_key, min_count). Uses money_at_risk_fixed.

trend — Detects worsening trends via rolling metrics. Requires metric and trend blocks with rolling window configuration.

silver_audit — Audits Silver-layer data quality. Uses contract: "silver.v1". Two variants: validity (filters by is_valid, groups by invalid_reason) and activity (joins to activity tables, filters on aggregates).

entity_filter — Filters entities by SQL WHERE condition. Requires source with entity, where clause, entity_id_field.

enrichment — Joins fact to filtered dimension, computes derived metrics. Requires dimension, fact, derived, flag.

hand_written — SQL escape hatch. No type-specific YAML keys. Reads probes/{probe_id}.sql which must emit the standard findings columns.

  • probe_id must match the YAML filename (without .yaml).
  • All entity/field references are validated against the contract’s entity definitions.
  • severity_rules must end with a default entry.
  • scope.time.bucket must be one of: week, month, quarter, raw.
  • registry.risk_tier must be one of: direct_financial, compliance_exposure, operational_signal.
  • Expression strings are validated for safe characters (alphanumeric, _, *, +, -, /, (, ), ., space).
  • All probe_id references in perspectives must have a corresponding YAML file.
  • Aggregate functions must be one of: sum, count, avg, min, max.
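
The safe-character rule for expression strings can be approximated with a regular expression built from the list above. This sketch takes that list literally and is not the validator's actual pattern; is_safe_expression is a hypothetical name.

```python
import re

# Character set taken literally from the rule above:
# alphanumeric, _, *, +, -, /, (, ), ., and space.
SAFE_EXPRESSION = re.compile(r"[A-Za-z0-9_*+\-/(). ]+")


def is_safe_expression(expression: str) -> bool:
    """True when every character of the expression is in the safe set."""
    return SAFE_EXPRESSION.fullmatch(expression) is not None


print(is_safe_expression("abs(left_total - right_total)"))  # True
print(is_safe_expression("quantity; DROP TABLE x"))         # False
```

Read literally, this set also rejects = and quote characters, so an expression containing a CASE WHEN comparison would not pass this sketch; the real validator presumably accepts a somewhat wider set.
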

probe_id: probe_warehouse_balance
version: "1.0.0"
contract: "gold.v1"
type: balance
severity: high
description: >
  Compares inventory snapshot quantities against cumulative inbound
  minus outbound movements.
scope:
  entity_type: InventorySnapshot
  group_by: [warehouse_id, item_id]
  time:
    entity: InventorySnapshot
    field: snapshot_date
    bucket: month
left:
  entity: InventorySnapshot
  expression: "quantity"
  aggregate: sum
  alias: snapshot_quantity
  where: {}
right:
  entity: Checkpoint
  expression: "CASE WHEN direction = 'inbound' THEN quantity ELSE -quantity END"
  aggregate: sum
  alias: movement_balance
  where:
    checkpoint_type: warehouse
join_key: [warehouse_id, item_id]
tolerance_pct: 2.0
severity_rules:
  - above: 10.0
    level: high
  - above: 2.0
    level: medium
  - default: low
money_at_risk: "abs(snapshot_quantity - movement_balance) * avg_item_value"
evidence_fields: [warehouse_id, item_id, snapshot_quantity, movement_balance, balance_pct]

Every compiled signal emits rows with these columns:

Column         Type     Description
-------------  -------  ---------------------------------------
finding_id     string   Deterministic surrogate key
tenant_id      string   Tenant identifier
probe_id       string   Signal identifier
probe_version  string   Semantic version
severity       string   high | medium | low
entity_type    string   From scope.entity_type
entity_id      string   Entity identifier
time_bucket    string   Formatted time bucket
money_at_risk  numeric  Financial exposure
evidence       string   JSON object with signal-specific fields
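
To make the row shape concrete, the sketch below assembles one finding as a Python dict. Two parts are assumptions: the finding_id is shown as a truncated SHA-256 over tenant, signal, entity, and time bucket (the actual key derivation is not documented here), and the evidence JSON is built by picking the names in evidence_fields from a result row. The helper build_finding and all sample values are hypothetical.

```python
import hashlib
import json


def build_finding(probe: dict, tenant_id: str, entity_id: str, time_bucket: str,
                  severity: str, money_at_risk: float, row: dict) -> dict:
    """Assemble one findings row with the ten standard columns."""
    # ASSUMPTION: a deterministic key hashed from the identifying columns.
    key = "|".join([tenant_id, probe["probe_id"], entity_id, time_bucket])
    evidence = {name: row[name] for name in probe["evidence_fields"]}
    return {
        "finding_id": hashlib.sha256(key.encode("utf-8")).hexdigest()[:16],
        "tenant_id": tenant_id,
        "probe_id": probe["probe_id"],
        "probe_version": probe["version"],
        "severity": severity,
        "entity_type": probe["scope"]["entity_type"],
        "entity_id": entity_id,
        "time_bucket": time_bucket,
        "money_at_risk": money_at_risk,
        "evidence": json.dumps(evidence, sort_keys=True),
    }


probe = {
    "probe_id": "probe_warehouse_balance",
    "version": "1.0.0",
    "scope": {"entity_type": "InventorySnapshot"},
    "evidence_fields": ["warehouse_id", "item_id", "balance_pct"],
}
row = {"warehouse_id": "WH-1", "item_id": "A-17", "balance_pct": 4.2, "unused": 1}
finding = build_finding(probe, "tenant-1", "WH-1:A-17", "2026-03", "medium", 1250.0, row)
print(finding["evidence"])
```

Because the key depends only on its inputs, compiling the same signal twice over the same data yields the same finding_id in this sketch.
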
jinflow is a jazzisnow product
v0.45.1 · built 2026-04-17 08:14 UTC