Signal YAML Reference
Field-level specification for signal (probe) and perspective (assessment) YAML definitions.
Overview

| Property | Value |
|---|---|
| File location | `probes/probe_*.yaml` (signals) or `probes/assessment_*.yaml` (perspectives) |
| ID rule | `probe_id` must match filename (without `.yaml`) |
| Validator | `python3 scripts/signalcheck.py` |
| Compiler | `python3 scripts/signalcompile.py` |
| Compiler dry-run | `python3 scripts/signalcompile.py --check` |
| Registry compiler | `python3 scripts/signalregistry.py` |

Core Fields

All fields in this table are required on every signal.

| Field | Type | Valid values |
|---|---|---|
| `probe_id` | string | Must match filename; prefix `probe_` or `assessment_` |
| `version` | string | Semantic version (e.g. `"1.0.0"`) |
| `contract` | string | `"gold.v1"` (signals), `"findings.v1"` (perspectives), `"silver.v1"` (`silver_audit`) |
| `type` | string | One of the 13 types listed in Signal Types |
| `severity` | string | `high`, `medium`, or `low` |
| `description` | string | Free-text explanation of what the signal detects |

Optional top-level fields:

| Field | Type | Description |
|---|---|---|
| `created_at` | string | ISO date (`YYYY-MM-DD`) |
| `modified_at` | string | ISO date (`YYYY-MM-DD`) |
| `evidence_fields` | list | Field names included in the findings `evidence` JSON column |

Scope Block

Required on all signals.

| Field | Type | Required | Description |
|---|---|---|---|
| `scope.entity_type` | string | yes | Entity label for findings output (e.g. `BillingEvent`, `Shipment`) |
| `scope.group_by` | list | yes | Fields that define one finding row (min 1) |
| `scope.time.entity` | string | yes | Entity containing the time field |
| `scope.time.field` | string | yes | Date/timestamp field name |
| `scope.time.bucket` | string | yes | `week`, `month`, `quarter`, or `raw` |

Registry Block (optional)

Trilingual metadata for the signal registry. Auto-generated from the glossary if omitted.

| Field | Type | Valid values |
|---|---|---|
| `registry.probe_category` | string | Freeform (e.g. `inventory_integrity`, `customs_compliance`) |
| `registry.risk_tier` | string | `direct_financial`, `compliance_exposure`, or `operational_signal` |
| `registry.confidence_weight` | float | 0.0 to 1.0 |
| `registry.display_name` | dict | `{en, de, fr}` |
| `registry.description` | dict | `{en, de, fr}` |
| `registry.interpretation` | dict | `{en, de, fr}`; supports `{field}` placeholders from `evidence_fields` |

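
As an illustration of the fields above, a registry block might look like the following sketch. The category, weight, and all translated strings are hypothetical; the placeholders assume that `item_id`, `warehouse_id`, and `balance_pct` are listed in the signal's `evidence_fields`.

```yaml
# Hypothetical registry block; every value here is illustrative.
registry:
  probe_category: inventory_integrity
  risk_tier: direct_financial        # one of the three allowed tiers
  confidence_weight: 0.8             # float between 0.0 and 1.0
  display_name:
    en: "Warehouse balance mismatch"
    de: "Lagerbestandsabweichung"
    fr: "Écart de stock en entrepôt"
  description:
    en: "Snapshot quantity does not match the movement balance."
    de: "Der Bestand laut Snapshot stimmt nicht mit dem Bewegungssaldo überein."
    fr: "La quantité du relevé ne correspond pas au solde des mouvements."
  interpretation:
    # {field} placeholders must name entries in evidence_fields
    en: "Item {item_id} in warehouse {warehouse_id} deviates by {balance_pct} percent."
    de: "Artikel {item_id} in Lager {warehouse_id} weicht um {balance_pct} Prozent ab."
    fr: "L'article {item_id} de l'entrepôt {warehouse_id} présente un écart de {balance_pct} pour cent."
```
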
Severity Rules

Standard form

Rules are evaluated top-down; first match wins. Must end with a `default` entry.

```yaml
severity_rules:
  - above: 10.0
    level: high
  - above: 2.0
    level: medium
  - default: low
```

| Field | Type | Description |
|---|---|---|
| `above` | number | Threshold (exclusive) |
| `level` | string | `high`, `medium`, or `low` |
| `default` | string | Fallback severity |

Some types support `field` to target a specific column:

```yaml
severity_rules:
  - field: finding_count
    above: 50
    level: high
```

Perspective compound form

Multiple conditions are AND-joined within a single rule:

```yaml
severity_rules:
  - conditions:
      - field: finding_count
        above: 50
      - field: total_risk
        above: 10000
    level: high
  - field: finding_count
    above: 10
    level: medium
  - default: low
```

Money at Risk

| Form | Example | Typical use |
|---|---|---|
| Expression | `"abs(left_total - right_total)"` | balance, ratio |
| Field reference | `total_risk` | perspective |
| Fixed amount | `money_at_risk_fixed: 500` | mandatory_item |

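
The three forms could be written as in the sketch below, shown as three separate YAML documents. The `money_at_risk` key for the expression form matches the minimal balance example; using the same key for the field-reference form is an assumption, since the table gives only the value.

```yaml
# Expression form (balance, ratio): computed for each finding row.
money_at_risk: "abs(left_total - right_total)"
---
# Field-reference form (perspective): reuses an aggregate column.
# Assumption: the key is money_at_risk, as in the expression form.
money_at_risk: total_risk
---
# Fixed-amount form (mandatory_item): a constant per finding.
money_at_risk_fixed: 500
```
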
Signal Types

All 13 valid types and their required type-specific blocks:

| Type | Contract | Type-specific keys | Purpose |
|---|---|---|---|
| `balance` | `gold.v1` | `left`, `right`, `join_key`, `tolerance_pct` | Compare two aggregates per entity |
| `assessment` | `findings.v1` | `source_probes` | Aggregate findings across signals |
| `duplicate` | `gold.v1` | `duplicate` | Find duplicates by field combination |
| `reconciliation` | `gold.v1` | `left`, `right`, `join`, `derived`, `flag` | Cross-entity reconciliation |
| `temporal_sequence` | `gold.v1` | `sequence` | Validate event ordering |
| `distribution_outlier` | `gold.v1` | `metric`, `distribution` | Flag statistical outliers (z-score) |
| `ratio` | `gold.v1` | `numerator`, `denominator`, `expected_ratio`, `tolerance_pct`, `direction` | Ratio against expected value |
| `mandatory_item` | `gold.v1` | `qualifying`, `required` | Check required items exist |
| `trend` | `gold.v1` | `metric`, `trend` | Detect worsening trends |
| `silver_audit` | `silver.v1` | `source`, audit-specific fields | Audit Silver-layer data quality |
| `entity_filter` | `gold.v1` | `filter` | Filter entities by condition |
| `enrichment` | `gold.v1` | `dimension`, `fact`, `derived`, `flag` | Enrich entities with computed fields |
| `hand_written` | `gold.v1` | none; reads `probes/{probe_id}.sql` | Raw SQL escape hatch |

Type: balance

Compares two aggregated sides per entity group. The compiler calculates `balance_pct`.

Left/right side fields:

| Field | Type | Required | Description |
|---|---|---|---|
| `left.entity` / `right.entity` | string | yes | Contract entity name |
| `left.expression` / `right.expression` | string | yes | SQL expression to aggregate |
| `left.aggregate` / `right.aggregate` | string | yes | `sum`, `count`, `avg`, `min`, or `max` |
| `left.alias` / `right.alias` | string | yes | Column alias |
| `left.where` / `right.where` | dict | no | Filter conditions (`{}` for none) |

Join and tolerance:

| Field | Type | Required | Description |
|---|---|---|---|
| `join_key` | string or list | yes | Field(s) to join left and right |
| `tolerance_pct` | float | yes | Percentage threshold for flagging |

Type: assessment (perspective)

Aggregates findings from multiple signals. Must use `contract: "findings.v1"`.

| Field | Type | Required | Description |
|---|---|---|---|
| `source_probes` | list | yes | Signals to aggregate |
| `source_probes[].probe_id` | string | yes | Must reference an existing signal YAML |
| `source_probes[].weight` | integer | yes | Relative weight for scoring |

Aggregate columns available for severity rules: `probe_count`, `finding_count`, `total_risk`, `worst_severity`, `probes_flagged`.

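
A perspective's type-specific part could look like the fragment below. It is a sketch only: core fields other than `type` and `contract` are omitted, as is the scope block; the type value `assessment` follows the Signal Types table, and the second `probe_id` and both weights are hypothetical.

```yaml
# Fragment of a hypothetical probes/assessment_*.yaml file.
type: assessment
contract: "findings.v1"

source_probes:
  - probe_id: probe_warehouse_balance    # must exist as a signal YAML
    weight: 3
  - probe_id: probe_duplicate_invoice    # hypothetical signal
    weight: 1

severity_rules:
  # Both conditions must hold for this rule to match (AND-joined).
  - conditions:
      - field: finding_count
        above: 50
      - field: total_risk
        above: 10000
    level: high
  - field: probes_flagged
    above: 1
    level: medium
  - default: low
```
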
Type: duplicate

Finds records where match fields are identical but a conflict field differs.

| Field | Type | Required | Description |
|---|---|---|---|
| `duplicate.entity` | string | yes | Contract entity name |
| `duplicate.match_fields` | list | yes | Fields identifying "same" records |
| `duplicate.conflict_field` | string | yes | Field that should be consistent |
| `duplicate.min_distinct` | integer | no | Min distinct conflict values to flag (default: 2) |

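
For instance, a duplicate block might be filled in as follows. The entity name is borrowed from the scope example; the field names are hypothetical and would have to exist in the contract's entity definition.

```yaml
# Fragment only; core fields, scope, and severity_rules omitted.
type: duplicate
contract: "gold.v1"

duplicate:
  entity: BillingEvent
  # Records sharing these values count as the "same" record.
  match_fields: [invoice_number, customer_id]   # hypothetical fields
  # Flag when this field takes more than one value within a match group.
  conflict_field: amount                        # hypothetical field
  min_distinct: 2                               # optional; 2 is the default
```
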
Type: reconciliation

Cross-entity reconciliation with independent aggregation, join, and derived metrics.

| Field | Type | Required | Description |
|---|---|---|---|
| `left.entity` / `right.entity` | string | yes | Contract entity name |
| `left.group_by` / `right.group_by` | list | yes | Group-by fields |
| `left.aggregates` / `right.aggregates` | list | yes | `[{aggregate, expression, alias}]` |
| `left.where` / `right.where` | dict | no | Filter conditions |
| `join.keys` | list | yes | Join key fields |
| `join.time_key` | string | yes | Time bucket column |
| `join.type` | string | yes | `full_outer`, `left`, or `inner` |
| `derived` | list | yes | `[{expression, alias}]` computed columns |
| `flag` | string | yes | SQL WHERE for flagging |
| `entity_id_field` | string | yes | Field used as entity ID |
| `time_bucket_field` | string | yes | Time bucket field |

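
Put together, the fields above might be arranged as in this sketch. All entity fields, aliases, and the flag condition are hypothetical, and the name `time_bucket` for the time column is an assumption taken from the findings output column of the same name.

```yaml
# Fragment only; core fields, scope, and severity_rules omitted.
type: reconciliation
contract: "gold.v1"

left:
  entity: Shipment
  group_by: [carrier_id]                 # hypothetical field
  aggregates:
    - aggregate: sum
      expression: "declared_weight"      # hypothetical field
      alias: shipped_weight
  where: {}

right:
  entity: BillingEvent
  group_by: [carrier_id]
  aggregates:
    - aggregate: sum
      expression: "billed_weight"        # hypothetical field
      alias: billed_weight
  where: {}

join:
  keys: [carrier_id]
  time_key: time_bucket                  # assumed column name
  type: full_outer                       # keeps rows missing on either side

derived:
  - expression: "shipped_weight - billed_weight"
    alias: weight_gap

flag: "abs(weight_gap) > 0"              # SQL WHERE condition for flagging
entity_id_field: carrier_id
time_bucket_field: time_bucket
```
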
Type: temporal_sequence

Validates expected ordering or presence of events in a sequence.

| Field | Type | Required | Description |
|---|---|---|---|
| `sequence.entity` | string | yes | Entity containing actual events |
| `sequence.order_field` | string | yes | Timestamp/ordering field |
| `sequence.group_by` | string | yes | Group key (e.g. `shipment_id`) |
| `sequence.expected_steps.source` | string | yes | Entity defining expected sequence |
| `sequence.expected_steps.match_field` | string | yes | Field matching expected to actual |
| `sequence.expected_steps.step_field` | string | yes | Step ordering field |
| `sequence.expected_steps.location_field` | string | no | Location field for spatial comparison |
| `sequence.actual_steps.field` | string | yes | Actual field to compare |

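
The nesting implied by the dotted field names can be sketched as below. `Checkpoint` and `shipment_id` appear elsewhere in this reference; the `RoutePlan` entity and the remaining field names are hypothetical.

```yaml
# Fragment only; core fields, scope, and severity_rules omitted.
type: temporal_sequence
contract: "gold.v1"

sequence:
  entity: Checkpoint                 # entity holding the actual events
  order_field: checkpoint_time       # hypothetical timestamp field
  group_by: shipment_id              # one sequence per shipment
  expected_steps:
    source: RoutePlan                # hypothetical entity with the planned steps
    match_field: shipment_id
    step_field: step_number          # hypothetical ordering field
    location_field: location_code    # optional; hypothetical
  actual_steps:
    field: location_code             # hypothetical field compared to the plan
```
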
Remaining Types (brief)

- `distribution_outlier`: Flags statistical outliers. Requires `metric` (`entity`, `expression`, `alias`) and `distribution` (`method: zscore`, `threshold`, `baseline_group`).
- `ratio`: Compares numerator/denominator against an expected ratio. Requires `numerator` and `denominator` (each: `entity`, `expression`, `aggregate`, `alias`), `expected_ratio`, `tolerance_pct`, and optional `direction` (`above`, `below`, or `both`).
- `mandatory_item`: Checks that qualifying entities have required items. Requires `qualifying` (`entity`, `join_key`, `where`) and `required` (`entity`, `join_key`, `min_count`). Uses `money_at_risk_fixed`.
- `trend`: Detects worsening trends via rolling metrics. Requires `metric` and `trend` blocks with rolling window configuration.
- `silver_audit`: Audits Silver-layer data quality. Uses `contract: "silver.v1"`. Two variants: validity (filters by `is_valid`, groups by `invalid_reason`) and activity (joins to activity tables, filters on aggregates).
- `entity_filter`: Filters entities by SQL WHERE condition. Requires `source` with `entity`, `where` clause, `entity_id_field`.
- `enrichment`: Joins fact to filtered dimension, computes derived metrics. Requires `dimension`, `fact`, `derived`, `flag`.
- `hand_written`: SQL escape hatch. No type-specific YAML keys. Reads `probes/{probe_id}.sql`, which must emit the standard findings columns.

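
Two of the briefly described types have fully listed keys, so their type-specific parts can be sketched as two separate YAML documents. Entity fields, aliases, and numeric values are hypothetical, as is the `CustomsDocument` entity; whether `join_key` accepts a list here, as it does for `balance`, is not stated above.

```yaml
# ratio fragment: surcharge share of base charges (hypothetical fields).
type: ratio
contract: "gold.v1"
numerator:
  entity: BillingEvent
  expression: "surcharge_amount"
  aggregate: sum
  alias: surcharge_total
denominator:
  entity: BillingEvent
  expression: "base_amount"
  aggregate: sum
  alias: base_total
expected_ratio: 0.05
tolerance_pct: 10.0
direction: above          # optional: above, below, or both
---
# mandatory_item fragment: each qualifying shipment needs a document.
type: mandatory_item
contract: "gold.v1"
qualifying:
  entity: Shipment
  join_key: shipment_id
  where:
    is_international: true     # hypothetical filter
required:
  entity: CustomsDocument      # hypothetical entity
  join_key: shipment_id
  min_count: 1
money_at_risk_fixed: 500       # fixed amount per finding
```
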
Validation Rules

- `probe_id` must match the YAML filename (without `.yaml`).
- All entity/field references are validated against the contract's entity definitions.
- `severity_rules` must end with a `default` entry.
- `scope.time.bucket` must be one of: `week`, `month`, `quarter`, `raw`.
- `registry.risk_tier` must be one of: `direct_financial`, `compliance_exposure`, `operational_signal`.
- Expression strings are validated for safe characters (alphanumeric, `_`, `*`, `+`, `-`, `/`, `(`, `)`, `.`, space).
- All `probe_id` references in perspectives must have a corresponding YAML file.
- Aggregate functions must be one of: `sum`, `count`, `avg`, `min`, `max`.

Minimal Example (balance)

```yaml
probe_id: probe_warehouse_balance
version: "1.0.0"
contract: "gold.v1"
type: balance
severity: high
description: >
  Compares inventory snapshot quantities against
  cumulative inbound minus outbound movements.

scope:
  entity_type: InventorySnapshot
  group_by: [warehouse_id, item_id]
  time:
    entity: InventorySnapshot
    field: snapshot_date
    bucket: month

left:
  entity: InventorySnapshot
  expression: "quantity"
  aggregate: sum
  alias: snapshot_quantity
  where: {}

right:
  entity: Checkpoint
  expression: "CASE WHEN direction = 'inbound' THEN quantity ELSE -quantity END"
  aggregate: sum
  alias: movement_balance
  where:
    checkpoint_type: warehouse

join_key: [warehouse_id, item_id]
tolerance_pct: 2.0

severity_rules:
  - above: 10.0
    level: high
  - above: 2.0
    level: medium
  - default: low

money_at_risk: "abs(snapshot_quantity - movement_balance) * avg_item_value"

evidence_fields: [warehouse_id, item_id, snapshot_quantity, movement_balance, balance_pct]
```

Findings Output Contract

Every compiled signal emits rows with these columns:

| Column | Type | Description |
|---|---|---|
| `finding_id` | string | Deterministic surrogate key |
| `tenant_id` | string | Tenant identifier |
| `probe_id` | string | Signal identifier |
| `probe_version` | string | Semantic version |
| `severity` | string | `high`, `medium`, or `low` |
| `entity_type` | string | From `scope.entity_type` |
| `entity_id` | string | Entity identifier |
| `time_bucket` | string | Formatted time bucket |
| `money_at_risk` | numeric | Financial exposure |
| `evidence` | string | JSON object with signal-specific fields |