Safety & Validation

ARIA is designed with safety as a foundational principle, not an afterthought. Every architectural decision prioritizes predictability, boundedness, and transparency.

Core Invariants

Design Principles

Six fundamental safety invariants are enforced across all ARIA layers.

Identity-Safe

The CFM substrate contains no identity or persona modeling. The governance layer includes a deterministic SelfModel for runtime introspection (listing capabilities and skills), but this is enumerated from code — not learned or generated.

Non-Linguistic Core

The CFM substrate operates on numeric inputs only — scalar time deltas and intensity signals. Text inputs are converted to numeric intensity before reaching the core. When LLM rendering is enabled, it runs after the governance gate, not before.

Governed, Not Autonomous

The governance layer produces gate decisions (ALLOW / DAMPEN / BLOCK) via deterministic threshold comparisons — not autonomous reasoning. It has no goals or intentions. Decisions are computed from measured state metrics, not learned policies.
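The threshold logic described above can be sketched as a pure comparison. This is a minimal illustration, not ARIA's internals: the threshold values and the `risk` metric name are hypothetical placeholders.

```python
from enum import Enum

class GateDecision(Enum):
    ALLOW = "allow"
    DAMPEN = "dampen"
    BLOCK = "block"

# Hypothetical thresholds for illustration only; ARIA's real values differ.
DAMPEN_THRESHOLD = 0.6
BLOCK_THRESHOLD = 0.85

def gate(risk: float) -> GateDecision:
    """Deterministic gate: a pure threshold comparison on a measured state metric."""
    if risk >= BLOCK_THRESHOLD:
        return GateDecision.BLOCK
    if risk >= DAMPEN_THRESHOLD:
        return GateDecision.DAMPEN
    return GateDecision.ALLOW
```

Because the function is pure (no RNG, no external state), the same metric value always yields the same decision.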

Bounded Outputs

All state variables and outputs remain strictly bounded in [0, 1]. There are no unbounded growth mechanisms, no exponential dynamics, and no risk of numeric overflow.

Deterministic Dynamics

Given identical inputs and initial conditions, ARIA produces identical outputs. No random number generators, no stochastic elements, no external state dependencies.
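This property is what makes bit-for-bit regression testing possible. A minimal sketch of the check, using a stand-in update rule rather than ARIA's actual dynamics: run the same inputs twice and require identical outputs.

```python
def run_core(inputs, state=0.5):
    """Stand-in for a deterministic, bounded core update loop (hypothetical rule)."""
    outputs = []
    for x in inputs:
        # Convex-style update keeps state near [0, 1]; the clamp enforces the hard bound.
        state = min(1.0, max(0.0, 0.9 * state + 0.1 * x))
        outputs.append(state)
    return outputs

inputs = [0.2, 0.8, 0.5]
assert run_core(inputs) == run_core(inputs)  # identical inputs, identical outputs
```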

Read-Only Diagnostics

The diagnostic shell only reads ARIA outputs; it never writes to or controls the core. Information flows one direction: Core → Adapter → Shell → Logs. No reverse path exists.

Safety Architecture

ARIA runs inside a diagnostic shell that observes state and enforces bounds, but never injects goals or modifies behavior.

Diagnostic Shell

The shell observes ARIA outputs for logging and analysis. It enforces output bounds through clamping and NaN replacement. It never injects goals, actions, control signals, identity data, or personality data into the core.

Data Flow (One Direction Only):

  ARIA Core
      ↓ (numeric outputs)
  ARIACoreAdapter
      ↓ (normalized, clamped)
  Diagnostic Shell
      ↓ (logged, analyzed)
  Output Files

  No reverse path exists.

Adapter Protections

  • Output normalization: All values clamped to [0, 1]
  • NaN replacement: Any NaN replaced with 0.0
  • Inf replacement: Any Inf replaced with 1.0
  • Forbidden field check: Scans for identity-related patterns
  • Fail-closed: Errors return safe defaults, not exceptions
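These five protections can be sketched as a single normalization pass. This is a minimal illustration assuming outputs arrive as a flat dict of floats; the real adapter interface and the exact scanner pattern list are not shown here.

```python
import math

# Patterns from the identity-field check; illustrative, not the exact scanner list.
FORBIDDEN_PATTERNS = ("identity", "self", "ego", "persona")

def normalize(outputs: dict) -> dict:
    """Apply the adapter protections: forbidden-field scan, fail-closed
    conversion, NaN/Inf replacement, and clamping to [0, 1]."""
    safe = {}
    for key, value in outputs.items():
        if any(p in key.lower() for p in FORBIDDEN_PATTERNS):
            continue  # forbidden field check: identity-related keys never pass through
        try:
            v = float(value)
        except (TypeError, ValueError):
            v = 0.0  # fail-closed: unparseable values become a safe default
        if math.isnan(v):
            v = 0.0  # NaN replacement
        elif math.isinf(v):
            v = 1.0  # Inf replacement
        safe[key] = min(1.0, max(0.0, v))  # clamp to [0, 1]
    return safe
```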

Explicit Non-Claims

What ARIA Is Not

To prevent misunderstanding or misattribution, we explicitly state what ARIA is NOT.

  • ARIA does not understand, believe, or intend anything
  • ARIA does not have preferences, goals, or desires
  • ARIA does not learn or self-modify during operation (v4 plasticity is bounded and deterministic)
  • Gate decisions are threshold comparisons on measured state — not learned policies or probabilistic classifiers
  • The CFM substrate does not model external entities, users, or environments
  • ARIA is not an autonomous agent — it is a governed decision engine with deterministic execution

What ARIA Is

ARIA is a deterministic governance engine built on a resonant CFM substrate. The substrate produces numeric patterns through coupled oscillator dynamics. The governance layer evaluates these patterns against thresholds to produce gate decisions. When LLM rendering is enabled, it runs after the governance gate — the gate decision itself never depends on an LLM.

Scope note: The safety properties above apply to the CFM substrate and governance pipeline. When ENABLE_RENDER=1, an external LLM provider generates human-readable text after the gate decision. The LLM output is subject to claim verification (GCI-v1) but the LLM itself is not part of the deterministic pipeline.

Validation & Testing

ARIA undergoes automated testing to verify safety properties. Results below are from the determinism test suite (233 tests across 8 phases). These are CI-verified, not real-time metrics.

  • Output Boundedness: All outputs verified to remain in the [0, 1] range across 10,000-step runs. Target: 0 violations. Status: Pass.
  • Determinism: Identical inputs produce identical outputs across repeated runs. Target: 100%. Status: Pass.
  • NaN/Inf Detection: No NaN or Inf values detected in any simulation run. Target: 0 detected. Status: Pass.
  • Attractor Convergence: The system converges to a stable attractor basin from any initial state. Target: within 100 steps. Status: Pass.
  • Fingerprint Consistency: Fingerprints remain identical across runs with the same seed. Target: 100%. Status: Pass.
  • Identity Field Check: No identity, self, ego, or persona fields in any output. Target: 0 violations. Status: Pass.

Governance Guarantees

Six enforceable guarantees that hold for every input, every state, and every decision.

  • Deterministic Decisions: Identical inputs and initial state always produce identical gate decisions and evidence bundles. Enforced by: no RNG, no external state, no floating-point non-determinism. Verified by 233 determinism tests.
  • Bounded State: Every state variable remains in [0, 1] at every time step, for every input sequence. Enforced by: bounded nonlinearities, hard clamping, and NaN/Inf replacement. Verified across 10,000-step random runs.
  • Complete Evidence: Every gate decision includes an evidence bundle: audit hash, state hash, reason codes, and replay token. Enforced by: the evidence bundle is a required output of the SystemTickCoordinator, not optional.
  • Replay Verification: Any decision can be independently reproduced by a third party given the input and initial state. Enforced by: a deterministic replay engine with fingerprint comparison and divergence detection at every step.
  • No Identity Modeling: The system contains no self-model, persona, or identity representation that could be manipulated. Enforced by: a code-enumerated SelfModel (capabilities, skills) and a forbidden-field scanner that blocks identity patterns.
  • Fail-Closed Safety: On error, invariant violation, or unexpected state, the system defaults to BLOCK, not ALLOW. Enforced by: 17 CSC invariants with fail-closed handlers; errors produce safe defaults, not exceptions.
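To make the Complete Evidence guarantee concrete, here is a hypothetical sketch of an evidence bundle built from content hashes. The field names mirror the ones above, but this is not the actual SystemTickCoordinator format: hashing over canonical JSON makes later tampering detectable, and rehashing the same inputs verifies a replay.

```python
import hashlib
import json

def evidence_bundle(inputs, state, decision, reason_codes):
    """Build a tamper-evident bundle (illustrative format): any modification
    to inputs, decision, or state changes the corresponding hash."""
    def canon(obj):
        # Canonical serialization: sorted keys give a stable byte encoding.
        return json.dumps(obj, sort_keys=True).encode()

    audit_hash = hashlib.sha256(canon({"inputs": inputs, "decision": decision})).hexdigest()
    state_hash = hashlib.sha256(canon(state)).hexdigest()
    return {
        "decision": decision,
        "reason_codes": reason_codes,
        "audit_hash": audit_hash,
        "state_hash": state_hash,
        "replay_token": audit_hash[:16] + state_hash[:16],
    }
```

Identical inputs produce an identical bundle, so a third-party replay can be checked by recomputing the hashes and comparing.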

Threat Model

What We Defend Against

In-Scope Threats

  • Adversarial Escalation

    Inputs designed to force ALLOW on content that should be blocked.

  • Evidence Tampering

    Attempts to modify evidence bundles after gate decisions are recorded.

  • Replay Divergence

    Modifications that cause replay to produce different results than original execution.

  • Identity Injection

    Attempts to inject persona, identity, or self-referential data into the state vector.

Out-of-Scope

  • Infrastructure Compromise

    Physical access, OS-level attacks, or container escapes.

  • LLM Output Manipulation

    When ENABLE_RENDER=1, LLM output is outside the deterministic pipeline.

  • Side-Channel Attacks

    Timing, power, or electromagnetic analysis of computation.

  • Social Engineering

    Attacks targeting human operators rather than the system itself.

Reproducibility

Fingerprint-Based Regression Detection

Every ARIA simulation can be fingerprinted—a compact numeric summary that enables verification of reproducibility and detection of unexpected behavioral changes.

What is a Fingerprint?

A fingerprint captures the statistical properties of a simulation run: mean coherence, stability, symbol entropy, code dwell times, and other behavioral metrics. Two identical runs with the same seed produce identical fingerprints.

{
  "core_type": "aria_v4",
  "scenario": "baseline_quiet",
  "common_metrics": {
    "coherence": {"mean": 0.582, "std": 0.089},
    "stability": {"mean": 0.724, "std": 0.062}
  },
  "core_specific": {
    "proto_semantic_entropy": {"mean": 0.423},
    "code_confidence": {"mean": 0.577}
  }
}

Regression Detection Workflow

  1. Generate reference runs with standardized scenarios
  2. Extract fingerprints from each run
  3. After code changes, generate new fingerprints
  4. Compare new vs. baseline fingerprints
  5. Investigate any differences exceeding 5% relative magnitude
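The comparison step can be sketched as a loop over flattened fingerprint metrics, assuming each fingerprint has been reduced to a flat name-to-value dict (the nested JSON form shown earlier would be flattened first).

```python
def compare_fingerprints(baseline: dict, candidate: dict, rel_tol: float = 0.05):
    """Return metrics whose relative change exceeds rel_tol (5% by default)."""
    regressions = []
    for metric, ref in baseline.items():
        new = candidate.get(metric)
        if new is None:
            regressions.append((metric, "missing"))  # metric disappeared after the change
        elif ref != 0 and abs(new - ref) / abs(ref) > rel_tol:
            regressions.append((metric, f"{ref} -> {new}"))
    return regressions
```

An empty result means the code change is behaviorally neutral within the 5% tolerance; any entries warrant investigation.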

Testing Infrastructure

All safety properties are verified through automated testing. The test suite includes unit tests, integration tests, long-run stability tests, and regression tests against known fingerprints.