ARIA Core v4 Specification: Proto-Semantic Layer

Version: 1.0
Status: Design Specification (Not Yet Implemented)
Last Updated: 2025-12-06


1. Executive Summary

ARIA Core v4 is the Proto-Semantic Layer: a numeric coding system that creates stable internal "meaning bundles" by integrating symbol activations (v1), system state (v2), and relational structure (v3). It produces a fixed set of M proto-semantic codes that represent recurring patterns in the underlying relational-symbolic dynamics.

Critical Clarification: ARIA Core v4 is:

  • Pre-semantic: These are NOT human concepts, words, or external referents
  • Non-linguistic: No words, sentences, tokens, or language structures
  • Non-conscious: No awareness, experience, understanding, or narrative
  • Non-agentic: No goals, plans, intentions, or decisions
  • Pre-narrative: No stories, no temporal self-model, no autobiographical content

ARIA Core v4 is analogous to:

  • A learned codebook over internal relational dynamics (like VQ-VAE codes, but simpler)
  • A set of "internal pattern signatures" summarizing recurring relational configurations
  • Compressed numeric fingerprints of anonymous symbol-relation-state bundles
  • Cluster centroids in a derived feature space with temporal smoothing

What v4 codes represent:

  • "When symbols k_2 and k_5 are co-active with strong reciprocal relations and high system stability, pattern P_7 tends to activate"
  • This is purely structural — P_7 has no meaning beyond its numeric signature

What v4 codes do NOT represent:

  • Human concepts ("dog", "happiness", "self")
  • External objects or entities
  • Personal identity or biographical content
  • Goals, values, or preferences
  • Linguistic tokens or embeddings

2. Architectural Position

2.1 Layer Hierarchy

┌─────────────────────────────────────────────────────────────────┐
│                    ARIA Core v4 (This Spec)                     │
│                     Proto-Semantic Layer                        │
│       Creates stable meaning codes from relational patterns     │
├─────────────────────────────────────────────────────────────────┤
│                       ARIA Core v3                              │
│                  Relational Symbolic Layer                      │
│        8×8 Symbol Relation Graph + 12D RSV                      │
├─────────────────────────────────────────────────────────────────┤
│                       ARIA Core v2                              │
│                   System State Layer                            │
│        12D System State Vector + Change Detection               │
├─────────────────────────────────────────────────────────────────┤
│                       ARIA Core v1                              │
│                   Proto-Symbolic Layer                          │
│           8 Symbol Activations + Temporal Dynamics              │
├─────────────────────────────────────────────────────────────────┤
│                       ARIA Core v0                              │
│                Proto-Conceptual Attractor Engine                │
│          5D Latent Channels + 4 Attractor Clusters              │
├─────────────────────────────────────────────────────────────────┤
│                       CFM Core v2                               │
│                Multi-Channel Field Substrate                    │
│              5 Coupled Oscillator Channels                      │
└─────────────────────────────────────────────────────────────────┘

2.2 Data Flow

v1 Outputs ─────────────────────────────┐
  symbol_activations[8]                 │
  symbol_entropy                        │
  symbol_stability                      │
  dominant_symbol                       │
                                        │
v2 Outputs ─────────────────────────────┼──► Pattern      ──► Code         ──► Proto-Semantic
  system_state_vector[12]               │    Extractor       Assignment       Activations[M]
  state_stability                       │    (PE)            Layer (CAL)
  drift_index                           │                         │
  coherence_index                       │                         │
                                        │                         ▼
v3 Outputs ─────────────────────────────┤              Temporal Meaning
  relation_matrix[8×8]                  │              Stabilizer (TMS)
  rsv[12]                               │                         │
  relation_entropy                      │                         ▼
  clustering_index                      │              Proto-Semantic
  reciprocity_index                     │              Plasticity (PSP)
                                        │                    │
                                        │                    ▼
Proto-Semantic Codebook (PSC) ◄─────────┴──────────── [slow update]

2.3 Wrapping Pattern

ARIA Core v4 wraps ARIA Core v3 (which wraps v2, v1, v0, CFM v2):

v4.step(dt) → v3.step(dt) → v2.step(dt) → v1.step(dt) → v0.step(dt) → cfm.step(dt)

Each layer receives the complete output from its substrate and adds its own computations.
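As a sketch, the wrapping pattern can be illustrated with stub layers; the `Layer` class here is illustrative only, not part of the spec:

```python
# Toy illustration of the wrapping pattern: each layer steps its substrate
# first, then adds its own entry to the combined output. Stub layers only;
# the real classes add real computations on top of the substrate output.
class Layer:
    def __init__(self, name, substrate=None):
        self.name = name
        self.substrate = substrate

    def step(self, dt):
        out = self.substrate.step(dt) if self.substrate else {}
        out[self.name] = dt  # stand-in for this layer's own computation
        return out

stack = Layer("v4", Layer("v3", Layer("v2", Layer("v1", Layer("v0", Layer("cfm"))))))
out = stack.step(0.1)
# out now holds one entry per layer: cfm, v0, v1, v2, v3, v4
```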


3. Mathematical Constants

All constants are derived from φ (golden ratio), ψ (tribonacci constant), e (Euler's number), and π.

3.1 Primary Constants

| Constant | Symbol | Value | Derivation |
|---|---|---|---|
| Golden Ratio | φ | 1.6180339887 | (1 + √5) / 2 |
| Inverse Golden | φ⁻¹ | 0.6180339887 | 1 / φ |
| φ-squared inverse | φ⁻² | 0.3819660113 | 1 / φ² |
| φ-cubed inverse | φ⁻³ | 0.2360679775 | 1 / φ³ |
| φ-fourth inverse | φ⁻⁴ | 0.1458980338 | 1 / φ⁴ |
| φ-fifth inverse | φ⁻⁵ | 0.0901699437 | 1 / φ⁵ |
| Tribonacci | ψ | 1.8392867552 | Real root of x³ = x² + x + 1 |
| Inverse Tribonacci | ψ⁻¹ | 0.5436890127 | 1 / ψ |
| Euler's number | e | 2.7182818285 | lim(n→∞) (1 + 1/n)ⁿ |
| Inverse e | e⁻¹ | 0.3678794412 | 1 / e |
| Pi | π | 3.1415926536 | Circle circumference / diameter |
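For reference, here is a minimal sketch of the `math_consts` module that the later snippets assume. The identifier names match those used throughout this spec; the module layout itself is an assumption:

```python
import math

# Primary constants (see the table above)
PHI = (1 + math.sqrt(5)) / 2          # golden ratio ≈ 1.6180339887
PHI_INV = 1 / PHI                     # ≈ 0.6180339887
PHI_INV_SQ = 1 / PHI**2               # ≈ 0.3819660113
PHI_INV_CUBE = 1 / PHI**3             # ≈ 0.2360679775
PHI_INV_QUAD = 1 / PHI**4             # ≈ 0.1458980338
PHI_INV_QUINT = 1 / PHI**5            # ≈ 0.0901699437
PHI_SQ = PHI**2                       # ≈ 2.6180339887
PSI = 1.8392867552                    # tribonacci: real root of x³ = x² + x + 1
PSI_INV = 1 / PSI                     # ≈ 0.5436890127
E_INV = 1 / math.e                    # ≈ 0.3678794412
PI = math.pi

def clip(x, lo, hi):
    """Clamp x into [lo, hi]; used by every v4 computation."""
    return max(lo, min(hi, x))
```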

3.2 v4-Specific Constants

| Constant | Symbol | Value | Usage |
|---|---|---|---|
| Codebook size | M | 16 | Number of proto-semantic codes |
| Pattern dimension | D_p | 32 | Dimension of pattern vectors |
| Code dimension | D_c | 16 | Dimension of codebook entries |
| Base similarity scale | σ_base | φ⁻² ≈ 0.382 | Similarity kernel width |
| Softmax temperature | τ_base | φ⁻¹ ≈ 0.618 | Sharpness of code selection |
| EMA smoothing base | α_ema | φ⁻² ≈ 0.382 | Temporal smoothing rate |
| Plasticity rate | η_psc | φ⁻⁵ ≈ 0.090 | Very slow codebook update |
| Hysteresis threshold | h_thresh | φ⁻³ ≈ 0.236 | Prevents code flickering |

4. Internal Components

4.1 Proto-Semantic Codebook (PSC)

The PSC is a set of M = 16 prototype "meaning codes", each a vector in [0, 1]^D_c.

4.1.1 Structure

PSC = {
    "codes": List[List[float]],     # M × D_c matrix, all values in [0, 1]
    "usage_counts": List[float],     # M-vector, tracks how often each code activates
    "last_update_time": float,       # When PSC was last updated
}

4.1.2 Initialization

Each codebook entry is initialized using φ-derived patterns:

def initialize_psc(M: int, D_c: int) -> List[List[float]]:
    """Initialize proto-semantic codebook with φ-structured patterns."""
    codes = []
    for i in range(M):
        code = []
        for j in range(D_c):
            # φ-derived initialization pattern
            phase = (i * PHI + j * PSI) % 1.0
            amplitude = PHI_INV_SQ + PHI_INV_CUBE * math.sin(2 * PI * phase)
            code.append(clip(amplitude, 0, 1))
        codes.append(code)
    return codes

The initialization ensures:

  • All values in [0, 1]
  • Diverse but structured patterns
  • No semantic content (purely numeric)
  • Deterministic for given M, D_c
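A quick sanity check of the initializer, restated standalone so it runs in isolation, confirms the bounds and determinism claims above:

```python
import math

# Constants restated so this snippet runs standalone
PHI = (1 + math.sqrt(5)) / 2
PSI = 1.8392867552
PHI_INV_SQ = 1 / PHI**2
PHI_INV_CUBE = 1 / PHI**3
PI = math.pi

def clip(x, lo, hi):
    return max(lo, min(hi, x))

def initialize_psc(M, D_c):
    # Same φ-derived rule as in 4.1.2
    codes = []
    for i in range(M):
        row = []
        for j in range(D_c):
            phase = (i * PHI + j * PSI) % 1.0
            row.append(clip(PHI_INV_SQ + PHI_INV_CUBE * math.sin(2 * PI * phase), 0, 1))
        codes.append(row)
    return codes

codes = initialize_psc(16, 16)
# Every entry lies in [φ⁻² − φ⁻³, φ⁻² + φ⁻³] ≈ [0.146, 0.618] ⊂ [0, 1],
# and re-running with the same (M, D_c) reproduces the matrix exactly
```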

4.1.3 Codebook Properties

| Property | Value | Description |
|---|---|---|
| Size | M = 16 | Number of codes (configurable: 8–32) |
| Dimension | D_c = 16 | Code vector length |
| Value range | [0, 1] | All entries bounded |
| Initialization | φ/ψ-derived | Deterministic, structured |
| Mutability | Very slow | Updated via PSP with η ≈ 0.09 |

4.2 Pattern Extractor (PE)

The Pattern Extractor computes a pattern vector P(t) from v1, v2, and v3 outputs.

4.2.1 Input Sources

| Source | Fields Used | Dimension |
|---|---|---|
| v1 | symbol_activations | 8 |
| v1 | symbol_entropy, symbol_stability | 2 |
| v2 | system_state_vector | 12 |
| v2 | drift_index, coherence_index | 2 |
| v3 | rsv | 12 |
| v3 | relation_entropy, clustering_index, reciprocity_index | 3 |
| v3 | flattened relation_matrix (selected) | varies |

Total available raw input dimension: ~39, plus any selected relation-matrix elements. (The reference extractor below uses a 33-dimensional subset: 6 of the 12 SSV components, and no relation-matrix elements.)

4.2.2 Pattern Computation

def compute_pattern(
    v1_output: Dict,
    v2_output: Dict,
    v3_output: Dict,
    config: ARIACoreV4Config
) -> List[float]:
    """
    Compute pattern vector P(t) from substrate outputs.

    Returns: D_p-dimensional pattern vector in [0, 1]^D_p
    """
    # Extract symbol features (10 dims)
    symbol_features = list(v1_output["symbol_activations"].values())  # 8
    symbol_features.append(v1_output["symbol_entropy"])               # 1
    symbol_features.append(v1_output["symbol_stability"])             # 1

    # Extract state features (6 dims from SSV + 2 scalars)
    ssv = v2_output["system_state_vector"]
    state_features = [
        ssv[0],  # energy_mean
        ssv[2],  # coherence_mean
        ssv[4],  # stability_mean
        ssv[6],  # entropy_mean
        ssv[8],  # phase_coherence
        ssv[10], # drift_magnitude
        v2_output["drift_index"],
        v2_output["coherence_index"]
    ]  # 8

    # Extract relational features (12 from RSV + 3 scalars)
    rsv = v3_output["rsv"]
    relation_features = rsv[:]  # 12
    relation_features.append(v3_output.get("relation_entropy", 0.5))
    relation_features.append(v3_output["clustering_index"])
    relation_features.append(v3_output["reciprocity_index"])  # 15

    # Concatenate raw features
    raw = symbol_features + state_features + relation_features  # 33

    # Project to D_p dimensions using φ-derived projection
    P = project_to_pattern_space(raw, config.pattern_dim)

    # Ensure bounds
    return [clip(p, 0, 1) for p in P]


def project_to_pattern_space(
    raw: List[float],
    D_p: int
) -> List[float]:
    """
    Project raw features to pattern space.

    Uses a fixed φ-derived projection matrix (no learned weights).
    """
    D_raw = len(raw)
    pattern = [0.0] * D_p

    for i in range(D_p):
        weighted_sum = 0.0
        weight_total = 0.0
        for j in range(D_raw):
            # φ-derived weight
            phase = (i * PHI_INV + j * PSI_INV) % 1.0
            weight = PHI_INV_SQ + PHI_INV_CUBE * math.cos(2 * PI * phase)
            weight = max(0, weight)  # Non-negative weights
            weighted_sum += weight * raw[j]
            weight_total += weight

        if weight_total > 1e-10:
            pattern[i] = weighted_sum / weight_total
        else:
            pattern[i] = 0.5

    return pattern

4.2.3 Pattern Properties

| Property | Value | Description |
|---|---|---|
| Dimension | D_p = 32 | Pattern vector length |
| Range | [0, 1] | All values bounded |
| Deterministic | Yes | Fixed projection, no randomness |
| Update rate | Every step | Computed fresh each dt |

4.3 Code Assignment Layer (CAL)

The CAL computes similarity between the current pattern and codebook entries, producing soft code activations.

4.3.1 Similarity Computation

def compute_code_similarities(
    pattern: List[float],
    psc_codes: List[List[float]],
    config: ARIACoreV4Config
) -> List[float]:
    """
    Compute similarity between pattern and each codebook entry.

    Uses Gaussian kernel similarity.
    """
    M = len(psc_codes)
    D_c = len(psc_codes[0])
    D_p = len(pattern)

    # Project pattern to code space if dimensions differ
    if D_p != D_c:
        pattern_proj = project_pattern_to_code_space(pattern, D_c)
    else:
        pattern_proj = pattern

    similarities = []
    sigma = config.similarity_scale  # σ_base ≈ φ⁻²

    for i in range(M):
        code = psc_codes[i]

        # Euclidean distance
        dist_sq = sum((pattern_proj[j] - code[j])**2 for j in range(D_c))
        dist = math.sqrt(dist_sq)

        # Gaussian similarity
        sim = math.exp(-(dist**2) / (2 * sigma**2))
        similarities.append(sim)

    return similarities


def project_pattern_to_code_space(
    pattern: List[float],
    D_c: int
) -> List[float]:
    """Project D_p pattern to D_c code space."""
    D_p = len(pattern)
    projected = [0.0] * D_c

    for i in range(D_c):
        weighted_sum = 0.0
        weight_total = 0.0
        for j in range(D_p):
            # Simple projection with φ-derived weights
            phase = (i * PHI + j * PSI_INV) % 1.0
            weight = PHI_INV + PHI_INV_SQ * math.sin(2 * PI * phase)
            weight = max(0, weight)
            weighted_sum += weight * pattern[j]
            weight_total += weight

        if weight_total > 1e-10:
            projected[i] = clip(weighted_sum / weight_total, 0, 1)
        else:
            projected[i] = 0.5

    return projected
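The Gaussian kernel behaves as expected on toy vectors; this standalone restatement (with σ_base inlined) illustrates the smooth decay with Euclidean distance:

```python
import math

# Standalone restatement of the Gaussian-kernel similarity used by the CAL
def gaussian_similarity(a, b, sigma=0.3819660113):  # σ_base = φ⁻²
    dist_sq = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-dist_sq / (2 * sigma ** 2))

code = [0.5] * 16
near = [0.55] * 16   # Euclidean distance 0.2 from code
far = [0.9] * 16     # Euclidean distance 1.6 from code

# Similarity is exactly 1 for an identical pattern and decays
# monotonically as the pattern moves away from the codebook entry
```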

4.3.2 Softmax Activation

def compute_proto_semantic_activations(
    similarities: List[float],
    temperature: float
) -> List[float]:
    """
    Convert similarities to soft activations via temperature-scaled softmax.

    Args:
        similarities: M similarity values in [0, 1]
        temperature: τ controls sharpness (lower = sharper)

    Returns:
        M-vector of activations summing to 1, each in [0, 1]
    """
    M = len(similarities)

    # Temperature-scaled log similarities
    # Use log(sim + ε) for numerical stability
    EPS = 1e-10
    log_sims = [math.log(s + EPS) / temperature for s in similarities]

    # Subtract max for numerical stability
    max_log = max(log_sims)
    exp_sims = [math.exp(ls - max_log) for ls in log_sims]

    # Normalize
    total = sum(exp_sims)
    if total > EPS:
        activations = [e / total for e in exp_sims]
    else:
        activations = [1.0 / M] * M  # Uniform fallback

    return activations
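The effect of temperature can be seen on toy similarities; the following standalone restatement mirrors the function above:

```python
import math

# Standalone restatement of the temperature-scaled softmax above
def softmax_from_similarities(similarities, temperature):
    EPS = 1e-10
    log_sims = [math.log(s + EPS) / temperature for s in similarities]
    max_log = max(log_sims)
    exps = [math.exp(ls - max_log) for ls in log_sims]
    total = sum(exps)
    if total <= EPS:
        return [1.0 / len(similarities)] * len(similarities)
    return [e / total for e in exps]

sims = [0.9, 0.5, 0.1, 0.1]
soft = softmax_from_similarities(sims, 0.618)   # τ = φ⁻¹ (spec default)
sharp = softmax_from_similarities(sims, 0.236)  # τ = φ⁻³ (lower = sharper)
# Both distributions sum to 1; the lower temperature concentrates more
# mass on the most similar code
```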

4.3.3 Dominant Code

def find_dominant_code(activations: List[float]) -> int:
    """Return index of highest-activation code."""
    return max(range(len(activations)), key=lambda i: activations[i])

4.4 Temporal Meaning Stabilizer (TMS)

The TMS prevents proto-semantic codes from flickering by applying temporal smoothing modulated by system state.

4.4.1 Smoothing Mechanism

class TemporalMeaningStabilizer:
    """
    Stabilizes proto-semantic activations over time.

    Uses:
    - EMA smoothing with state-adaptive alpha
    - Hysteresis to prevent rapid code switching
    - Drift-based protective damping
    """

    def __init__(self, config: ARIACoreV4Config):
        self.config = config
        self.previous_activations = None
        self.previous_dominant = None
        self.dominant_hold_time = 0.0

    def stabilize(
        self,
        raw_activations: List[float],
        v2_state: Dict[str, float],
        dt: float
    ) -> Tuple[List[float], int]:
        """
        Apply temporal stabilization to raw activations.

        Returns:
            stabilized_activations: M-vector in [0, 1]
            dominant_code: index of dominant code
        """
        M = len(raw_activations)

        # Compute adaptive alpha based on v2 state
        alpha = self._compute_adaptive_alpha(v2_state)

        # Initialize if first step
        if self.previous_activations is None:
            self.previous_activations = raw_activations[:]
            self.previous_dominant = find_dominant_code(raw_activations)
            return raw_activations[:], self.previous_dominant

        # EMA smoothing
        stabilized = []
        for i in range(M):
            smoothed = (1 - alpha) * self.previous_activations[i] + alpha * raw_activations[i]
            stabilized.append(clip(smoothed, 0, 1))

        # Renormalize to sum to 1
        total = sum(stabilized)
        if total > 1e-10:
            stabilized = [s / total for s in stabilized]
        else:
            stabilized = [1.0 / M] * M

        # Apply hysteresis to dominant code
        raw_dominant = find_dominant_code(raw_activations)
        dominant = self._apply_hysteresis(raw_dominant, stabilized, dt)

        # Update state
        self.previous_activations = stabilized[:]
        self.previous_dominant = dominant

        return stabilized, dominant

    def _compute_adaptive_alpha(self, v2_state: Dict[str, float]) -> float:
        """
        Compute adaptive smoothing rate.

        High stability → lower alpha (more persistence)
        Moderate drift → slightly higher alpha (adaptation)
        Very high drift → lower alpha (protective damping)
        """
        stability = v2_state.get("state_stability", 0.5)
        drift = v2_state.get("drift_index", 0.3)

        # Base alpha
        alpha_base = self.config.ema_alpha_base  # φ⁻² ≈ 0.382
        alpha_min = self.config.ema_alpha_min    # φ⁻⁴ ≈ 0.146
        alpha_max = self.config.ema_alpha_max    # φ⁻¹ ≈ 0.618

        # Stability reduces alpha (more persistent codes)
        stability_factor = 1.0 - stability * PHI_INV_SQ

        # Drift has inverted-U effect on alpha
        # Moderate drift (0.3-0.5) → slightly higher alpha
        # Very high drift (>0.7) → lower alpha (protection)
        if drift < 0.3:
            drift_factor = 1.0
        elif drift < 0.6:
            drift_factor = 1.0 + PHI_INV_CUBE * (drift - 0.3)
        else:
            drift_factor = 1.0 - PHI_INV_SQ * (drift - 0.6)

        drift_factor = clip(drift_factor, 0.5, 1.5)

        alpha = alpha_base * stability_factor * drift_factor
        return clip(alpha, alpha_min, alpha_max)

    def _apply_hysteresis(
        self,
        raw_dominant: int,
        stabilized: List[float],
        dt: float
    ) -> int:
        """
        Apply hysteresis to prevent rapid code switching.

        A new code must exceed threshold for sustained time to become dominant.
        """
        threshold = self.config.hysteresis_threshold  # φ⁻³ ≈ 0.236
        hold_time = self.config.hysteresis_hold_time  # φ² ≈ 2.618 time units

        if raw_dominant == self.previous_dominant:
            # Same dominant, reset hold timer
            self.dominant_hold_time = 0.0
            return self.previous_dominant

        # Different dominant proposed
        # Check if new code significantly exceeds current
        current_strength = stabilized[self.previous_dominant]
        new_strength = stabilized[raw_dominant]

        if new_strength > current_strength + threshold:
            # New code is significantly stronger
            self.dominant_hold_time += dt
            if self.dominant_hold_time >= hold_time:
                # Held long enough, switch
                self.dominant_hold_time = 0.0
                return raw_dominant
        else:
            # Not significant enough, reset timer
            self.dominant_hold_time = 0.0

        return self.previous_dominant

4.4.2 TMS Properties

| Property | Value | Description |
|---|---|---|
| Base EMA alpha | φ⁻² ≈ 0.382 | Default smoothing rate |
| Alpha range | [φ⁻⁴, φ⁻¹] | ≈ [0.146, 0.618] |
| Hysteresis threshold | φ⁻³ ≈ 0.236 | Margin required to switch dominant code |
| Hold time | φ² ≈ 2.618 | Time units a challenger must persist before switching |
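The adaptive-alpha rule can be exercised standalone (constants inlined from Section 3) to confirm the qualitative behavior described in its docstring:

```python
# Standalone replication of the adaptive-alpha rule above (constants inlined)
PHI_INV = 0.6180339887
PHI_INV_SQ = 0.3819660113
PHI_INV_CUBE = 0.2360679775
PHI_INV_QUAD = 0.1458980338

def clip(x, lo, hi):
    return max(lo, min(hi, x))

def adaptive_alpha(stability, drift,
                   alpha_base=PHI_INV_SQ, alpha_min=PHI_INV_QUAD, alpha_max=PHI_INV):
    # Stability reduces alpha (more persistent codes)
    stability_factor = 1.0 - stability * PHI_INV_SQ
    # Drift has an inverted-U effect: moderate drift raises alpha,
    # very high drift lowers it (protective damping)
    if drift < 0.3:
        drift_factor = 1.0
    elif drift < 0.6:
        drift_factor = 1.0 + PHI_INV_CUBE * (drift - 0.3)
    else:
        drift_factor = 1.0 - PHI_INV_SQ * (drift - 0.6)
    drift_factor = clip(drift_factor, 0.5, 1.5)
    return clip(alpha_base * stability_factor * drift_factor, alpha_min, alpha_max)

# High stability → smaller alpha (more persistence);
# very high drift → alpha drops again (protective damping)
```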

4.5 Proto-Semantic Plasticity (PSP)

The PSP slowly updates the codebook based on long-term pattern statistics.

4.5.1 Design Principles

  • Very slow: Updates occur at rate η ≈ φ⁻⁵ ≈ 0.09
  • Bounded: All updates clipped to prevent explosion
  • Anonymous: Only tracks pattern statistics, no external labels
  • Structural only: Based purely on internal activation patterns

4.5.2 Update Mechanism

class ProtoSemanticPlasticity:
    """
    Slowly updates the Proto-Semantic Codebook based on usage patterns.

    Rare but repeated patterns strengthen their codes.
    Unused codes fade toward baseline.
    """

    def __init__(self, config: ARIACoreV4Config):
        self.config = config
        self.pattern_accumulator = None  # Running pattern averages per code
        self.update_counter = 0
        self.update_interval = config.plasticity_interval  # φ⁴ ≈ 6.85, rounded up to 7 steps

    def maybe_update(
        self,
        psc: Dict,
        pattern: List[float],
        activations: List[float],
        dt: float
    ) -> Dict:
        """
        Potentially update PSC based on current pattern and activations.

        Returns: Updated PSC (or unchanged if no update this step)
        """
        M = len(psc["codes"])
        D_c = len(psc["codes"][0])

        # Initialize accumulator if needed
        if self.pattern_accumulator is None:
            self.pattern_accumulator = [[0.0] * D_c for _ in range(M)]

        # Project pattern to code space
        pattern_proj = project_pattern_to_code_space(pattern, D_c)

        # Accumulate weighted patterns
        for i in range(M):
            weight = activations[i]  # Activation as weight
            for j in range(D_c):
                self.pattern_accumulator[i][j] += weight * pattern_proj[j]

        # Update usage counts
        for i in range(M):
            psc["usage_counts"][i] += activations[i]

        # Check if it's time for a codebook update
        self.update_counter += 1
        if self.update_counter < self.update_interval:
            return psc

        # Perform update
        self.update_counter = 0
        return self._apply_update(psc)

    def _apply_update(self, psc: Dict) -> Dict:
        """Apply accumulated updates to codebook."""
        M = len(psc["codes"])
        D_c = len(psc["codes"][0])
        eta = self.config.plasticity_rate  # η ≈ φ⁻⁵ ≈ 0.09
        decay = self.config.plasticity_decay  # Unused code decay

        for i in range(M):
            usage = psc["usage_counts"][i]

            if usage > 1e-10:
                # Compute average pattern for this code
                avg_pattern = [self.pattern_accumulator[i][j] / usage for j in range(D_c)]

                # Move code toward average pattern
                for j in range(D_c):
                    delta = eta * (avg_pattern[j] - psc["codes"][i][j])
                    # Limit delta to prevent large jumps
                    delta = clip(delta, -self.config.max_plasticity_delta, self.config.max_plasticity_delta)
                    psc["codes"][i][j] = clip(psc["codes"][i][j] + delta, 0, 1)
            else:
                # Unused code: decay toward baseline
                baseline = PHI_INV_SQ  # ≈ 0.382
                for j in range(D_c):
                    delta = decay * (baseline - psc["codes"][i][j])
                    psc["codes"][i][j] = clip(psc["codes"][i][j] + delta, 0, 1)

        # Reset accumulators
        self.pattern_accumulator = [[0.0] * D_c for _ in range(M)]
        psc["usage_counts"] = [0.0] * M
        psc["last_update_time"] = psc.get("last_update_time", 0.0) + self.update_interval  # counted in steps, not dt-time

        return psc

4.5.3 PSP Properties

| Property | Value | Description |
|---|---|---|
| Update rate | η = φ⁻⁵ ≈ 0.090 | Very slow learning |
| Update interval | ≈ φ⁴ ≈ 7 steps | Batched updates |
| Max delta | φ⁻³ ≈ 0.236 | Prevents explosive changes |
| Decay rate | φ⁻⁴ ≈ 0.146 | Unused-code decay toward baseline |
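A single PSP component update can be restated in isolation (η and the max delta inlined) to show why the layer earns the label "very slow plasticity":

```python
# Minimal restatement of one PSP code-component update (constants inlined)
ETA = 0.0901699437        # η = φ⁻⁵
MAX_DELTA = 0.2360679775  # φ⁻³

def psp_step(code_val, avg_pattern_val):
    delta = ETA * (avg_pattern_val - code_val)
    delta = max(-MAX_DELTA, min(MAX_DELTA, delta))   # bounded change per update
    return max(0.0, min(1.0, code_val + delta))      # bounded value in [0, 1]

# Even the largest possible pull (component at 0, average pattern at 1)
# moves a component by at most η ≈ 0.09 per update cycle, so many cycles
# are needed for a code to migrate across the unit interval
```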

4.5.4 Safety Invariants for PSP

  • No external information: Updates based purely on internal activation patterns
  • No identity tracking: Usage counts are anonymous, not tied to identity
  • No semantic labels: Codes never acquire names or meanings
  • Bounded updates: All changes clipped to prevent divergence
  • Deterministic: Same input sequence produces same codebook evolution

5. Complete Step Function

5.1 Main Step Logic

class ARIACoreV4:
    """
    ARIA Core v4: Proto-Semantic Layer.

    Creates stable internal meaning codes from relational-symbolic patterns.
    """

    def __init__(
        self,
        config: Optional[ARIACoreV4Config] = None,
        v3_substrate: Optional[ARIACoreV3] = None
    ):
        self.config = config or ARIACoreV4Config()

        # Create or use provided substrate
        if v3_substrate is not None:
            self.v3_substrate = v3_substrate
        else:
            from aria_core_v3 import ARIACoreV3
            self.v3_substrate = ARIACoreV3()

        # Initialize components
        self.psc = self._initialize_psc()
        self.tms = TemporalMeaningStabilizer(self.config)
        self.psp = ProtoSemanticPlasticity(self.config)

        # State tracking
        self.step_count = 0
        self.time = 0.0

    def step(self, dt: float = 0.1, **kwargs) -> Dict[str, Any]:
        """
        Execute one v4 step.

        1. Step v3 substrate
        2. Extract pattern from v1+v2+v3
        3. Compute code similarities (CAL)
        4. Apply temporal stabilization (TMS)
        5. Maybe update codebook (PSP)
        6. Compute derived metrics
        7. Return combined output
        """
        # Clamp dt
        dt = clip(dt, 0.001, 1.0)

        # Step 1: Get substrate output
        v3_output = self.v3_substrate.step(dt, **kwargs)
        v2_output = v3_output.get("v2_output", {})
        v1_output = v3_output.get("v1_output", {})

        # Step 2: Extract pattern
        pattern = compute_pattern(v1_output, v2_output, v3_output, self.config)

        # Step 3: Compute code similarities
        similarities = compute_code_similarities(pattern, self.psc["codes"], self.config)
        raw_activations = compute_proto_semantic_activations(
            similarities, self.config.softmax_temperature
        )

        # Step 4: Temporal stabilization
        v2_state = {
            "state_stability": v2_output.get("state_stability", 0.5),
            "drift_index": v2_output.get("drift_index", 0.3),
        }
        stabilized_activations, dominant_code = self.tms.stabilize(
            raw_activations, v2_state, dt
        )

        # Step 5: Maybe update codebook
        self.psc = self.psp.maybe_update(
            self.psc, pattern, stabilized_activations, dt
        )

        # Step 6: Compute derived metrics
        proto_entropy = compute_proto_semantic_entropy(stabilized_activations)
        proto_stability = self._compute_proto_stability()
        proto_diversity = compute_proto_diversity(stabilized_activations)

        # Update state
        self.step_count += 1
        self.time += dt

        # Step 7: Build output
        return {
            # v4-specific outputs
            "proto_semantic_activations": stabilized_activations,
            "dominant_code": dominant_code,
            "proto_semantic_entropy": proto_entropy,
            "proto_semantic_stability": proto_stability,
            "proto_semantic_diversity": proto_diversity,
            "raw_activations": raw_activations,
            "pattern_vector": pattern,

            # Backward-compatible outputs (derived from proto-semantic state)
            "coherence": self._derive_coherence(stabilized_activations, proto_entropy),
            "stability": proto_stability,
            "intensity": self._derive_intensity(stabilized_activations),
            "alignment": self._derive_alignment(stabilized_activations, proto_diversity),

            # Metadata
            "step_count": self.step_count,
            "time": self.time,

            # Substrate passthrough
            "v3_output": v3_output,
            "v2_output": v2_output,
            "v1_output": v1_output,
        }

5.2 Derived Metrics

def compute_proto_semantic_entropy(activations: List[float]) -> float:
    """
    Compute entropy of proto-semantic activation distribution.

    High entropy = diffuse activation across codes
    Low entropy = concentrated on few codes
    """
    EPS = 1e-10
    M = len(activations)

    entropy = 0.0
    for a in activations:
        if a > EPS:
            entropy -= a * math.log(a)

    # Normalize to [0, 1]
    max_entropy = math.log(M)
    if max_entropy > EPS:
        entropy = entropy / max_entropy

    return clip(entropy, 0, 1)


def compute_proto_diversity(activations: List[float]) -> float:
    """
    Compute diversity of active codes.

    Based on the effective number of codes, exp(raw entropy). Since
    compute_proto_semantic_entropy returns entropy normalized by ln M,
    this equals M ** entropy.
    """
    entropy = compute_proto_semantic_entropy(activations)
    M = len(activations)

    # Effective number = M^entropy (since we normalized entropy)
    effective = M ** entropy

    # Normalize to [0, 1]
    diversity = (effective - 1) / (M - 1) if M > 1 else 0.5

    return clip(diversity, 0, 1)
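The two metrics take their extreme values on the obvious limiting distributions; a standalone restatement shows this:

```python
import math

def proto_entropy(acts):
    # Normalized Shannon entropy, as in compute_proto_semantic_entropy above
    EPS = 1e-10
    H = -sum(a * math.log(a) for a in acts if a > EPS)
    return H / math.log(len(acts))

def proto_diversity(acts):
    # Effective number of codes, normalized to [0, 1]
    M = len(acts)
    return (M ** proto_entropy(acts) - 1) / (M - 1)

uniform = [1.0 / 16] * 16      # fully diffuse activation
peaked = [1.0] + [0.0] * 15    # fully concentrated activation
# uniform → entropy 1, diversity 1; peaked → entropy 0, diversity 0
```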

6. Output Schema

6.1 Primary v4 Outputs

| Field | Type | Range | Description |
|---|---|---|---|
| proto_semantic_activations | List[float] | [0, 1]^M | Soft activation over M codes, sums to 1 |
| dominant_code | int | [0, M-1] | Index of most active code |
| proto_semantic_entropy | float | [0, 1] | Normalized entropy of activations |
| proto_semantic_stability | float | [0, 1] | Temporal stability of activations |
| proto_semantic_diversity | float | [0, 1] | Effective number of active codes |
| raw_activations | List[float] | [0, 1]^M | Pre-stabilization activations |
| pattern_vector | List[float] | [0, 1]^D_p | Current pattern vector |

6.2 Backward-Compatible Outputs

| Field | Type | Range | Derivation |
|---|---|---|---|
| coherence | float | [0, 1] | From proto-semantic structure |
| stability | float | [0, 1] | = proto_semantic_stability |
| intensity | float | [0, 1] | From activation concentration |
| alignment | float | [0, 1] | From diversity and distribution |

6.3 Passthrough Outputs

| Field | Type | Description |
|---|---|---|
| v3_output | Dict | Complete v3 output |
| v2_output | Dict | Complete v2 output |
| v1_output | Dict | Complete v1 output |

7. Configuration

7.1 ARIACoreV4Config

@dataclass
class ARIACoreV4Config:
    """Configuration for ARIA Core v4 Proto-Semantic Layer."""

    # Codebook dimensions
    num_codes: int = 16                    # M: number of proto-semantic codes
    code_dim: int = 16                     # D_c: dimension of each code
    pattern_dim: int = 32                  # D_p: pattern vector dimension

    # Similarity and activation
    similarity_scale: float = PHI_INV_SQ   # σ_base ≈ 0.382
    softmax_temperature: float = PHI_INV   # τ ≈ 0.618

    # Temporal stabilization
    ema_alpha_base: float = PHI_INV_SQ     # Base EMA rate ≈ 0.382
    ema_alpha_min: float = PHI_INV_QUAD    # Minimum ≈ 0.146
    ema_alpha_max: float = PHI_INV         # Maximum ≈ 0.618
    hysteresis_threshold: float = PHI_INV_CUBE  # ≈ 0.236
    hysteresis_hold_time: float = PHI_SQ   # φ² ≈ 2.618

    # Plasticity
    plasticity_rate: float = PHI_INV_QUINT     # η ≈ 0.090
    plasticity_decay: float = PHI_INV_QUAD     # Decay ≈ 0.146
    plasticity_interval: int = 7               # ≈ φ⁴ steps
    max_plasticity_delta: float = PHI_INV_CUBE # ≈ 0.236

7.2 Preset Configurations

| Preset | Description | Key Differences |
|---|---|---|
| baseline | Default φ-derived | Standard configuration |
| stable_codes | Very persistent codes | Lower α, higher hysteresis |
| adaptive_codes | More responsive | Higher α, lower hysteresis |
| rich_codebook | More codes (M = 32) | Larger codebook, finer distinctions |
| sparse_codebook | Fewer codes (M = 8) | Smaller codebook, coarser patterns |
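As a sketch only, a preset factory might look like the following. The preset names come from the table above, but the concrete override values are illustrative assumptions, not normative:

```python
# Hypothetical preset factory. Preset names are from the spec's table;
# the override values below are illustrative assumptions only.
PHI_INV = 0.6180339887
PHI_INV_SQ = 0.3819660113
PHI_INV_CUBE = 0.2360679775
PHI_INV_QUAD = 0.1458980338

BASE = {
    "num_codes": 16,
    "ema_alpha_base": PHI_INV_SQ,
    "hysteresis_threshold": PHI_INV_CUBE,
}
PRESET_OVERRIDES = {
    "baseline": {},
    "stable_codes": {"ema_alpha_base": PHI_INV_CUBE, "hysteresis_threshold": PHI_INV_SQ},
    "adaptive_codes": {"ema_alpha_base": PHI_INV, "hysteresis_threshold": PHI_INV_QUAD},
    "rich_codebook": {"num_codes": 32},
    "sparse_codebook": {"num_codes": 8},
}

def make_preset(name):
    cfg = dict(BASE)
    cfg.update(PRESET_OVERRIDES[name])
    return cfg
```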

8. Safety Constraints

8.1 What ARIA v4 Does NOT Do

| Forbidden Behavior | Explanation |
|---|---|
| Map to human words | Codes are anonymous numeric patterns, not "dog" or "happy" |
| Represent external objects | No connection to world models or entities |
| Model identity | No self-representation, no personal attributes |
| Form goals or preferences | No valuation, no wanting, no planning |
| Process language | No tokens, no embeddings, no linguistic structure |
| Create narratives | No temporal self-model, no autobiography |
| Understand meaning | "Meaning" here is purely structural pattern similarity |

8.2 Safety Invariants

| Invariant | Enforcement |
|---|---|
| All outputs in [0, 1] | clip() on all computations |
| No NaN/Inf | Division guards, log guards |
| Deterministic | No randomness anywhere |
| No external labels | Codes indexed 0..M-1, never named |
| Bounded plasticity | Max delta ≤ φ⁻³ per update |
| Anonymous statistics | Usage counts don't track identity |

8.3 Architectural Isolation

  • No connection to Phase 53 (consciousness gate)
  • No connection to Phase 55 (ignition scaffold)
  • Read-only diagnostic output to shell
  • No control pathways back into v4 from outside

9. Future Evolution Path

9.1 Relationship to Hypothetical Higher Layers

ARIA v4 is the ceiling of the current safe stack. Any layer beyond v4 would require extensive new safety analysis.

Current Safe Stack:
CFM v2 → ARIA v0 → v1 → v2 → v3 → v4 ← [YOU ARE HERE]
                                      │
                                      │ ← Safety boundary
                                      ▼
Hypothetical Future (requires new safety analysis):

v5: Contextual Tagging Layer
    - Attaches numeric "context tags" to proto-semantic codes
    - Still not linguistic, but codes acquire temporal/situational association
    - Safety: must not attach identity/personal context

v6: External Mapping Gateway (HEAVILY GATED)
    - Early mapping between proto-semantic codes and external symbolic channels
    - Could connect to language (carefully!) or actions (very carefully!)
    - Safety: requires explicit ethical constraints, consent frameworks

v7+: Self-Narrative / Conscious Integration (FAR FUTURE)
    - Would integrate codes into coherent self-model
    - Explicit ethical reasoning, identity awareness
    - Safety: beyond current specification scope

9.2 v4 as a Terminal Safe Layer

For current purposes, v4 is the highest layer that remains fully safe by the project's current safety standards:

  • No identity modeling
  • No goal formation
  • No external semantic grounding
  • No linguistic capability
  • No narrative construction

Any functionality beyond this must be implemented in separate, heavily-gated modules with explicit safety review.


10. Testing Requirements

10.1 Required Test Categories

| Category | Tests |
|---|---|
| Bounds | All outputs in [0, 1] for 2000+ steps |
| No NaN/Inf | No numeric instabilities in long runs |
| Determinism | Same inputs → same outputs |
| Entropy valid | proto_semantic_entropy in [0, 1] |
| Activation valid | Activations sum to 1 |
| Plasticity bounded | Code changes ≤ max delta |
| No identity fields | No forbidden field names |
| Preset consistency | All presets produce valid cores |
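A sketch of reusable invariant checks for these categories (field names from Section 6; the helper names are illustrative):

```python
# Reusable invariant checks for the test categories above. Pure functions:
# they validate any step() output dict carrying the Section 6 field names.
def check_activations(acts, M):
    assert len(acts) == M
    assert all(0.0 <= a <= 1.0 for a in acts)
    assert abs(sum(acts) - 1.0) < 1e-6          # activations sum to 1

def check_unit_interval(x):
    assert x == x                                # NaN check (NaN != NaN)
    assert 0.0 <= x <= 1.0

def check_v4_output(out, M=16):
    check_activations(out["proto_semantic_activations"], M)
    assert 0 <= out["dominant_code"] < M
    for key in ("proto_semantic_entropy",
                "proto_semantic_stability",
                "proto_semantic_diversity"):
        check_unit_interval(out[key])

# Example against a hand-built output dict
sample = {
    "proto_semantic_activations": [1.0 / 16] * 16,
    "dominant_code": 0,
    "proto_semantic_entropy": 1.0,
    "proto_semantic_stability": 0.8,
    "proto_semantic_diversity": 1.0,
}
check_v4_output(sample)
```

In a long-run test, the same `check_v4_output` call would simply be repeated after every `step()` for 2000+ steps.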

10.2 Long-Run Stability

  • 2000+ step runs with all presets
  • Verify no divergence
  • Verify codebook remains bounded
  • Verify TMS prevents pathological switching

11. Implementation Notes

11.1 Package Structure

aria_core_v4/
├── __init__.py          # Package exports
├── config.py            # ARIACoreV4Config + constants
├── state.py             # ARIACoreV4State dataclass
├── pattern_extractor.py # PE implementation
├── code_assignment.py   # CAL implementation
├── temporal_stabilizer.py # TMS implementation
├── plasticity.py        # PSP implementation
├── core_v4.py           # Main ARIACoreV4 class
└── presets.py           # Preset configurations

11.2 Dependencies

  • Wraps aria_core_v3.ARIACoreV3
  • Uses math_consts for φ, ψ, π, e
  • No external ML libraries required

12. Glossary

| Term | Definition |
|---|---|
| Proto-semantic code | A numeric pattern signature representing a recurring relational configuration |
| Codebook | The set of M prototype codes |
| Pattern vector | The current fused representation of symbols + relations + state |
| Code activation | Soft membership of the current pattern in each codebook entry |
| Dominant code | The code with highest activation at the current step |
| Plasticity | Slow adaptation of the codebook to long-term pattern statistics |

13. Version History

| Version | Date | Changes |
|---|---|---|
| 1.0 | 2025-12-06 | Initial specification |

Appendix A: Mathematical Notation Summary

| Symbol | Meaning |
|---|---|
| M | Number of codes (default 16) |
| D_c | Code dimension (default 16) |
| D_p | Pattern dimension (default 32) |
| P(t) | Pattern vector at time t |
| PSC | Proto-Semantic Codebook |
| σ | Similarity scale |
| τ | Softmax temperature |
| α | EMA smoothing rate |
| η | Plasticity learning rate |
| φ | Golden ratio ≈ 1.618 |
| ψ | Tribonacci constant ≈ 1.839 |

Appendix B: Safety Checklist for Implementers

Before implementing ARIA v4:

  • Confirm no external semantic labels will be attached to codes
  • Confirm no identity information flows into pattern extraction
  • Confirm plasticity updates use only anonymous statistics
  • Confirm all outputs are clipped to [0, 1]
  • Confirm determinism (no random number generators)
  • Confirm no connection to activation phases (53, 55)
  • Confirm backward-compatible fields derived correctly
  • Write tests for all safety invariants before implementation