ARIA Core v4 Specification: Proto-Semantic Layer
Proto-semantic layer: pattern extraction, 16-code codebook, temporal stabilization, very slow plasticity.
Version: 1.0 Status: Design Specification (Not Yet Implemented) Last Updated: 2025-12-06
1. Executive Summary
ARIA Core v4 is the Proto-Semantic Layer: a numeric coding system that creates stable internal "meaning bundles" by integrating symbol activations (v1), system state (v2), and relational structure (v3). It produces a fixed set of M proto-semantic codes representing recurring patterns in the underlying relational-symbolic dynamics.
Critical Clarification: ARIA Core v4 is:
- Pre-semantic: These are NOT human concepts, words, or external referents
- Non-linguistic: No words, sentences, tokens, or language structures
- Non-conscious: No awareness, experience, understanding, or narrative
- Non-agentic: No goals, plans, intentions, or decisions
- Pre-narrative: No stories, no temporal self-model, no autobiographical content
ARIA Core v4 is analogous to:
- A learned codebook over internal relational dynamics (like VQ-VAE codes, but simpler)
- A set of "internal pattern signatures" summarizing recurring relational configurations
- Compressed numeric fingerprints of anonymous symbol-relation-state bundles
- Cluster centroids in a derived feature space with temporal smoothing
What v4 codes represent:
- "When symbols k_2 and k_5 are co-active with strong reciprocal relations and high system stability, pattern P_7 tends to activate"
- This is purely structural — P_7 has no meaning beyond its numeric signature
What v4 codes do NOT represent:
- Human concepts ("dog", "happiness", "self")
- External objects or entities
- Personal identity or biographical content
- Goals, values, or preferences
- Linguistic tokens or embeddings
2. Architectural Position
2.1 Layer Hierarchy
┌─────────────────────────────────────────────────────────────────┐
│ ARIA Core v4 (This Spec) │
│ Proto-Semantic Layer │
│ Creates stable meaning codes from relational patterns │
├─────────────────────────────────────────────────────────────────┤
│ ARIA Core v3 │
│ Relational Symbolic Layer │
│ 8×8 Symbol Relation Graph + 12D RSV │
├─────────────────────────────────────────────────────────────────┤
│ ARIA Core v2 │
│ System State Layer │
│ 12D System State Vector + Change Detection │
├─────────────────────────────────────────────────────────────────┤
│ ARIA Core v1 │
│ Proto-Symbolic Layer │
│ 8 Symbol Activations + Temporal Dynamics │
├─────────────────────────────────────────────────────────────────┤
│ ARIA Core v0 │
│ Proto-Conceptual Attractor Engine │
│ 5D Latent Channels + 4 Attractor Clusters │
├─────────────────────────────────────────────────────────────────┤
│ CFM Core v2 │
│ Multi-Channel Field Substrate │
│ 5 Coupled Oscillator Channels │
└─────────────────────────────────────────────────────────────────┘
2.2 Data Flow
v1 Outputs ─────────────────────────────┐
symbol_activations[8] │
symbol_entropy │
symbol_stability │
dominant_symbol │
│
v2 Outputs ─────────────────────────────┼──► Pattern ──► Code ──► Proto-Semantic
system_state_vector[12] │ Extractor Assignment Activations[M]
state_stability │ (PE) Layer (CAL)
drift_index │ │
coherence_index │ │
│ ▼
v3 Outputs ─────────────────────────────┤ Temporal Meaning
relation_matrix[8×8] │ Stabilizer (TMS)
rsv[12] │ │
relation_entropy │ ▼
clustering_index │ Proto-Semantic
reciprocity_index │ Plasticity (PSP)
│ │
│ ▼
Proto-Semantic Codebook (PSC) ◄─────────┴──────────── [slow update]
2.3 Wrapping Pattern
ARIA Core v4 wraps ARIA Core v3 (which wraps v2, v1, v0, CFM v2):
v4.step(dt) → v3.step(dt) → v2.step(dt) → v1.step(dt) → v0.step(dt) → cfm.step(dt)
Each layer receives the complete output from its substrate and adds its own computations.
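The wrapping contract can be sketched with stand-in layers; the class and field names below are illustrative stand-ins, not the real ARIACoreV3 API:

```python
from typing import Any, Dict, Optional

class StubLayer:
    """Illustrative stand-in for one ARIA layer wrapping a substrate."""
    def __init__(self, name: str, substrate: Optional["StubLayer"] = None):
        self.name = name
        self.substrate = substrate

    def step(self, dt: float) -> Dict[str, Any]:
        out: Dict[str, Any] = {"layer": self.name, "dt": dt}
        if self.substrate is not None:
            # Step the substrate first, then attach its complete output
            out[self.substrate.name + "_output"] = self.substrate.step(dt)
        return out

# cfm → v0 → v1: each outer layer carries the complete inner output
stack = StubLayer("v1", StubLayer("v0", StubLayer("cfm")))
out = stack.step(0.1)
assert out["v0_output"]["cfm_output"]["layer"] == "cfm"
```

The innermost substrate always runs first, so every layer computes over a fully updated stack.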
3. Mathematical Constants
All constants are derived from φ (golden ratio), ψ (tribonacci constant), e (Euler's number), and π.
3.1 Primary Constants
| Constant | Symbol | Value | Derivation |
|---|---|---|---|
| Golden Ratio | φ | 1.6180339887 | (1 + √5) / 2 |
| Inverse Golden | φ⁻¹ | 0.6180339887 | 1 / φ |
| φ-squared inverse | φ⁻² | 0.3819660113 | 1 / φ² |
| φ-cubed inverse | φ⁻³ | 0.2360679775 | 1 / φ³ |
| φ-fourth inverse | φ⁻⁴ | 0.1458980338 | 1 / φ⁴ |
| φ-fifth inverse | φ⁻⁵ | 0.0901699437 | 1 / φ⁵ |
| Tribonacci | ψ | 1.8392867552 | Real root of x³ = x² + x + 1 |
| Inverse Tribonacci | ψ⁻¹ | 0.5436890127 | 1 / ψ |
| Euler's number | e | 2.7182818285 | lim (1 + 1/n)ⁿ as n → ∞ |
| Inverse e | e⁻¹ | 0.3678794412 | 1 / e |
| Pi | π | 3.1415926536 | Circle ratio |
3.2 v4-Specific Constants
| Constant | Symbol | Value | Usage |
|---|---|---|---|
| Codebook size | M | 16 | Number of proto-semantic codes |
| Pattern dimension | D_p | 32 | Dimension of pattern vectors |
| Code dimension | D_c | 16 | Dimension of codebook entries |
| Base similarity scale | σ_base | φ⁻² ≈ 0.382 | Similarity kernel width |
| Softmax temperature | τ_base | φ⁻¹ ≈ 0.618 | Sharpness of code selection |
| EMA smoothing base | α_ema | φ⁻² ≈ 0.382 | Temporal smoothing rate |
| Plasticity rate | η_psc | φ⁻⁵ ≈ 0.090 | Very slow codebook update |
| Hysteresis threshold | h_thresh | φ⁻³ ≈ 0.236 | Prevents code flickering |
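For reference, a sketch of the constants module the code samples assume (§11.2 names it math_consts; the exact file contents are an assumption, the values follow the tables above):

```python
import math

PHI = (1 + math.sqrt(5)) / 2          # φ ≈ 1.6180339887
PHI_INV = 1 / PHI                     # φ⁻¹ ≈ 0.618
PHI_INV_SQ = PHI ** -2                # φ⁻² ≈ 0.382
PHI_INV_CUBE = PHI ** -3              # φ⁻³ ≈ 0.236
PHI_INV_QUAD = PHI ** -4              # φ⁻⁴ ≈ 0.146
PHI_INV_QUINT = PHI ** -5             # φ⁻⁵ ≈ 0.090
PHI_SQ = PHI ** 2                     # φ² ≈ 2.618

def _tribonacci() -> float:
    """Real root of x³ = x² + x + 1, located by bisection in [1, 2]."""
    lo, hi = 1.0, 2.0
    for _ in range(80):
        mid = (lo + hi) / 2
        if mid ** 3 < mid ** 2 + mid + 1:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

PSI = _tribonacci()                   # ψ ≈ 1.8392867552
PSI_INV = 1 / PSI                     # ψ⁻¹ ≈ 0.5436890127
PI = math.pi

def clip(x: float, lo: float, hi: float) -> float:
    """Clamp x into [lo, hi]; enforces the [0, 1] bound invariants."""
    return max(lo, min(hi, x))
```

All later snippets in this spec refer to these identifiers.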
4. Internal Components
4.1 Proto-Semantic Codebook (PSC)
The PSC is a set of M = 16 prototype "meaning codes", each a vector in [0, 1]^D_c.
4.1.1 Structure
PSC = {
"codes": List[List[float]], # M × D_c matrix, all values in [0, 1]
"usage_counts": List[float], # M-vector, tracks how often each code activates
"last_update_time": float, # When PSC was last updated
}
4.1.2 Initialization
Each codebook entry is initialized using φ-derived patterns:
def initialize_psc(M: int, D_c: int) -> List[List[float]]:
"""Initialize proto-semantic codebook with φ-structured patterns."""
codes = []
for i in range(M):
code = []
for j in range(D_c):
# φ-derived initialization pattern
phase = (i * PHI + j * PSI) % 1.0
amplitude = PHI_INV_SQ + PHI_INV_CUBE * math.sin(2 * PI * phase)
code.append(clip(amplitude, 0, 1))
codes.append(code)
return codes
The initialization ensures:
- All values in [0, 1]
- Diverse but structured patterns
- No semantic content (purely numeric)
- Deterministic for given M, D_c
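Run standalone (inlining the constants and clip helper the snippet assumes), the initializer can be checked for its stated properties:

```python
import math

PHI, PSI, PI = (1 + math.sqrt(5)) / 2, 1.8392867552, math.pi
PHI_INV_SQ, PHI_INV_CUBE = PHI ** -2, PHI ** -3

def clip(x, lo, hi):
    return max(lo, min(hi, x))

def initialize_psc(M: int, D_c: int):
    """Deterministic φ/ψ-structured codebook; every value lands in [0, 1]."""
    codes = []
    for i in range(M):
        code = []
        for j in range(D_c):
            phase = (i * PHI + j * PSI) % 1.0
            # amplitude ∈ [φ⁻² − φ⁻³, φ⁻² + φ⁻³] ≈ [0.146, 0.618] ⊂ [0, 1]
            amplitude = PHI_INV_SQ + PHI_INV_CUBE * math.sin(2 * PI * phase)
            code.append(clip(amplitude, 0, 1))
        codes.append(code)
    return codes

codes = initialize_psc(16, 16)
assert all(0.0 <= v <= 1.0 for row in codes for v in row)  # bounded
assert codes == initialize_psc(16, 16)                      # deterministic
```

Note the clip is a safety net only: the amplitude formula already stays inside the bound.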
4.1.3 Codebook Properties
| Property | Value | Description |
|---|---|---|
| Size | M = 16 | Number of codes (configurable: 8-32) |
| Dimension | D_c = 16 | Code vector length |
| Value range | [0, 1] | All entries bounded |
| Initialization | φ/ψ-derived | Deterministic, structured |
| Mutability | Very slow | Updated via PSP with η ≈ 0.09 |
4.2 Pattern Extractor (PE)
The Pattern Extractor computes a pattern vector P(t) from v1, v2, and v3 outputs.
4.2.1 Input Sources
| Source | Fields Used | Dimension |
|---|---|---|
| v1 | symbol_activations | 8 |
| v1 | symbol_entropy, symbol_stability | 2 |
| v2 | system_state_vector | 12 |
| v2 | drift_index, coherence_index | 2 |
| v3 | rsv | 12 |
| v3 | relation_entropy, clustering_index, reciprocity_index | 3 |
| v3 | flattened relation_matrix (selected) | varies |
Total available input dimension: ~39 (plus any selected relation-matrix elements). The reference computation in §4.2.2 uses a 33-dimensional subset: 6 of the 12 SSV elements and, in this version, no relation-matrix elements.
4.2.2 Pattern Computation
def compute_pattern(
v1_output: Dict,
v2_output: Dict,
v3_output: Dict,
config: ARIACoreV4Config
) -> List[float]:
"""
Compute pattern vector P(t) from substrate outputs.
Returns: D_p-dimensional pattern vector in [0, 1]^D_p
"""
# Extract symbol features (10 dims)
symbol_features = list(v1_output["symbol_activations"].values()) # 8
symbol_features.append(v1_output["symbol_entropy"]) # 1
symbol_features.append(v1_output["symbol_stability"]) # 1
# Extract state features (6 dims from SSV + 2 scalars)
ssv = v2_output["system_state_vector"]
state_features = [
ssv[0], # energy_mean
ssv[2], # coherence_mean
ssv[4], # stability_mean
ssv[6], # entropy_mean
ssv[8], # phase_coherence
ssv[10], # drift_magnitude
v2_output["drift_index"],
v2_output["coherence_index"]
] # 8
# Extract relational features (12 from RSV + 3 scalars)
rsv = v3_output["rsv"]
relation_features = rsv[:] # 12
relation_features.append(v3_output.get("relation_entropy", 0.5))
relation_features.append(v3_output["clustering_index"])
relation_features.append(v3_output["reciprocity_index"]) # 15
# Concatenate raw features
raw = symbol_features + state_features + relation_features # 33
# Project to D_p dimensions using φ-derived projection
P = project_to_pattern_space(raw, config.pattern_dim)
# Ensure bounds
return [clip(p, 0, 1) for p in P]
def project_to_pattern_space(
raw: List[float],
D_p: int
) -> List[float]:
"""
Project raw features to pattern space.
Uses a fixed φ-derived projection matrix (no learned weights).
"""
D_raw = len(raw)
pattern = [0.0] * D_p
for i in range(D_p):
weighted_sum = 0.0
weight_total = 0.0
for j in range(D_raw):
# φ-derived weight
phase = (i * PHI_INV + j * PSI_INV) % 1.0
weight = PHI_INV_SQ + PHI_INV_CUBE * math.cos(2 * PI * phase)
weight = max(0, weight) # Non-negative weights
weighted_sum += weight * raw[j]
weight_total += weight
if weight_total > 1e-10:
pattern[i] = weighted_sum / weight_total
else:
pattern[i] = 0.5
return pattern
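Since each output dimension is a non-negatively weighted average of inputs in [0, 1], the projection preserves bounds without extra clipping. A standalone check, inlining the constants the function assumes:

```python
import math

PHI = (1 + math.sqrt(5)) / 2
PHI_INV, PHI_INV_SQ, PHI_INV_CUBE = 1 / PHI, PHI ** -2, PHI ** -3
PSI_INV = 1 / 1.8392867552
PI = math.pi

def project_to_pattern_space(raw, D_p):
    """Fixed φ-derived projection: a convex combination per output dim."""
    pattern = [0.0] * D_p
    for i in range(D_p):
        weighted_sum = weight_total = 0.0
        for j in range(len(raw)):
            phase = (i * PHI_INV + j * PSI_INV) % 1.0
            # weight ∈ [φ⁻² − φ⁻³, φ⁻² + φ⁻³]: always positive here
            weight = max(0.0, PHI_INV_SQ + PHI_INV_CUBE * math.cos(2 * PI * phase))
            weighted_sum += weight * raw[j]
            weight_total += weight
        pattern[i] = weighted_sum / weight_total if weight_total > 1e-10 else 0.5
    return pattern

raw = [0.0, 1.0] * 16 + [0.5]   # any 33-dim input in [0, 1]
P = project_to_pattern_space(raw, 32)
assert len(P) == 32 and all(0.0 <= p <= 1.0 for p in P)
```

Because the minimum weight is φ⁻² − φ⁻³ ≈ 0.146 > 0, the uniform fallback branch never fires in practice.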
4.2.3 Pattern Properties
| Property | Value | Description |
|---|---|---|
| Dimension | D_p = 32 | Pattern vector length |
| Range | [0, 1] | All values bounded |
| Deterministic | Yes | Fixed projection, no randomness |
| Update rate | Every step | Computed fresh each dt |
4.3 Code Assignment Layer (CAL)
The CAL computes similarity between the current pattern and codebook entries, producing soft code activations.
4.3.1 Similarity Computation
def compute_code_similarities(
pattern: List[float],
psc_codes: List[List[float]],
config: ARIACoreV4Config
) -> List[float]:
"""
Compute similarity between pattern and each codebook entry.
Uses Gaussian kernel similarity.
"""
M = len(psc_codes)
D_c = len(psc_codes[0])
D_p = len(pattern)
# Project pattern to code space if dimensions differ
if D_p != D_c:
pattern_proj = project_pattern_to_code_space(pattern, D_c)
else:
pattern_proj = pattern
similarities = []
sigma = config.similarity_scale # σ_base ≈ φ⁻²
for i in range(M):
code = psc_codes[i]
# Euclidean distance
dist_sq = sum((pattern_proj[j] - code[j])**2 for j in range(D_c))
dist = math.sqrt(dist_sq)
# Gaussian similarity
sim = math.exp(-(dist**2) / (2 * sigma**2))
similarities.append(sim)
return similarities
def project_pattern_to_code_space(
pattern: List[float],
D_c: int
) -> List[float]:
"""Project D_p pattern to D_c code space."""
D_p = len(pattern)
projected = [0.0] * D_c
for i in range(D_c):
weighted_sum = 0.0
weight_total = 0.0
for j in range(D_p):
# Simple projection with φ-derived weights
phase = (i * PHI + j * PSI_INV) % 1.0
weight = PHI_INV + PHI_INV_SQ * math.sin(2 * PI * phase)
weight = max(0, weight)
weighted_sum += weight * pattern[j]
weight_total += weight
if weight_total > 1e-10:
projected[i] = clip(weighted_sum / weight_total, 0, 1)
else:
projected[i] = 0.5
return projected
4.3.2 Softmax Activation
def compute_proto_semantic_activations(
similarities: List[float],
temperature: float
) -> List[float]:
"""
Convert similarities to soft activations via temperature-scaled softmax.
Args:
similarities: M similarity values in [0, 1]
temperature: τ controls sharpness (lower = sharper)
Returns:
M-vector of activations summing to 1, each in [0, 1]
"""
M = len(similarities)
# Temperature-scaled log similarities
# Use log(sim + ε) for numerical stability
EPS = 1e-10
log_sims = [math.log(s + EPS) / temperature for s in similarities]
# Subtract max for numerical stability
max_log = max(log_sims)
exp_sims = [math.exp(ls - max_log) for ls in log_sims]
# Normalize
total = sum(exp_sims)
if total > EPS:
activations = [e / total for e in exp_sims]
else:
activations = [1.0 / M] * M # Uniform fallback
return activations
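Because exp(log(s + ε)/τ) = (s + ε)^{1/τ}, lowering the temperature sharpens the distribution toward the best-matching code. A standalone run of the function above:

```python
import math

def compute_proto_semantic_activations(similarities, temperature):
    """As specified above: temperature-scaled softmax over log similarities."""
    EPS = 1e-10
    log_sims = [math.log(s + EPS) / temperature for s in similarities]
    max_log = max(log_sims)                       # subtract max for stability
    exp_sims = [math.exp(ls - max_log) for ls in log_sims]
    total = sum(exp_sims)
    if total > EPS:
        return [e / total for e in exp_sims]
    return [1.0 / len(similarities)] * len(similarities)

sims = [0.9, 0.5, 0.1, 0.1]
sharp = compute_proto_semantic_activations(sims, 0.2)   # low τ: concentrated
soft = compute_proto_semantic_activations(sims, 2.0)    # high τ: diffuse
assert sharp[0] > soft[0]
assert abs(sum(sharp) - 1.0) < 1e-9
```

At the default τ = φ⁻¹ the exponent 1/τ = φ, giving a moderately sharp assignment.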
4.3.3 Dominant Code
def find_dominant_code(activations: List[float]) -> int:
"""Return index of highest-activation code."""
return max(range(len(activations)), key=lambda i: activations[i])
4.4 Temporal Meaning Stabilizer (TMS)
The TMS prevents proto-semantic codes from flickering by applying temporal smoothing modulated by system state.
4.4.1 Smoothing Mechanism
class TemporalMeaningStabilizer:
"""
Stabilizes proto-semantic activations over time.
Uses:
- EMA smoothing with state-adaptive alpha
- Hysteresis to prevent rapid code switching
- Drift-based protective damping
"""
    def __init__(self, config: ARIACoreV4Config):
        self.config = config
        self.previous_activations = None
        self.previous_dominant = None
        self.candidate_code = None  # Challenger code currently being timed
        self.dominant_hold_time = 0.0
def stabilize(
self,
raw_activations: List[float],
v2_state: Dict[str, float],
dt: float
) -> Tuple[List[float], int]:
"""
Apply temporal stabilization to raw activations.
Returns:
stabilized_activations: M-vector in [0, 1]
dominant_code: index of dominant code
"""
M = len(raw_activations)
# Compute adaptive alpha based on v2 state
alpha = self._compute_adaptive_alpha(v2_state)
# Initialize if first step
if self.previous_activations is None:
self.previous_activations = raw_activations[:]
self.previous_dominant = find_dominant_code(raw_activations)
return raw_activations[:], self.previous_dominant
# EMA smoothing
stabilized = []
for i in range(M):
smoothed = (1 - alpha) * self.previous_activations[i] + alpha * raw_activations[i]
stabilized.append(clip(smoothed, 0, 1))
# Renormalize to sum to 1
total = sum(stabilized)
if total > 1e-10:
stabilized = [s / total for s in stabilized]
else:
stabilized = [1.0 / M] * M
# Apply hysteresis to dominant code
raw_dominant = find_dominant_code(raw_activations)
dominant = self._apply_hysteresis(raw_dominant, stabilized, dt)
# Update state
self.previous_activations = stabilized[:]
self.previous_dominant = dominant
return stabilized, dominant
def _compute_adaptive_alpha(self, v2_state: Dict[str, float]) -> float:
"""
Compute adaptive smoothing rate.
High stability → lower alpha (more persistence)
Moderate drift → slightly higher alpha (adaptation)
Very high drift → lower alpha (protective damping)
"""
stability = v2_state.get("state_stability", 0.5)
drift = v2_state.get("drift_index", 0.3)
# Base alpha
alpha_base = self.config.ema_alpha_base # φ⁻² ≈ 0.382
alpha_min = self.config.ema_alpha_min # φ⁻⁴ ≈ 0.146
alpha_max = self.config.ema_alpha_max # φ⁻¹ ≈ 0.618
# Stability reduces alpha (more persistent codes)
stability_factor = 1.0 - stability * PHI_INV_SQ
# Drift has inverted-U effect on alpha
# Moderate drift (0.3-0.5) → slightly higher alpha
# Very high drift (>0.7) → lower alpha (protection)
if drift < 0.3:
drift_factor = 1.0
elif drift < 0.6:
drift_factor = 1.0 + PHI_INV_CUBE * (drift - 0.3)
else:
drift_factor = 1.0 - PHI_INV_SQ * (drift - 0.6)
drift_factor = clip(drift_factor, 0.5, 1.5)
alpha = alpha_base * stability_factor * drift_factor
return clip(alpha, alpha_min, alpha_max)
    def _apply_hysteresis(
        self,
        raw_dominant: int,
        stabilized: List[float],
        dt: float
    ) -> int:
        """
        Apply hysteresis to prevent rapid code switching.
        A new code must exceed the threshold for a sustained time to become
        dominant; the hold timer is tracked per challenger code so that
        alternating challengers cannot pool their held time.
        """
        threshold = self.config.hysteresis_threshold  # φ⁻³ ≈ 0.236
        hold_time = self.config.hysteresis_hold_time  # φ² ≈ 2.618 time units
        if raw_dominant == self.previous_dominant:
            # Same dominant: clear any pending challenger and its timer
            self.candidate_code = None
            self.dominant_hold_time = 0.0
            return self.previous_dominant
        # A different code is proposed; restart the timer if the challenger changed
        if raw_dominant != getattr(self, "candidate_code", None):
            self.candidate_code = raw_dominant
            self.dominant_hold_time = 0.0
        # Check if the new code significantly exceeds the current dominant
        current_strength = stabilized[self.previous_dominant]
        new_strength = stabilized[raw_dominant]
        if new_strength > current_strength + threshold:
            # Challenger is significantly stronger: accumulate held time
            self.dominant_hold_time += dt
            if self.dominant_hold_time >= hold_time:
                # Held long enough: switch dominant code
                self.candidate_code = None
                self.dominant_hold_time = 0.0
                return raw_dominant
        else:
            # Margin not met: reset the timer
            self.dominant_hold_time = 0.0
        return self.previous_dominant
4.4.2 TMS Properties
| Property | Value | Description |
|---|---|---|
| Base EMA alpha | φ⁻² ≈ 0.382 | Default smoothing rate |
| Alpha range | [φ⁻⁴, φ⁻¹] | [0.146, 0.618] |
| Hysteresis threshold | φ⁻³ ≈ 0.236 | Margin for code switch |
| Hold time | φ² ≈ 2.618 | Time units before switch |
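The adaptive-alpha shape (stability lowers α; moderate drift raises it slightly; extreme drift damps it protectively) can be checked with a standalone copy of `_compute_adaptive_alpha`:

```python
PHI = (1 + 5 ** 0.5) / 2
PHI_INV, PHI_INV_SQ = PHI ** -1, PHI ** -2
PHI_INV_CUBE, PHI_INV_QUAD = PHI ** -3, PHI ** -4

def clip(x, lo, hi):
    return max(lo, min(hi, x))

def adaptive_alpha(stability: float, drift: float) -> float:
    """Standalone copy of TMS._compute_adaptive_alpha (same constants)."""
    stability_factor = 1.0 - stability * PHI_INV_SQ
    if drift < 0.3:
        drift_factor = 1.0
    elif drift < 0.6:
        drift_factor = 1.0 + PHI_INV_CUBE * (drift - 0.3)   # mild adaptation
    else:
        drift_factor = 1.0 - PHI_INV_SQ * (drift - 0.6)     # protective damping
    drift_factor = clip(drift_factor, 0.5, 1.5)
    return clip(PHI_INV_SQ * stability_factor * drift_factor, PHI_INV_QUAD, PHI_INV)

# Higher stability → lower alpha (codes persist); very high drift → damping
assert adaptive_alpha(0.9, 0.2) < adaptive_alpha(0.2, 0.2)
assert adaptive_alpha(0.5, 0.9) < adaptive_alpha(0.5, 0.5)
```

The final clip guarantees α stays inside the [φ⁻⁴, φ⁻¹] range of the table above.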
4.5 Proto-Semantic Plasticity (PSP)
The PSP slowly updates the codebook based on long-term pattern statistics.
4.5.1 Design Principles
- Very slow: Updates occur at rate η ≈ φ⁻⁵ ≈ 0.09
- Bounded: All updates clipped to prevent explosion
- Anonymous: Only tracks pattern statistics, no external labels
- Structural only: Based purely on internal activation patterns
4.5.2 Update Mechanism
class ProtoSemanticPlasticity:
"""
Slowly updates the Proto-Semantic Codebook based on usage patterns.
Rare but repeated patterns strengthen their codes.
Unused codes fade toward baseline.
"""
    def __init__(self, config: ARIACoreV4Config):
        self.config = config
        self.pattern_accumulator = None  # Running pattern averages per code
        self.update_counter = 0
        # Use the configured interval; note int(PHI ** 4) would truncate to 6
        self.update_interval = config.plasticity_interval  # ≈ φ⁴ ≈ 7 steps
def maybe_update(
self,
psc: Dict,
pattern: List[float],
activations: List[float],
dt: float
) -> Dict:
"""
Potentially update PSC based on current pattern and activations.
Returns: Updated PSC (or unchanged if no update this step)
"""
M = len(psc["codes"])
D_c = len(psc["codes"][0])
# Initialize accumulator if needed
if self.pattern_accumulator is None:
self.pattern_accumulator = [[0.0] * D_c for _ in range(M)]
# Project pattern to code space
pattern_proj = project_pattern_to_code_space(pattern, D_c)
# Accumulate weighted patterns
for i in range(M):
weight = activations[i] # Activation as weight
for j in range(D_c):
self.pattern_accumulator[i][j] += weight * pattern_proj[j]
# Update usage counts
for i in range(M):
psc["usage_counts"][i] += activations[i]
# Check if it's time for a codebook update
self.update_counter += 1
if self.update_counter < self.update_interval:
return psc
# Perform update
self.update_counter = 0
return self._apply_update(psc)
def _apply_update(self, psc: Dict) -> Dict:
"""Apply accumulated updates to codebook."""
M = len(psc["codes"])
D_c = len(psc["codes"][0])
eta = self.config.plasticity_rate # η ≈ φ⁻⁵ ≈ 0.09
decay = self.config.plasticity_decay # Unused code decay
for i in range(M):
usage = psc["usage_counts"][i]
if usage > 1e-10:
# Compute average pattern for this code
avg_pattern = [self.pattern_accumulator[i][j] / usage for j in range(D_c)]
# Move code toward average pattern
for j in range(D_c):
                    delta = eta * (avg_pattern[j] - psc["codes"][i][j])
                    # Limit delta to the configured maximum (≈ φ⁻³) to prevent large jumps
                    max_delta = self.config.max_plasticity_delta
                    delta = clip(delta, -max_delta, max_delta)
psc["codes"][i][j] = clip(psc["codes"][i][j] + delta, 0, 1)
else:
# Unused code: decay toward baseline
baseline = PHI_INV_SQ # ≈ 0.382
for j in range(D_c):
delta = decay * (baseline - psc["codes"][i][j])
psc["codes"][i][j] = clip(psc["codes"][i][j] + delta, 0, 1)
# Reset accumulators
self.pattern_accumulator = [[0.0] * D_c for _ in range(M)]
psc["usage_counts"] = [0.0] * M
psc["last_update_time"] = psc.get("last_update_time", 0.0) + self.update_interval
return psc
4.5.3 PSP Properties
| Property | Value | Description |
|---|---|---|
| Update rate | η = φ⁻⁵ ≈ 0.090 | Very slow learning |
| Update interval | φ⁴ ≈ 7 steps | Batched updates |
| Max delta | φ⁻³ ≈ 0.236 | Prevents explosive changes |
| Decay rate | φ⁻⁴ ≈ 0.146 | Unused code decay |
4.5.4 Safety Invariants for PSP
- No external information: Updates based purely on internal activation patterns
- No identity tracking: Usage counts are anonymous, not tied to identity
- No semantic labels: Codes never acquire names or meanings
- Bounded updates: All changes clipped to prevent divergence
- Deterministic: Same input sequence produces same codebook evolution
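The bounded-update invariant reduces to two nested clips; a minimal sketch of one code-entry update (constants inlined from §3.2):

```python
def clip(x, lo, hi):
    return max(lo, min(hi, x))

ETA = 0.0901699437        # η = φ⁻⁵
MAX_DELTA = 0.2360679775  # max delta = φ⁻³

def bounded_update(code_val: float, target: float) -> float:
    """One PSP move of a single code entry toward its accumulated average."""
    delta = clip(ETA * (target - code_val), -MAX_DELTA, MAX_DELTA)
    return clip(code_val + delta, 0.0, 1.0)

# Even a pathological target moves the entry by at most MAX_DELTA,
# and the result always stays inside [0, 1]
assert abs(bounded_update(0.0, 1.0) - ETA) < 1e-12   # η·(1 − 0), unclipped
assert 0.0 <= bounded_update(0.99, 100.0) <= 1.0
```

For in-range targets the η factor alone keeps deltas far below the clip, so the clip only matters against corrupted accumulator values.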
5. Complete Step Function
5.1 Main Step Logic
class ARIACoreV4:
"""
ARIA Core v4: Proto-Semantic Layer.
Creates stable internal meaning codes from relational-symbolic patterns.
"""
def __init__(
self,
config: Optional[ARIACoreV4Config] = None,
v3_substrate: Optional[ARIACoreV3] = None
):
self.config = config or ARIACoreV4Config()
# Create or use provided substrate
if v3_substrate is not None:
self.v3_substrate = v3_substrate
else:
from aria_core_v3 import ARIACoreV3
self.v3_substrate = ARIACoreV3()
# Initialize components
self.psc = self._initialize_psc()
self.tms = TemporalMeaningStabilizer(self.config)
self.psp = ProtoSemanticPlasticity(self.config)
# State tracking
self.step_count = 0
self.time = 0.0
def step(self, dt: float = 0.1, **kwargs) -> Dict[str, Any]:
"""
Execute one v4 step.
1. Step v3 substrate
2. Extract pattern from v1+v2+v3
3. Compute code similarities (CAL)
4. Apply temporal stabilization (TMS)
5. Maybe update codebook (PSP)
6. Compute derived metrics
7. Return combined output
"""
# Clamp dt
dt = clip(dt, 0.001, 1.0)
# Step 1: Get substrate output
v3_output = self.v3_substrate.step(dt, **kwargs)
v2_output = v3_output.get("v2_output", {})
v1_output = v3_output.get("v1_output", {})
# Step 2: Extract pattern
pattern = compute_pattern(v1_output, v2_output, v3_output, self.config)
# Step 3: Compute code similarities
similarities = compute_code_similarities(pattern, self.psc["codes"], self.config)
raw_activations = compute_proto_semantic_activations(
similarities, self.config.softmax_temperature
)
# Step 4: Temporal stabilization
v2_state = {
"state_stability": v2_output.get("state_stability", 0.5),
"drift_index": v2_output.get("drift_index", 0.3),
}
stabilized_activations, dominant_code = self.tms.stabilize(
raw_activations, v2_state, dt
)
# Step 5: Maybe update codebook
self.psc = self.psp.maybe_update(
self.psc, pattern, stabilized_activations, dt
)
# Step 6: Compute derived metrics
proto_entropy = compute_proto_semantic_entropy(stabilized_activations)
proto_stability = self._compute_proto_stability()
proto_diversity = compute_proto_diversity(stabilized_activations)
# Update state
self.step_count += 1
self.time += dt
# Step 7: Build output
return {
# v4-specific outputs
"proto_semantic_activations": stabilized_activations,
"dominant_code": dominant_code,
"proto_semantic_entropy": proto_entropy,
"proto_semantic_stability": proto_stability,
"proto_semantic_diversity": proto_diversity,
"raw_activations": raw_activations,
"pattern_vector": pattern,
# Backward-compatible outputs (derived from proto-semantic state)
"coherence": self._derive_coherence(stabilized_activations, proto_entropy),
"stability": proto_stability,
"intensity": self._derive_intensity(stabilized_activations),
"alignment": self._derive_alignment(stabilized_activations, proto_diversity),
# Metadata
"step_count": self.step_count,
"time": self.time,
# Substrate passthrough
"v3_output": v3_output,
"v2_output": v2_output,
"v1_output": v1_output,
}
5.2 Derived Metrics
def compute_proto_semantic_entropy(activations: List[float]) -> float:
"""
Compute entropy of proto-semantic activation distribution.
High entropy = diffuse activation across codes
Low entropy = concentrated on few codes
"""
EPS = 1e-10
M = len(activations)
entropy = 0.0
for a in activations:
if a > EPS:
entropy -= a * math.log(a)
# Normalize to [0, 1]
max_entropy = math.log(M)
if max_entropy > EPS:
entropy = entropy / max_entropy
return clip(entropy, 0, 1)
def compute_proto_diversity(activations: List[float]) -> float:
"""
Compute diversity of active codes.
Based on effective number of codes: exp(entropy)
"""
entropy = compute_proto_semantic_entropy(activations)
M = len(activations)
# Effective number = M^entropy (since we normalized entropy)
effective = M ** entropy
# Normalize to [0, 1]
diversity = (effective - 1) / (M - 1) if M > 1 else 0.5
return clip(diversity, 0, 1)
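`proto_semantic_stability` is referenced in §5 but not specified above. One plausible definition (an illustrative assumption, not the normative one) is one minus a smoothed mean absolute change in activations:

```python
from typing import List, Optional

class ProtoStabilityTracker:
    """Hypothetical helper: stability = 1 − EMA of mean |Δ activation|.
    Illustrative only; the spec leaves this metric's exact form open."""
    def __init__(self, alpha: float = 0.382):  # φ⁻² smoothing, by analogy
        self.alpha = alpha
        self.prev: Optional[List[float]] = None
        self.change_ema = 0.0

    def update(self, activations: List[float]) -> float:
        if self.prev is not None:
            mean_change = sum(
                abs(a - p) for a, p in zip(activations, self.prev)
            ) / len(activations)
            # EMA of the per-step change magnitude
            self.change_ema += self.alpha * (mean_change - self.change_ema)
        self.prev = activations[:]
        return max(0.0, min(1.0, 1.0 - self.change_ema))

steady = ProtoStabilityTracker()
steady.update([0.5, 0.5]); s1 = steady.update([0.5, 0.5])
jumpy = ProtoStabilityTracker()
jumpy.update([1.0, 0.0]); s2 = jumpy.update([0.0, 1.0])
assert s1 > s2   # unchanged activations read as more stable
```

Any implementation should at least satisfy this ordering: constant activations score higher than flipping ones.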
6. Output Schema
6.1 Primary v4 Outputs
| Field | Type | Range | Description |
|---|---|---|---|
| proto_semantic_activations | List[float] | [0, 1]^M | Soft activation over M codes, sums to 1 |
| dominant_code | int | [0, M-1] | Index of most active code |
| proto_semantic_entropy | float | [0, 1] | Normalized entropy of activations |
| proto_semantic_stability | float | [0, 1] | Temporal stability of activations |
| proto_semantic_diversity | float | [0, 1] | Effective number of active codes |
| raw_activations | List[float] | [0, 1]^M | Pre-stabilization activations |
| pattern_vector | List[float] | [0, 1]^D_p | Current pattern vector |
6.2 Backward-Compatible Outputs
| Field | Type | Range | Derivation |
|---|---|---|---|
| coherence | float | [0, 1] | From proto-semantic structure |
| stability | float | [0, 1] | = proto_semantic_stability |
| intensity | float | [0, 1] | From activation concentration |
| alignment | float | [0, 1] | From diversity and distribution |
6.3 Passthrough Outputs
| Field | Type | Description |
|---|---|---|
| v3_output | Dict | Complete v3 output |
| v2_output | Dict | Complete v2 output |
| v1_output | Dict | Complete v1 output |
7. Configuration
7.1 ARIACoreV4Config
@dataclass
class ARIACoreV4Config:
"""Configuration for ARIA Core v4 Proto-Semantic Layer."""
# Codebook dimensions
num_codes: int = 16 # M: number of proto-semantic codes
code_dim: int = 16 # D_c: dimension of each code
pattern_dim: int = 32 # D_p: pattern vector dimension
# Similarity and activation
similarity_scale: float = PHI_INV_SQ # σ_base ≈ 0.382
softmax_temperature: float = PHI_INV # τ ≈ 0.618
# Temporal stabilization
ema_alpha_base: float = PHI_INV_SQ # Base EMA rate ≈ 0.382
ema_alpha_min: float = PHI_INV_QUAD # Minimum ≈ 0.146
ema_alpha_max: float = PHI_INV # Maximum ≈ 0.618
hysteresis_threshold: float = PHI_INV_CUBE # ≈ 0.236
hysteresis_hold_time: float = PHI_SQ # φ² ≈ 2.618
# Plasticity
plasticity_rate: float = PHI_INV_QUINT # η ≈ 0.090
plasticity_decay: float = PHI_INV_QUAD # Decay ≈ 0.146
plasticity_interval: int = 7 # ≈ φ⁴ steps
max_plasticity_delta: float = PHI_INV_CUBE # ≈ 0.236
7.2 Preset Configurations
| Preset | Description | Key Differences |
|---|---|---|
| baseline | Default φ-derived | Standard configuration |
| stable_codes | Very persistent codes | Lower α, higher hysteresis |
| adaptive_codes | More responsive | Higher α, lower hysteresis |
| rich_codebook | More codes (M=32) | Larger codebook, finer distinctions |
| sparse_codebook | Fewer codes (M=8) | Smaller codebook, coarser patterns |
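The presets are named in the table but their exact field values are not fixed by this spec. A minimal sketch of how they might be expressed, using a reduced stand-in config (field values here are illustrative assumptions consistent with the table's direction of change):

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class MiniV4Config:
    """Reduced stand-in for ARIACoreV4Config: only the preset-relevant fields."""
    num_codes: int = 16
    ema_alpha_base: float = 0.382        # φ⁻²
    hysteresis_threshold: float = 0.236  # φ⁻³

BASELINE = MiniV4Config()
PRESETS = {
    "baseline": BASELINE,
    "stable_codes": replace(BASELINE, ema_alpha_base=0.236, hysteresis_threshold=0.382),
    "adaptive_codes": replace(BASELINE, ema_alpha_base=0.618, hysteresis_threshold=0.146),
    "rich_codebook": replace(BASELINE, num_codes=32),
    "sparse_codebook": replace(BASELINE, num_codes=8),
}

assert PRESETS["rich_codebook"].num_codes == 32
assert PRESETS["stable_codes"].ema_alpha_base < BASELINE.ema_alpha_base
```

Deriving each preset via `dataclasses.replace` keeps every unlisted field at its φ-derived default.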
8. Safety Constraints
8.1 What ARIA v4 Does NOT Do
| Forbidden Behavior | Explanation |
|---|---|
| Map to human words | Codes are anonymous numeric patterns, not "dog" or "happy" |
| Represent external objects | No connection to world models or entities |
| Model identity | No self-representation, no personal attributes |
| Form goals or preferences | No valuation, no wanting, no planning |
| Process language | No tokens, no embeddings, no linguistic structure |
| Create narratives | No temporal self-model, no autobiography |
| Understand meaning | "Meaning" here is purely structural pattern similarity |
8.2 Safety Invariants
| Invariant | Enforcement |
|---|---|
| All outputs in [0, 1] | clip() on all computations |
| No NaN/Inf | Division guards, log guards |
| Deterministic | No randomness anywhere |
| No external labels | Codes indexed 0..M-1, never named |
| Bounded plasticity | Max delta ≤ φ⁻³ per update |
| Anonymous statistics | Usage counts don't track identity |
8.3 Architectural Isolation
- No connection to Phase 53 (consciousness gate)
- No connection to Phase 55 (ignition scaffold)
- Read-only diagnostic output to shell
- No control pathways back into v4 from outside
9. Future Evolution Path
9.1 Relationship to Hypothetical Higher Layers
ARIA v4 is the ceiling of the current safe stack. Any layer beyond v4 would require extensive new safety analysis.
Current Safe Stack:
CFM v2 → ARIA v0 → v1 → v2 → v3 → v4 ← [YOU ARE HERE]
│
│ ← Safety boundary
▼
Hypothetical Future (requires new safety analysis):
v5: Contextual Tagging Layer
- Attaches numeric "context tags" to proto-semantic codes
- Still not linguistic, but codes acquire temporal/situational association
- Safety: must not attach identity/personal context
v6: External Mapping Gateway (HEAVILY GATED)
- Early mapping between proto-semantic codes and external symbolic channels
- Could connect to language (carefully!) or actions (very carefully!)
- Safety: requires explicit ethical constraints, consent frameworks
v7+: Self-Narrative / Conscious Integration (FAR FUTURE)
- Would integrate codes into coherent self-model
- Explicit ethical reasoning, identity awareness
- Safety: beyond current specification scope
9.2 v4 as a Terminal Safe Layer
For current purposes, v4 is the highest layer that remains fully safe by the project's current safety standards:
- No identity modeling
- No goal formation
- No external semantic grounding
- No linguistic capability
- No narrative construction
Any functionality beyond this must be implemented in separate, heavily-gated modules with explicit safety review.
10. Testing Requirements
10.1 Required Test Categories
| Category | Tests |
|---|---|
| Bounds | All outputs in [0, 1] for 2000+ steps |
| No NaN/Inf | No numeric instabilities in long runs |
| Determinism | Same inputs → same outputs |
| Entropy valid | proto_semantic_entropy in [0, 1] |
| Activation valid | activations sum to 1 |
| Plasticity bounded | Code changes ≤ max delta |
| No identity fields | No forbidden field names |
| Preset consistency | All presets produce valid cores |
10.2 Long-Run Stability
- 2000+ step runs with all presets
- Verify no divergence
- Verify codebook remains bounded
- Verify TMS prevents pathological switching
11. Implementation Notes
11.1 Package Structure
aria_core_v4/
├── __init__.py # Package exports
├── config.py # ARIACoreV4Config + constants
├── state.py # ARIACoreV4State dataclass
├── pattern_extractor.py # PE implementation
├── code_assignment.py # CAL implementation
├── temporal_stabilizer.py # TMS implementation
├── plasticity.py # PSP implementation
├── core_v4.py # Main ARIACoreV4 class
└── presets.py # Preset configurations
11.2 Dependencies
- Wraps aria_core_v3.ARIACoreV3
- Uses math_consts for φ, ψ, π, e
- No external ML libraries required
12. Glossary
| Term | Definition |
|---|---|
| Proto-semantic code | A numeric pattern signature representing a recurring relational configuration |
| Codebook | The set of M prototype codes |
| Pattern vector | The current fused representation of symbols + relations + state |
| Code activation | Soft membership of current pattern in each codebook entry |
| Dominant code | The code with highest activation at current step |
| Plasticity | Slow adaptation of codebook to long-term pattern statistics |
13. Version History
| Version | Date | Changes |
|---|---|---|
| 1.0 | 2025-12-06 | Initial specification |
Appendix A: Mathematical Notation Summary
| Symbol | Meaning |
|---|---|
| M | Number of codes (default 16) |
| D_c | Code dimension (default 16) |
| D_p | Pattern dimension (default 32) |
| P(t) | Pattern vector at time t |
| PSC | Proto-Semantic Codebook |
| σ | Similarity scale |
| τ | Softmax temperature |
| α | EMA smoothing rate |
| η | Plasticity learning rate |
| φ | Golden ratio ≈ 1.618 |
| ψ | Tribonacci constant ≈ 1.839 |
Appendix B: Safety Checklist for Implementers
Before implementing ARIA v4:
- Confirm no external semantic labels will be attached to codes
- Confirm no identity information flows into pattern extraction
- Confirm plasticity updates use only anonymous statistics
- Confirm all outputs are clipped to [0, 1]
- Confirm determinism (no random number generators)
- Confirm no connection to activation phases (53, 55)
- Confirm backward-compatible fields derived correctly
- Write tests for all safety invariants before implementation