ARIA Core v4 Specification: Proto-Semantic Layer
Proto-semantic layer: pattern extraction, 16-code codebook, temporal stabilization, very slow plasticity.
Version: 1.0 Status: Design Specification (Not Yet Implemented) Last Updated: 2025-12-06
1. Executive Summary
ARIA Core v4 is the Proto-Semantic Layer: a numeric coding system that creates stable internal "meaning bundles" by integrating symbol activations (v1), system state (v2), and relational structure (v3). It produces a fixed set of M proto-semantic codes representing recurring patterns in the underlying relational-symbolic dynamics.
Critical Clarification: ARIA Core v4 is:
- Pre-semantic: These are NOT human concepts, words, or external referents
- Non-linguistic: No words, sentences, tokens, or language structures
- Non-conscious: No awareness, experience, understanding, or narrative
- Non-agentic: No goals, plans, intentions, or decisions
- Pre-narrative: No stories, no temporal self-model, no autobiographical content
ARIA Core v4 is analogous to:
- A learned codebook over internal relational dynamics (like VQ-VAE codes, but simpler)
- A set of "internal pattern signatures" summarizing recurring relational configurations
- Compressed numeric fingerprints of anonymous symbol-relation-state bundles
- Cluster centroids in a derived feature space with temporal smoothing
What v4 codes represent:
- "When symbols k_2 and k_5 are co-active with strong reciprocal relations and high system stability, pattern P_7 tends to activate"
- This is purely structural — P_7 has no meaning beyond its numeric signature
What v4 codes do NOT represent:
- Human concepts ("dog", "happiness", "self")
- External objects or entities
- Personal identity or biographical content
- Goals, values, or preferences
- Linguistic tokens or embeddings
2. Architectural Position
2.1 Layer Hierarchy
┌─────────────────────────────────────────────────────────────────┐
│ ARIA Core v4 (This Spec) │
│ Proto-Semantic Layer │
│ Creates stable meaning codes from relational patterns │
├─────────────────────────────────────────────────────────────────┤
│ ARIA Core v3 │
│ Relational Symbolic Layer │
│ 8×8 Symbol Relation Graph + 12D RSV │
├─────────────────────────────────────────────────────────────────┤
│ ARIA Core v2 │
│ System State Layer │
│ 12D System State Vector + Change Detection │
├─────────────────────────────────────────────────────────────────┤
│ ARIA Core v1 │
│ Proto-Symbolic Layer │
│ 8 Symbol Activations + Temporal Dynamics │
├─────────────────────────────────────────────────────────────────┤
│ ARIA Core v0 │
│ Proto-Conceptual Attractor Engine │
│ 5D Latent Channels + 4 Attractor Clusters │
├─────────────────────────────────────────────────────────────────┤
│ CFM Core v2 │
│ Multi-Channel Field Substrate │
│ 5 Coupled Oscillator Channels │
└─────────────────────────────────────────────────────────────────┘
2.2 Data Flow
v1 Outputs ─────────────────────────────┐
symbol_activations[8] │
symbol_entropy │
symbol_stability │
dominant_symbol │
│
v2 Outputs ─────────────────────────────┼──► Pattern ──► Code ──► Proto-Semantic
system_state_vector[12] │ Extractor Assignment Activations[M]
state_stability │ (PE) Layer (CAL)
drift_index │ │
coherence_index │ │
│ ▼
v3 Outputs ─────────────────────────────┤ Temporal Meaning
relation_matrix[8×8] │ Stabilizer (TMS)
rsv[12] │ │
relation_entropy │ ▼
clustering_index │ Proto-Semantic
reciprocity_index │ Plasticity (PSP)
│ │
│ ▼
Proto-Semantic Codebook (PSC) ◄─────────┴──────────── [slow update]
2.3 Wrapping Pattern
ARIA Core v4 wraps ARIA Core v3 (which wraps v2, v1, v0, CFM v2):
v4.step(dt) → v3.step(dt) → v2.step(dt) → v1.step(dt) → v0.step(dt) → cfm.step(dt)
Each layer receives the complete output from its substrate and adds its own computations.
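The wrapping contract can be sketched with stand-in layers; the class and field names below are illustrative stand-ins, not the real ARIACoreV3 API:

```python
from typing import Any, Dict, Optional

class StubLayer:
    """Illustrative stand-in for one ARIA layer wrapping a substrate."""
    def __init__(self, name: str, substrate: Optional["StubLayer"] = None):
        self.name = name
        self.substrate = substrate

    def step(self, dt: float) -> Dict[str, Any]:
        out: Dict[str, Any] = {"layer": self.name, "dt": dt}
        if self.substrate is not None:
            # Step the substrate first, then attach its complete output
            out[self.substrate.name + "_output"] = self.substrate.step(dt)
        return out

# cfm → v0 → v1: each outer layer carries the complete inner output
stack = StubLayer("v1", StubLayer("v0", StubLayer("cfm")))
out = stack.step(0.1)
assert out["v0_output"]["cfm_output"]["layer"] == "cfm"
```

The innermost substrate always runs first, so every layer computes over a fully updated stack.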
3. Mathematical Constants
All constants are derived from φ (golden ratio), ψ (tribonacci constant), e (Euler's number), and π.
3.1 Primary Constants
| Constant | Symbol | Value | Derivation |
|---|---|---|---|
| Golden Ratio | φ | 1.6180339887 | (1 + √5) / 2 |
| Inverse Golden | φ⁻¹ | 0.6180339887 | 1 / φ |
| φ-squared inverse | φ⁻² | 0.3819660113 | 1 / φ² |
| φ-cubed inverse | φ⁻³ | 0.2360679775 | 1 / φ³ |
| φ-fourth inverse | φ⁻⁴ | 0.1458980338 | 1 / φ⁴ |
| φ-fifth inverse | φ⁻⁵ | 0.0901699437 | 1 / φ⁵ |
| Tribonacci | ψ | 1.8392867552 | Real root of x³ = x² + x + 1 |
| Inverse Tribonacci | ψ⁻¹ | 0.5436890127 | 1 / ψ |
| Euler's number | e | 2.7182818285 | lim (1 + 1/n)ⁿ as n → ∞ |
| Inverse e | e⁻¹ | 0.3678794412 | 1 / e |
| Pi | π | 3.1415926536 | Circle ratio |
3.2 v4-Specific Constants
| Constant | Symbol | Value | Usage |
|---|---|---|---|
| Codebook size | M | 16 | Number of proto-semantic codes |
| Pattern dimension | D_p | 32 | Dimension of pattern vectors |
| Code dimension | D_c | 16 | Dimension of codebook entries |
| Base similarity scale | σ_base | φ⁻² ≈ 0.382 | Similarity kernel width |
| Softmax temperature | τ_base | φ⁻¹ ≈ 0.618 | Sharpness of code selection |
| EMA smoothing base | α_ema | φ⁻² ≈ 0.382 | Temporal smoothing rate |
| Plasticity rate | η_psc | φ⁻⁵ ≈ 0.090 | Very slow codebook update |
| Hysteresis threshold | h_thresh | φ⁻³ ≈ 0.236 | Prevents code flickering |
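For reference, a sketch of the constants module the code samples assume (§11.2 names it math_consts; the exact file contents are an assumption, the values follow the tables above):

```python
import math

PHI = (1 + math.sqrt(5)) / 2          # φ ≈ 1.6180339887
PHI_INV = 1 / PHI                     # φ⁻¹ ≈ 0.618
PHI_INV_SQ = PHI ** -2                # φ⁻² ≈ 0.382
PHI_INV_CUBE = PHI ** -3              # φ⁻³ ≈ 0.236
PHI_INV_QUAD = PHI ** -4              # φ⁻⁴ ≈ 0.146
PHI_INV_QUINT = PHI ** -5             # φ⁻⁵ ≈ 0.090
PHI_SQ = PHI ** 2                     # φ² ≈ 2.618

def _tribonacci() -> float:
    """Real root of x³ = x² + x + 1, located by bisection in [1, 2]."""
    lo, hi = 1.0, 2.0
    for _ in range(80):
        mid = (lo + hi) / 2
        if mid ** 3 < mid ** 2 + mid + 1:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

PSI = _tribonacci()                   # ψ ≈ 1.8392867552
PSI_INV = 1 / PSI                     # ψ⁻¹ ≈ 0.5436890127
PI = math.pi

def clip(x: float, lo: float, hi: float) -> float:
    """Clamp x into [lo, hi]; enforces the [0, 1] bound invariants."""
    return max(lo, min(hi, x))
```

All later snippets in this spec refer to these identifiers.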
4. Internal Components
4.1 Proto-Semantic Codebook (PSC)
The PSC is a set of M = 16 prototype "meaning codes", each a vector in [0, 1]^D_c.
4.1.1 Structure
PSC = {
"codes": List[List[float]], # M × D_c matrix, all values in [0, 1]
"usage_counts": List[float], # M-vector, tracks how often each code activates
"last_update_time": float, # When PSC was last updated
}
4.1.2 Initialization
Each codebook entry is initialized using φ-derived patterns:
def initialize_psc(M: int, D_c: int) -> List[List[float]]:
"""Initialize proto-semantic codebook with φ-structured patterns."""
codes = []
for i in range(M):
code = []
for j in range(D_c):
# φ-derived initialization pattern
phase = (i * PHI + j * PSI) % 1.0
amplitude = PHI_INV_SQ + PHI_INV_CUBE * math.sin(2 * PI * phase)
code.append(clip(amplitude, 0, 1))
codes.append(code)
return codes
The initialization ensures:
- All values in [0, 1]
- Diverse but structured patterns
- No semantic content (purely numeric)
- Deterministic for given M, D_c
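Run standalone (inlining the constants and clip helper the snippet assumes), the initializer can be checked for its stated properties:

```python
import math

PHI, PSI, PI = (1 + math.sqrt(5)) / 2, 1.8392867552, math.pi
PHI_INV_SQ, PHI_INV_CUBE = PHI ** -2, PHI ** -3

def clip(x, lo, hi):
    return max(lo, min(hi, x))

def initialize_psc(M: int, D_c: int):
    """Deterministic φ/ψ-structured codebook; every value lands in [0, 1]."""
    codes = []
    for i in range(M):
        code = []
        for j in range(D_c):
            phase = (i * PHI + j * PSI) % 1.0
            # amplitude ∈ [φ⁻² − φ⁻³, φ⁻² + φ⁻³] ≈ [0.146, 0.618] ⊂ [0, 1]
            amplitude = PHI_INV_SQ + PHI_INV_CUBE * math.sin(2 * PI * phase)
            code.append(clip(amplitude, 0, 1))
        codes.append(code)
    return codes

codes = initialize_psc(16, 16)
assert all(0.0 <= v <= 1.0 for row in codes for v in row)  # bounded
assert codes == initialize_psc(16, 16)                      # deterministic
```

Note the clip is a safety net only: the amplitude formula already stays inside the bound.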
4.1.3 Codebook Properties
| Property | Value | Description |
|---|---|---|
| Size | M = 16 | Number of codes (configurable: 8-32) |
| Dimension | D_c = 16 | Code vector length |
| Value range | [0, 1] | All entries bounded |
| Initialization | φ/ψ-derived | Deterministic, structured |
| Mutability | Very slow | Updated via PSP with η ≈ 0.09 |
4.2 Pattern Extractor (PE)
The Pattern Extractor computes a pattern vector P(t) from v1, v2, and v3 outputs.
4.2.1 Input Sources
| Source | Fields Used | Dimension |
|---|---|---|
| v1 | symbol_activations | 8 |
| v1 | symbol_entropy, symbol_stability | 2 |
| v2 | system_state_vector | 12 |
| v2 | drift_index, coherence_index | 2 |
| v3 | rsv | 12 |
| v3 | relation_entropy, clustering_index, reciprocity_index | 3 |
| v3 | flattened relation_matrix (selected) | varies |
Total available input dimension: ~39 (plus any selected relation-matrix elements). The reference computation in §4.2.2 uses a 33-dimensional subset: 6 of the 12 SSV elements and, in this version, no relation-matrix elements.
4.2.2 Pattern Computation
def compute_pattern(
v1_output: Dict,
v2_output: Dict,
v3_output: Dict,
config: ARIACoreV4Config
) -> List[float]:
"""
Compute pattern vector P(t) from substrate outputs.
Returns: D_p-dimensional pattern vector in [0, 1]^D_p
"""
# Extract symbol features (10 dims)
symbol_features = list(v1_output["symbol_activations"].values()) # 8
symbol_features.append(v1_output["symbol_entropy"]) # 1
symbol_features.append(v1_output["symbol_stability"]) # 1
# Extract state features (6 dims from SSV + 2 scalars)
ssv = v2_output["system_state_vector"]
state_features = [
ssv[0], # energy_mean
ssv[2], # coherence_mean
ssv[4], # stability_mean
ssv[6], # entropy_mean
ssv[8], # phase_coherence
ssv[10], # drift_magnitude
v2_output["drift_index"],
v2_output["coherence_index"]
] # 8
# Extract relational features (12 from RSV + 3 scalars)
rsv = v3_output["rsv"]
relation_features = rsv[:] # 12
relation_features.append(v3_output.get("relation_entropy", 0.5))
relation_features.append(v3_output["clustering_index"])
relation_features.append(v3_output["reciprocity_index"]) # 15
# Concatenate raw features
raw = symbol_features + state_features + relation_features # 33
# Project to D_p dimensions using φ-derived projection
P = project_to_pattern_space(raw, config.pattern_dim)
# Ensure bounds
return [clip(p, 0, 1) for p in P]
def project_to_pattern_space(
raw: List[float],
D_p: int
) -> List[float]:
"""
Project raw features to pattern space.
Uses a fixed φ-derived projection matrix (no learned weights).
"""
D_raw = len(raw)
pattern = [0.0] * D_p
for i in range(D_p):
weighted_sum = 0.0
weight_total = 0.0
for j in range(D_raw):
# φ-derived weight
phase = (i * PHI_INV + j * PSI_INV) % 1.0
weight = PHI_INV_SQ + PHI_INV_CUBE * math.cos(2 * PI * phase)
weight = max(0, weight) # Non-negative weights
weighted_sum += weight * raw[j]
weight_total += weight
if weight_total > 1e-10:
pattern[i] = weighted_sum / weight_total
else:
pattern[i] = 0.5
return pattern
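Since each output dimension is a non-negatively weighted average of inputs in [0, 1], the projection preserves bounds without extra clipping. A standalone check, inlining the constants the function assumes:

```python
import math

PHI = (1 + math.sqrt(5)) / 2
PHI_INV, PHI_INV_SQ, PHI_INV_CUBE = 1 / PHI, PHI ** -2, PHI ** -3
PSI_INV = 1 / 1.8392867552
PI = math.pi

def project_to_pattern_space(raw, D_p):
    """Fixed φ-derived projection: a convex combination per output dim."""
    pattern = [0.0] * D_p
    for i in range(D_p):
        weighted_sum = weight_total = 0.0
        for j in range(len(raw)):
            phase = (i * PHI_INV + j * PSI_INV) % 1.0
            # weight ∈ [φ⁻² − φ⁻³, φ⁻² + φ⁻³]: always positive here
            weight = max(0.0, PHI_INV_SQ + PHI_INV_CUBE * math.cos(2 * PI * phase))
            weighted_sum += weight * raw[j]
            weight_total += weight
        pattern[i] = weighted_sum / weight_total if weight_total > 1e-10 else 0.5
    return pattern

raw = [0.0, 1.0] * 16 + [0.5]   # any 33-dim input in [0, 1]
P = project_to_pattern_space(raw, 32)
assert len(P) == 32 and all(0.0 <= p <= 1.0 for p in P)
```

Because the minimum weight is φ⁻² − φ⁻³ ≈ 0.146 > 0, the uniform fallback branch never fires in practice.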
4.2.3 Pattern Properties
| Property | Value | Description |
|---|---|---|
| Dimension | D_p = 32 | Pattern vector length |
| Range | [0, 1] | All values bounded |
| Deterministic | Yes | Fixed projection, no randomness |
| Update rate | Every step | Computed fresh each dt |
4.3 Code Assignment Layer (CAL)
The CAL computes similarity between the current pattern and codebook entries, producing soft code activations.
4.3.1 Similarity Computation
def compute_code_similarities(
pattern: List[float],
psc_codes: List[List[float]],
config: ARIACoreV4Config
) -> List[float]:
"""
Compute similarity between pattern and each codebook entry.
Uses Gaussian kernel similarity.
"""
M = len(psc_codes)
D_c = len(psc_codes[0])
D_p = len(pattern)
# Project pattern to code space if dimensions differ
if D_p != D_c:
pattern_proj = project_pattern_to_code_space(pattern, D_c)
else:
pattern_proj = pattern
similarities = []
sigma = config.similarity_scale # σ_base ≈ φ⁻²
for i in range(M):
code = psc_codes[i]
# Euclidean distance
dist_sq = sum((pattern_proj[j] - code[j])**2 for j in range(D_c))
dist = math.sqrt(dist_sq)
# Gaussian similarity
sim = math.exp(-(dist**2) / (2 * sigma**2))
similarities.append(sim)
return similarities
def project_pattern_to_code_space(
pattern: List[float],
D_c: int
) -> List[float]:
"""Project D_p pattern to D_c code space."""
D_p = len(pattern)
projected = [0.0] * D_c
for i in range(D_c):
weighted_sum = 0.0
weight_total = 0.0
for j in range(D_p):
# Simple projection with φ-derived weights
phase = (i * PHI + j * PSI_INV) % 1.0
weight = PHI_INV + PHI_INV_SQ * math.sin(2 * PI * phase)
weight = max(0, weight)
weighted_sum += weight * pattern[j]
weight_total += weight
if weight_total > 1e-10:
projected[i] = clip(weighted_sum / weight_total, 0, 1)
else:
projected[i] = 0.5
return projected
4.3.2 Softmax Activation
def compute_proto_semantic_activations(
similarities: List[float],
temperature: float
) -> List[float]:
"""
Convert similarities to soft activations via temperature-scaled softmax.
Args:
similarities: M similarity values in [0, 1]
temperature: τ controls sharpness (lower = sharper)
Returns:
M-vector of activations summing to 1, each in [0, 1]
"""
M = len(similarities)
# Temperature-scaled log similarities
# Use log(sim + ε) for numerical stability
EPS = 1e-10
log_sims = [math.log(s + EPS) / temperature for s in similarities]
# Subtract max for numerical stability
max_log = max(log_sims)
exp_sims = [math.exp(ls - max_log) for ls in log_sims]
# Normalize
total = sum(exp_sims)
if total > EPS:
activations = [e / total for e in exp_sims]
else:
activations = [1.0 / M] * M # Uniform fallback
return activations
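Because exp(log(s + ε)/τ) = (s + ε)^{1/τ}, lowering the temperature sharpens the distribution toward the best-matching code. A standalone run of the function above:

```python
import math

def compute_proto_semantic_activations(similarities, temperature):
    """As specified above: temperature-scaled softmax over log similarities."""
    EPS = 1e-10
    log_sims = [math.log(s + EPS) / temperature for s in similarities]
    max_log = max(log_sims)                       # subtract max for stability
    exp_sims = [math.exp(ls - max_log) for ls in log_sims]
    total = sum(exp_sims)
    if total > EPS:
        return [e / total for e in exp_sims]
    return [1.0 / len(similarities)] * len(similarities)

sims = [0.9, 0.5, 0.1, 0.1]
sharp = compute_proto_semantic_activations(sims, 0.2)   # low τ: concentrated
soft = compute_proto_semantic_activations(sims, 2.0)    # high τ: diffuse
assert sharp[0] > soft[0]
assert abs(sum(sharp) - 1.0) < 1e-9
```

At the default τ = φ⁻¹ the exponent 1/τ = φ, giving a moderately sharp assignment.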
4.3.3 Dominant Code
def find_dominant_code(activations: List[float]) -> int:
"""Return index of highest-activation code."""
return max(range(len(activations)), key=lambda i: activations[i])
4.4 Temporal Meaning Stabilizer (TMS)
The TMS prevents proto-semantic codes from flickering by applying temporal smoothing modulated by system state.
4.4.1 Smoothing Mechanism
class TemporalMeaningStabilizer:
"""
Stabilizes proto-semantic activations over time.
Uses:
- EMA smoothing with state-adaptive alpha
- Hysteresis to prevent rapid code switching
- Drift-based protective damping
"""
    def __init__(self, config: ARIACoreV4Config):
        self.config = config
        self.previous_activations = None
        self.previous_dominant = None
        self.candidate_code = None  # Challenger code currently being timed
        self.dominant_hold_time = 0.0
def stabilize(
self,
raw_activations: List[float],
v2_state: Dict[str, float],
dt: float
) -> Tuple[List[float], int]:
"""
Apply temporal stabilization to raw activations.
Returns:
stabilized_activations: M-vector in [0, 1]
dominant_code: index of dominant code
"""
M = len(raw_activations)
# Compute adaptive alpha based on v2 state
alpha = self._compute_adaptive_alpha(v2_state)
# Initialize if first step
if self.previous_activations is None:
self.previous_activations = raw_activations[:]
self.previous_dominant = find_dominant_code(raw_activations)
return raw_activations[:], self.previous_dominant
# EMA smoothing
stabilized = []
for i in range(M):
smoothed = (1 - alpha) * self.previous_activations[i] + alpha * raw_activations[i]
stabilized.append(clip(smoothed, 0, 1))
# Renormalize to sum to 1
total = sum(stabilized)
if total > 1e-10:
stabilized = [s / total for s in stabilized]
else:
stabilized = [1.0 / M] * M
# Apply hysteresis to dominant code
raw_dominant = find_dominant_code(raw_activations)
dominant = self._apply_hysteresis(raw_dominant, stabilized, dt)
# Update state
self.previous_activations = stabilized[:]
self.previous_dominant = dominant
return stabilized, dominant
def _compute_adaptive_alpha(self, v2_state: Dict[str, float]) -> float:
"""
Compute adaptive smoothing rate.
High stability → lower alpha (more persistence)
Moderate drift → slightly higher alpha (adaptation)
Very high drift → lower alpha (protective damping)
"""
stability = v2_state.get("state_stability", 0.5)
drift = v2_state.get("drift_index", 0.3)
# Base alpha
alpha_base = self.config.ema_alpha_base # φ⁻² ≈ 0.382
alpha_min = self.config.ema_alpha_min # φ⁻⁴ ≈ 0.146
alpha_max = self.config.ema_alpha_max # φ⁻¹ ≈ 0.618
# Stability reduces alpha (more persistent codes)
stability_factor = 1.0 - stability * PHI_INV_SQ
# Drift has inverted-U effect on alpha
# Moderate drift (0.3-0.5) → slightly higher alpha
# Very high drift (>0.7) → lower alpha (protection)
if drift < 0.3:
drift_factor = 1.0
elif drift < 0.6:
drift_factor = 1.0 + PHI_INV_CUBE * (drift - 0.3)
else:
drift_factor = 1.0 - PHI_INV_SQ * (drift - 0.6)
drift_factor = clip(drift_factor, 0.5, 1.5)
alpha = alpha_base * stability_factor * drift_factor
return clip(alpha, alpha_min, alpha_max)
    def _apply_hysteresis(
        self,
        raw_dominant: int,
        stabilized: List[float],
        dt: float
    ) -> int:
        """
        Apply hysteresis to prevent rapid code switching.
        A new code must exceed the threshold for a sustained time to become
        dominant; the hold timer is tracked per challenger code so that
        alternating challengers cannot pool their held time.
        """
        threshold = self.config.hysteresis_threshold  # φ⁻³ ≈ 0.236
        hold_time = self.config.hysteresis_hold_time  # φ² ≈ 2.618 time units
        if raw_dominant == self.previous_dominant:
            # Same dominant: clear any pending challenger and its timer
            self.candidate_code = None
            self.dominant_hold_time = 0.0
            return self.previous_dominant
        # A different code is proposed; restart the timer if the challenger changed
        if raw_dominant != getattr(self, "candidate_code", None):
            self.candidate_code = raw_dominant
            self.dominant_hold_time = 0.0
        # Check if the new code significantly exceeds the current dominant
        current_strength = stabilized[self.previous_dominant]
        new_strength = stabilized[raw_dominant]
        if new_strength > current_strength + threshold:
            # Challenger is significantly stronger: accumulate held time
            self.dominant_hold_time += dt
            if self.dominant_hold_time >= hold_time:
                # Held long enough: switch dominant code
                self.candidate_code = None
                self.dominant_hold_time = 0.0
                return raw_dominant
        else:
            # Margin not met: reset the timer
            self.dominant_hold_time = 0.0
        return self.previous_dominant
4.4.2 TMS Properties
| Property | Value | Description |
|---|---|---|
| Base EMA alpha | φ⁻² ≈ 0.382 | Default smoothing rate |
| Alpha range | [φ⁻⁴, φ⁻¹] | [0.146, 0.618] |
| Hysteresis threshold | φ⁻³ ≈ 0.236 | Margin for code switch |
| Hold time | φ² ≈ 2.618 | Time units before switch |
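The adaptive-alpha shape (stability lowers α; moderate drift raises it slightly; extreme drift damps it protectively) can be checked with a standalone copy of `_compute_adaptive_alpha`:

```python
PHI = (1 + 5 ** 0.5) / 2
PHI_INV, PHI_INV_SQ = PHI ** -1, PHI ** -2
PHI_INV_CUBE, PHI_INV_QUAD = PHI ** -3, PHI ** -4

def clip(x, lo, hi):
    return max(lo, min(hi, x))

def adaptive_alpha(stability: float, drift: float) -> float:
    """Standalone copy of TMS._compute_adaptive_alpha (same constants)."""
    stability_factor = 1.0 - stability * PHI_INV_SQ
    if drift < 0.3:
        drift_factor = 1.0
    elif drift < 0.6:
        drift_factor = 1.0 + PHI_INV_CUBE * (drift - 0.3)   # mild adaptation
    else:
        drift_factor = 1.0 - PHI_INV_SQ * (drift - 0.6)     # protective damping
    drift_factor = clip(drift_factor, 0.5, 1.5)
    return clip(PHI_INV_SQ * stability_factor * drift_factor, PHI_INV_QUAD, PHI_INV)

# Higher stability → lower alpha (codes persist); very high drift → damping
assert adaptive_alpha(0.9, 0.2) < adaptive_alpha(0.2, 0.2)
assert adaptive_alpha(0.5, 0.9) < adaptive_alpha(0.5, 0.5)
```

The final clip guarantees α stays inside the [φ⁻⁴, φ⁻¹] range of the table above.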
4.5 Proto-Semantic Plasticity (PSP)
The PSP slowly updates the codebook based on long-term pattern statistics.
4.5.1 Design Principles
- Very slow: Updates occur at rate η ≈ φ⁻⁵ ≈ 0.09
- Bounded: All updates clipped to prevent explosion
- Anonymous: Only tracks pattern statistics, no external labels
- Structural only: Based purely on internal activation patterns
4.5.2 Update Mechanism
class ProtoSemanticPlasticity:
"""
Slowly updates the Proto-Semantic Codebook based on usage patterns.
Rare but repeated patterns strengthen their codes.
Unused codes fade toward baseline.
"""
    def __init__(self, config: ARIACoreV4Config):
        self.config = config
        self.pattern_accumulator = None  # Running pattern averages per code
        self.update_counter = 0
        # Use the configured interval; note int(PHI ** 4) would truncate to 6
        self.update_interval = config.plasticity_interval  # ≈ φ⁴ ≈ 7 steps
def maybe_update(
self,
psc: Dict,
pattern: List[float],
activations: List[float],
dt: float
) -> Dict:
"""
Potentially update PSC based on current pattern and activations.
Returns: Updated PSC (or unchanged if no update this step)
"""
M = len(psc["codes"])
D_c = len(psc["codes"][0])
# Initialize accumulator if needed
if self.pattern_accumulator is None:
self.pattern_accumulator = [[0.0] * D_c for _ in range(M)]
# Project pattern to code space
pattern_proj = project_pattern_to_code_space(pattern, D_c)
# Accumulate weighted patterns
for i in range(M):
weight = activations[i] # Activation as weight
for j in range(D_c):
self.pattern_accumulator[i][j] += weight * pattern_proj[j]
# Update usage counts
for i in range(M):
psc["usage_counts"][i] += activations[i]
# Check if it's time for a codebook update
self.update_counter += 1
if self.update_counter < self.update_interval:
return psc
# Perform update
self.update_counter = 0
return self._apply_update(psc)
def _apply_update(self, psc: Dict) -> Dict:
"""Apply accumulated updates to codebook."""
M = len(psc["codes"])
D_c = len(psc["codes"][0])
eta = self.config.plasticity_rate # η ≈ φ⁻⁵ ≈ 0.09
decay = self.config.plasticity_decay # Unused code decay
for i in range(M):
usage = psc["usage_counts"][i]
if usage > 1e-10:
# Compute average pattern for this code
avg_pattern = [self.pattern_accumulator[i][j] / usage for j in range(D_c)]
# Move code toward average pattern
for j in range(D_c):
                    delta = eta * (avg_pattern[j] - psc["codes"][i][j])
                    # Limit delta to the configured maximum (≈ φ⁻³) to prevent large jumps
                    max_delta = self.config.max_plasticity_delta
                    delta = clip(delta, -max_delta, max_delta)
psc["codes"][i][j] = clip(psc["codes"][i][j] + delta, 0, 1)
else:
# Unused code: decay toward baseline
baseline = PHI_INV_SQ # ≈ 0.382
for j in range(D_c):
delta = decay * (baseline - psc["codes"][i][j])
psc["codes"][i][j] = clip(psc["codes"][i][j] + delta, 0, 1)
# Reset accumulators
self.pattern_accumulator = [[0.0] * D_c for _ in range(M)]
psc["usage_counts"] = [0.0] * M
psc["last_update_time"] = psc.get("last_update_time", 0.0) + self.update_interval
return psc
4.5.3 PSP Properties
| Property | Value | Description |
|---|---|---|
| Update rate | η = φ⁻⁵ ≈ 0.090 | Very slow learning |
| Update interval | φ⁴ ≈ 7 steps | Batched updates |
| Max delta | φ⁻³ ≈ 0.236 | Prevents explosive changes |
| Decay rate | φ⁻⁴ ≈ 0.146 | Unused code decay |
4.5.4 Safety Invariants for PSP
- No external information: Updates based purely on internal activation patterns
- No identity tracking: Usage counts are anonymous, not tied to identity
- No semantic labels: Codes never acquire names or meanings
- Bounded updates: All changes clipped to prevent divergence
- Deterministic: Same input sequence produces same codebook evolution
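The bounded-update invariant reduces to two nested clips; a minimal sketch of one code-entry update (constants inlined from §3.2):

```python
def clip(x, lo, hi):
    return max(lo, min(hi, x))

ETA = 0.0901699437        # η = φ⁻⁵
MAX_DELTA = 0.2360679775  # max delta = φ⁻³

def bounded_update(code_val: float, target: float) -> float:
    """One PSP move of a single code entry toward its accumulated average."""
    delta = clip(ETA * (target - code_val), -MAX_DELTA, MAX_DELTA)
    return clip(code_val + delta, 0.0, 1.0)

# Even a pathological target moves the entry by at most MAX_DELTA,
# and the result always stays inside [0, 1]
assert abs(bounded_update(0.0, 1.0) - ETA) < 1e-12   # η·(1 − 0), unclipped
assert 0.0 <= bounded_update(0.99, 100.0) <= 1.0
```

For in-range targets the η factor alone keeps deltas far below the clip, so the clip only matters against corrupted accumulator values.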
5. Complete Step Function
5.1 Main Step Logic
class ARIACoreV4:
"""
ARIA Core v4: Proto-Semantic Layer.
Creates stable internal meaning codes from relational-symbolic patterns.
"""
def __init__(
self,
config: Optional[ARIACoreV4Config] = None,
v3_substrate: Optional[ARIACoreV3] = None
):
self.config = config or ARIACoreV4Config()
# Create or use provided substrate
if v3_substrate is not None:
self.v3_substrate = v3_substrate
else:
from aria_core_v3 import ARIACoreV3
self.v3_substrate = ARIACoreV3()
# Initialize components
self.psc = self._initialize_psc()
self.tms = TemporalMeaningStabilizer(self.config)
self.psp = ProtoSemanticPlasticity(self.config)
# State tracking
self.step_count = 0
self.time = 0.0
def step(self, dt: float = 0.1, **kwargs) -> Dict[str, Any]:
"""
Execute one v4 step.
1. Step v3 substrate
2. Extract pattern from v1+v2+v3
3. Compute code similarities (CAL)
4. Apply temporal stabilization (TMS)
5. Maybe update codebook (PSP)
6. Compute derived metrics
7. Return combined output
"""
# Clamp dt
dt = clip(dt, 0.001, 1.0)
# Step 1: Get substrate output
v3_output = self.v3_substrate.step(dt, **kwargs)
v2_output = v3_output.get("v2_output", {})
v1_output = v3_output.get("v1_output", {})
# Step 2: Extract pattern
pattern = compute_pattern(v1_output, v2_output, v3_output, self.config)
# Step 3: Compute code similarities
similarities = compute_code_similarities(pattern, self.psc["codes"], self.config)
raw_activations = compute_proto_semantic_activations(
similarities, self.config.softmax_temperature
)
# Step 4: Temporal stabilization
v2_state = {
"state_stability": v2_output.get("state_stability", 0.5),
"drift_index": v2_output.get("drift_index", 0.3),
}
stabilized_activations, dominant_code = self.tms.stabilize(
raw_activations, v2_state, dt
)
# Step 5: Maybe update codebook
self.psc = self.psp.maybe_update(
self.psc, pattern, stabilized_activations, dt
)
# Step 6: Compute derived metrics
proto_entropy = compute_proto_semantic_entropy(stabilized_activations)
proto_stability = self._compute_proto_stability()
proto_diversity = compute_proto_diversity(stabilized_activations)
# Update state
self.step_count += 1
self.time += dt
# Step 7: Build output
return {
# v4-specific outputs
"proto_semantic_activations": stabilized_activations,
"dominant_code": dominant_code,
"proto_semantic_entropy": proto_entropy,
"proto_semantic_stability": proto_stability,
"proto_semantic_diversity": proto_diversity,
"raw_activations": raw_activations,
"pattern_vector": pattern,
# Backward-compatible outputs (derived from proto-semantic state)
"coherence": self._derive_coherence(stabilized_activations, proto_entropy),
"stability": proto_stability,
"intensity": self._derive_intensity(stabilized_activations),
"alignment": self._derive_alignment(stabilized_activations, proto_diversity),
# Metadata
"step_count": self.step_count,
"time": self.time,
# Substrate passthrough
"v3_output": v3_output,
"v2_output": v2_output,
"v1_output": v1_output,
}
5.2 Derived Metrics
def compute_proto_semantic_entropy(activations: List[float]) -> float:
"""
Compute entropy of proto-semantic activation distribution.
High entropy = diffuse activation across codes
Low entropy = concentrated on few codes
"""
EPS = 1e-10
M = len(activations)
entropy = 0.0
for a in activations:
if a > EPS:
entropy -= a * math.log(a)
# Normalize to [0, 1]
max_entropy = math.log(M)
if max_entropy > EPS:
entropy = entropy / max_entropy
return clip(entropy, 0, 1)
def compute_proto_diversity(activations: List[float]) -> float:
"""
Compute diversity of active codes.
Based on effective number of codes: exp(entropy)
"""
entropy = compute_proto_semantic_entropy(activations)
M = len(activations)
# Effective number = M^entropy (since we normalized entropy)
effective = M ** entropy
# Normalize to [0, 1]
diversity = (effective - 1) / (M - 1) if M > 1 else 0.5
return clip(diversity, 0, 1)
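`proto_semantic_stability` is referenced in §5 but not specified above. One plausible definition (an illustrative assumption, not the normative one) is one minus a smoothed mean absolute change in activations:

```python
from typing import List, Optional

class ProtoStabilityTracker:
    """Hypothetical helper: stability = 1 − EMA of mean |Δ activation|.
    Illustrative only; the spec leaves this metric's exact form open."""
    def __init__(self, alpha: float = 0.382):  # φ⁻² smoothing, by analogy
        self.alpha = alpha
        self.prev: Optional[List[float]] = None
        self.change_ema = 0.0

    def update(self, activations: List[float]) -> float:
        if self.prev is not None:
            mean_change = sum(
                abs(a - p) for a, p in zip(activations, self.prev)
            ) / len(activations)
            # EMA of the per-step change magnitude
            self.change_ema += self.alpha * (mean_change - self.change_ema)
        self.prev = activations[:]
        return max(0.0, min(1.0, 1.0 - self.change_ema))

steady = ProtoStabilityTracker()
steady.update([0.5, 0.5]); s1 = steady.update([0.5, 0.5])
jumpy = ProtoStabilityTracker()
jumpy.update([1.0, 0.0]); s2 = jumpy.update([0.0, 1.0])
assert s1 > s2   # unchanged activations read as more stable
```

Any implementation should at least satisfy this ordering: constant activations score higher than flipping ones.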
6. Output Schema
6.1 Primary v4 Outputs
| Field | Type | Range | Description |
|---|---|---|---|
| proto_semantic_activations | List[float] | [0, 1]^M | Soft activation over M codes, sums to 1 |
| dominant_code | int | [0, M-1] | Index of most active code |
| proto_semantic_entropy | float | [0, 1] | Normalized entropy of activations |
| proto_semantic_stability | float | [0, 1] | Temporal stability of activations |
| proto_semantic_diversity | float | [0, 1] | Effective number of active codes |
| raw_activations | List[float] | [0, 1]^M | Pre-stabilization activations |
| pattern_vector | List[float] | [0, 1]^D_p | Current pattern vector |
6.2 Backward-Compatible Outputs
| Field | Type | Range | Derivation |
|---|---|---|---|
| coherence | float | [0, 1] | From proto-semantic structure |
| stability | float | [0, 1] | = proto_semantic_stability |
| intensity | float | [0, 1] | From activation concentration |
| alignment | float | [0, 1] | From diversity and distribution |
6.3 Passthrough Outputs
| Field | Type | Description |
|---|---|---|
| v3_output | Dict | Complete v3 output |
| v2_output | Dict | Complete v2 output |
| v1_output | Dict | Complete v1 output |
7. Configuration
7.1 ARIACoreV4Config
@dataclass
class ARIACoreV4Config:
"""Configuration for ARIA Core v4 Proto-Semantic Layer."""
# Codebook dimensions
num_codes: int = 16 # M: number of proto-semantic codes
code_dim: int = 16 # D_c: dimension of each code
pattern_dim: int = 32 # D_p: pattern vector dimension
# Similarity and activation
similarity_scale: float = PHI_INV_SQ # σ_base ≈ 0.382
softmax_temperature: float = PHI_INV # τ ≈ 0.618
# Temporal stabilization
ema_alpha_base: float = PHI_INV_SQ # Base EMA rate ≈ 0.382
ema_alpha_min: float = PHI_INV_QUAD # Minimum ≈ 0.146
ema_alpha_max: float = PHI_INV # Maximum ≈ 0.618
hysteresis_threshold: float = PHI_INV_CUBE # ≈ 0.236
hysteresis_hold_time: float = PHI_SQ # φ² ≈ 2.618
# Plasticity
plasticity_rate: float = PHI_INV_QUINT # η ≈ 0.090
plasticity_decay: float = PHI_INV_QUAD # Decay ≈ 0.146
plasticity_interval: int = 7 # ≈ φ⁴ steps
max_plasticity_delta: float = PHI_INV_CUBE # ≈ 0.236
7.2 Preset Configurations
| Preset | Description | Key Differences |
|---|---|---|
| baseline | Default φ-derived | Standard configuration |
| stable_codes | Very persistent codes | Lower α, higher hysteresis |
| adaptive_codes | More responsive | Higher α, lower hysteresis |
| rich_codebook | More codes (M=32) | Larger codebook, finer distinctions |
| sparse_codebook | Fewer codes (M=8) | Smaller codebook, coarser patterns |
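The presets are named in the table but their exact field values are not fixed by this spec. A minimal sketch of how they might be expressed, using a reduced stand-in config (field values here are illustrative assumptions consistent with the table's direction of change):

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class MiniV4Config:
    """Reduced stand-in for ARIACoreV4Config: only the preset-relevant fields."""
    num_codes: int = 16
    ema_alpha_base: float = 0.382        # φ⁻²
    hysteresis_threshold: float = 0.236  # φ⁻³

BASELINE = MiniV4Config()
PRESETS = {
    "baseline": BASELINE,
    "stable_codes": replace(BASELINE, ema_alpha_base=0.236, hysteresis_threshold=0.382),
    "adaptive_codes": replace(BASELINE, ema_alpha_base=0.618, hysteresis_threshold=0.146),
    "rich_codebook": replace(BASELINE, num_codes=32),
    "sparse_codebook": replace(BASELINE, num_codes=8),
}

assert PRESETS["rich_codebook"].num_codes == 32
assert PRESETS["stable_codes"].ema_alpha_base < BASELINE.ema_alpha_base
```

Deriving each preset via `dataclasses.replace` keeps every unlisted field at its φ-derived default.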
8. Safety Constraints
8.1 What ARIA v4 Does NOT Do
| Forbidden Behavior | Explanation |
|---|---|
| Map to human words | Codes are anonymous numeric patterns, not "dog" or "happy" |
| Represent external objects | No connection to world models or entities |
| Model identity | No self-representation, no personal attributes |
| Form goals or preferences | No valuation, no wanting, no planning |
| Process language | No tokens, no embeddings, no linguistic structure |
| Create narratives | No temporal self-model, no autobiography |
| Understand meaning | "Meaning" here is purely structural pattern similarity |
8.2 Safety Invariants
| Invariant | Enforcement |
|---|---|
| All outputs in [0, 1] | clip() on all computations |
| No NaN/Inf | Division guards, log guards |
| Deterministic | No randomness anywhere |
| No external labels | Codes indexed 0..M-1, never named |
| Bounded plasticity | Max delta ≤ φ⁻³ per update |
| Anonymous statistics | Usage counts don't track identity |
8.3 Architectural Isolation
- No connection to Phase 53 (consciousness gate)
- No connection to Phase 55 (ignition scaffold)
- Read-only diagnostic output to shell
- No control pathways back into v4 from outside
9. Future Evolution Path
9.1 Relationship to Hypothetical Higher Layers
ARIA v4 is the ceiling of the current safe stack. Any layer beyond v4 would require extensive new safety analysis.
Current Safe Stack:
CFM v2 → ARIA v0 → v1 → v2 → v3 → v4 ← [YOU ARE HERE]
│
│ ← Safety boundary
▼
Hypothetical Future (requires new safety analysis):
v5: Contextual Tagging Layer
- Attaches numeric "context tags" to proto-semantic codes
- Still not linguistic, but codes acquire temporal/situational association
- Safety: must not attach identity/personal context
v6: External Mapping Gateway (HEAVILY GATED)
- Early mapping between proto-semantic codes and external symbolic channels
- Could connect to language (carefully!) or actions (very carefully!)
- Safety: requires explicit ethical constraints, consent frameworks
v7+: Self-Narrative / Conscious Integration (FAR FUTURE)
- Would integrate codes into coherent self-model
- Explicit ethical reasoning, identity awareness
- Safety: beyond current specification scope
9.2 v4 as a Terminal Safe Layer
For current purposes, v4 is the highest layer that remains fully safe by the project's current safety standards:
- No identity modeling
- No goal formation
- No external semantic grounding
- No linguistic capability
- No narrative construction
Any functionality beyond this must be implemented in separate, heavily-gated modules with explicit safety review.
10. Testing Requirements
10.1 Required Test Categories
| Category | Tests |
|---|---|
| Bounds | All outputs in [0, 1] for 2000+ steps |
| No NaN/Inf | No numeric instabilities in long runs |
| Determinism | Same inputs → same outputs |
| Entropy valid | proto_semantic_entropy in [0, 1] |
| Activation valid | activations sum to 1 |
| Plasticity bounded | Code changes ≤ max delta |
| No identity fields | No forbidden field names |
| Preset consistency | All presets produce valid cores |
10.2 Long-Run Stability
- 2000+ step runs with all presets
- Verify no divergence
- Verify codebook remains bounded
- Verify TMS prevents pathological switching
11. Implementation Notes
11.1 Package Structure
aria_core_v4/
├── __init__.py # Package exports
├── config.py # ARIACoreV4Config + constants
├── state.py # ARIACoreV4State dataclass
├── pattern_extractor.py # PE implementation
├── code_assignment.py # CAL implementation
├── temporal_stabilizer.py # TMS implementation
├── plasticity.py # PSP implementation
├── core_v4.py # Main ARIACoreV4 class
└── presets.py # Preset configurations
11.2 Dependencies
- Wraps aria_core_v3.ARIACoreV3
- Uses math_consts for φ, ψ, π, e
- No external ML libraries required
12. Glossary
| Term | Definition |
|---|---|
| Proto-semantic code | A numeric pattern signature representing a recurring relational configuration |
| Codebook | The set of M prototype codes |
| Pattern vector | The current fused representation of symbols + relations + state |
| Code activation | Soft membership of current pattern in each codebook entry |
| Dominant code | The code with highest activation at current step |
| Plasticity | Slow adaptation of codebook to long-term pattern statistics |
13. Version History
| Version | Date | Changes |
|---|---|---|
| 1.0 | 2025-12-06 | Initial specification |
Appendix A: Mathematical Notation Summary
| Symbol | Meaning |
|---|---|
| M | Number of codes (default 16) |
| D_c | Code dimension (default 16) |
| D_p | Pattern dimension (default 32) |
| P(t) | Pattern vector at time t |
| PSC | Proto-Semantic Codebook |
| σ | Similarity scale |
| τ | Softmax temperature |
| α | EMA smoothing rate |
| η | Plasticity learning rate |
| φ | Golden ratio ≈ 1.618 |
| ψ | Tribonacci constant ≈ 1.839 |
Appendix B: Safety Checklist for Implementers
Before implementing ARIA v4:
- Confirm no external semantic labels will be attached to codes
- Confirm no identity information flows into pattern extraction
- Confirm plasticity updates use only anonymous statistics
- Confirm all outputs are clipped to [0, 1]
- Confirm determinism (no random number generators)
- Confirm no connection to activation phases (53, 55)
- Confirm backward-compatible fields derived correctly
- Write tests for all safety invariants before implementation