Transferable Foundations for Autonomous Systems
The sim is just the vehicle. These are the roads.
Every pattern below is domain-independent. We demonstrate them with a Mars colony sim because it makes the stakes visceral, but each one transfers directly to IoT fleets, trading systems, robotics, infrastructure monitoring, and autonomous AI agents.
This is not a game manual. It's an engineering reference. Each pattern includes: the problem it solves, the shape of the solution, the code that implements it, and the domains where it applies.
A system ticks forward. Each tick computes the new state, but the meaning of the change (what got better, what got worse, what flipped direction) is lost. The next tick has no memory of trajectory, only position.
After each tick, compute a delta object (the echo) that captures what changed, what events occurred, and what the current trajectory looks like. Append it to a bounded history. The next tick reads this echo to inform its own decisions.
// After each sol tick, build the echo frame
const echo = {
frame: s, // which tick
utc: new Date().toISOString(), // wall clock
delta: { // what changed
o2: post.o2 - pre.o2,
h2o: post.h2o - pre.h2o,
food: post.food - pre.food,
power: post.power - pre.power
},
events: newEv.map(e => ({ // what happened
type: e.type, severity: e.severity, desc: e.desc
})),
inertia: { ...echoInertia }, // derivative (Pattern 02)
reflexes_fired: reflexesSinceLastFrame, // what the nervous system did
cri: colonyRiskIndex, // compound health metric
alive: state.alive, cause: state.cause
};
echoHistory.push(echo);
if (echoHistory.length > 500) echoHistory.shift(); // bounded
The echo is NOT a log entry. It's structured, queryable data that downstream systems consume programmatically. Reflexes read it. Tasks emerge from it. The UI renders it. It's the nervous system's nerve impulse.
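To make "queryable" concrete, here is a minimal sketch of a downstream consumer scanning the history. The `sustainedDecline` helper and the three-frame window are illustrative, not part of the engine; only the echo shape is taken from the frame above.

```javascript
// Sketch: scan the bounded echo history for a sustained decline in one resource.
// Returns true only when every delta in the trailing window is negative.
function sustainedDecline(history, key, window = 3) {
  if (history.length < window) return false;
  return history.slice(-window).every(e => e.delta[key] < 0);
}

// Toy frames shaped like the echo above
const toy = [
  { delta: { o2: -0.4 } },
  { delta: { o2: -0.6 } },
  { delta: { o2: -0.9 } }
];
console.log(sustainedDecline(toy, 'o2')); // true: three consecutive negative O2 deltas
```

A reflex or task trigger can call this directly because the echo is data, not text.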
O₂ is at 80 kg. Is that fine? If it was 85 yesterday and 90 the day before, you're in a steady 5 kg/day decline. If it was 75 yesterday, you're recovering. Position alone is insufficient. You need velocity and acceleration.
Compare the current delta to the previous delta. The delta is the first derivative of the resource; the difference between consecutive deltas is the second derivative. Track trend direction and system flips (producing → consuming).
function computeInertia() {
if (echoHistory.length < 2) return;
const curr = echoHistory[echoHistory.length - 1];
const prev = echoHistory[echoHistory.length - 2];
// Velocity of each delta = acceleration of the underlying resource
echoInertia.o2_velocity = curr.delta.o2 - prev.delta.o2;
echoInertia.h2o_velocity = curr.delta.h2o - prev.delta.h2o;
echoInertia.food_velocity = curr.delta.food - prev.delta.food;
echoInertia.power_velocity = curr.delta.power - prev.delta.power;
// Trend classification
const currTotal = Math.abs(curr.delta.o2) + Math.abs(curr.delta.h2o);
const prevTotal = Math.abs(prev.delta.o2) + Math.abs(prev.delta.h2o);
echoInertia.engagement_trend =
currTotal > prevTotal + 0.5 ? 'accelerating' :
currTotal < prevTotal - 0.5 ? 'decelerating' : 'steady';
// System flips: production → consumption (the dangerous ones)
if (Math.sign(curr.delta.o2) !== Math.sign(prev.delta.o2)) {
echoInertia.discourse_flips.push({
system: 'o2',
from: prev.delta.o2 > 0 ? 'producing' : 'consuming',
to: curr.delta.o2 > 0 ? 'producing' : 'consuming'
});
}
}
discourse_flips are the most dangerous signal: a system that was net-producing just became net-consuming. That's not a threshold alarm; it's a trajectory reversal. Threshold-based monitoring misses it entirely, because the absolute level can still look healthy.
The main control loop runs every N seconds. But crises develop between ticks. By the time the next tick sees the problem, it may be too late. You need a faster response that doesn't wait for the full reasoning cycle.
After computing inertia, evaluate a set of reflex conditions. If triggered, the reflex immediately modifies state (the muscle) and logs itself. The next tick sees the reflex in its echo and can adjust.
function computeReflexArcs() {
activeReflexes = [];
const s = state;
// REFLEX: O₂ crash trajectory
if (echoInertia.o2_velocity < -0.3 && s.o2 / (O2_PP * n) < 15) {
const severity = Math.min(1, Math.abs(echoInertia.o2_velocity));
const boost = 0.05 * severity;
activeReflexes.push({
id: 'o2_trajectory_warning',
condition: `O₂ velocity ${echoInertia.o2_velocity.toFixed(2)} kg/sol²`,
action: 'o2_reflex',
intensity: severity,
stateEffect: () => {
// MUSCLE: nudge ISRU allocation (a real state change)
s.alloc.i = Math.min(0.8, s.alloc.i + boost);
s.alloc.g = Math.max(0.1, s.alloc.g - boost / 2);
}
});
}
// Apply all state effects immediately
activeReflexes.forEach(r => {
if (r.stateEffect) r.stateEffect();
logReflexFire(r);
});
}
The reflex doesn't ask permission. It acts. Then it logs what it did, and the next tick can review and override. This is the difference between a monitoring system (alerts, then waits for a human) and a nervous system (acts, logs, learns).
The logged reflexes appear in the next echo as reflexes_fired[], the organism's memory of its own involuntary reactions.
| Layer | Clock | Reads | Writes | Analogy |
|---|---|---|---|---|
| Cortex | Slow (per tick) | Full state, history | New state, echo | CEO decision |
| Brainstem | 1:1 with cortex | Pre/post state | Echo frame | Status report |
| Spinal Cord | 1:1 with cortex | Echo, inertia | State mutations | Auto-pilot |
| Patrol | ~20Hz (fast) | Active reflexes | Visual effects, alerts | Security camera |
The layers are decoupled by clock speed. The cortex can take 500ms to think. Patrol runs at 20Hz regardless. This means the system is visually responsive even when the decision engine is busy. In production: your monitoring dashboard updates at 60fps even when your batch job runs every 5 minutes.
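One way to sketch that decoupling is a snapshot handoff: the slow layer publishes an immutable frame whenever it finishes; the fast layer reads whatever is latest and never waits. The `bus`, `cortexTick`, and `patrolTick` names are illustrative, not the sim's actual API.

```javascript
// Shared handoff point between the slow and fast layers
const bus = { snapshot: null };

function cortexTick(state) {
  // Slow path: heavy reasoning happens here; the result is one frozen frame
  bus.snapshot = Object.freeze({ cri: state.cri, frame: state.sol });
}

function patrolTick() {
  // Fast path: read-only, tolerates a stale or missing snapshot
  return bus.snapshot ? `CRI ${bus.snapshot.cri}` : 'warming up';
}
```

The fast loop can run at any frequency because it only ever reads; nothing it does can block or corrupt the slow loop.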
State lives in memory. If the browser tab closes, the organism dies. Users can't back up, transfer, compare, or replay their runs. There's no portable unit of "a simulation in progress."
Define a cartridge format: a single JSON file that captures configuration, full state, echo history, decision log, reflex memory, scoring, and metadata. Export downloads it. Import restores it. Auto-save persists it. The format is versioned and self-describing.
// The cartridge schema
{
_format: 'mars-barn-cartridge', // self-identifying
version: 1, // schema version
id: 'MBC-LZ4K2-A1B2', // unique run ID
created: '2025-07-09T14:30:00Z', // timestamp
mission: 'ares', // which template
config: { // what the player chose
arch: 'engineer',
lispy: 'adaptive_governor',
lispyCode: '(begin ...)',
simSpeed: 2,
},
state: { ... }, // full colony state (deep clone)
echoHistory: [ ... ], // last 100 echo frames
echoInertia: { ... }, // current derivatives
taskHistory: [ ... ], // every decision made
reflexHistory: [ ... ], // reflex fire log
supplyChain: { ... }, // logistics state
score: { // computed at save time
total: 12500,
grade: 'B',
breakdown: [
['Survival (sols × 100)', 8000],
['Crew alive bonus', 2000],
...
]
}
}
The cartridge is not a save file. It's a portable organism. It contains everything needed to resume the simulation on any machine, in any browser, at any time. Like a Datasette cartridge: the data IS the application state. No server required.
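A round-trip sketch of the idea, assuming the schema above; the `exportCartridge`/`importCartridge` names and the validation rules are illustrative:

```javascript
// Export: the whole organism becomes one portable string
function exportCartridge(cartridge) {
  return JSON.stringify(cartridge);
}

// Import: validate the self-describing header before trusting anything
function importCartridge(json) {
  const c = JSON.parse(json);
  if (c._format !== 'mars-barn-cartridge') throw new Error('not a cartridge');
  if (c.version !== 1) throw new Error(`unsupported schema version ${c.version}`);
  return c; // caller rehydrates state, echoHistory, taskHistory, etc.
}
```

Because the format check runs before any state is touched, a corrupted or foreign file fails loudly instead of half-loading.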
Multiple agents/players/systems need to operate in the same environment but run independently. You need a shared source of truth for environmental conditions without requiring real-time coordination or a central server.
Publish frames as immutable JSON files to a git repo (or any static host). Each frame has a hash. A manifest indexes all frames. A latest.json pointer tells clients what's newest. Clients fetch, consume, and react independently.
// Manifest: index of all available frames
{ "version": 1, "total_frames": 100, "first_sol": 1, "last_sol": 100,
"frames": [
{ "sol": 1, "hash": "8603407d19123eb0", "size": 692 },
{ "sol": 2, "hash": "9f028d4feae6df1c", "size": 715 },
...
]
}
// Individual frame: one sol of environmental data
{ "sol": 47, "utc": "2025-08-25T14:30:00Z",
"mars": { "temp_k": 218, "dust_tau": 0.15, "solar_wm2": 480, ... },
"events": [{ "type": "dust_devil", "severity": 0.3 }],
"hazards": [{ "type": "micrometeorite", "probability": 0.02 }],
"challenge": { "type": "solar_tracking_fault", "params": { ... } },
"_hash": "a1b2c3d4e5f6g7h8" // SHA-256 for verification
}
// Client consumption
async function loadPublicFrames() {
const manifest = await fetch(FRAME_BASE + '/manifest.json').then(r => r.json());
latestPublicSol = manifest.last_sol;
frameMode = 'public';
}
function applyPublicFrame(sol) {
const frame = publicFrames[sol];
if (!frame) return false;
marsWeather = { ...marsWeather, ...frame.mars }; // override environment
frame.events.forEach(ev => state.events.push(ev)); // inject events
return true;
}
Git IS the database. GitHub Pages IS the CDN. No backend, no API, no server. The frame files are static, immutable, and globally cacheable. The hash makes tampering detectable. This scales to millions of clients with zero infrastructure cost.
How do you compare two autonomous systems fairly? If they face different environments, you can't tell if System A is better or just luckier. You need controlled conditions with open competition.
All competitors consume the same public frame sequence. Each configures their own system differently (crew, algorithm, parameters). The score is computed from the same formula. When a competitor catches up to the latest frame, their system breathes on inertia until the next frame arrives.
When caught up to the latest frame, the organism doesn't stop. The echo inertia and reflex arcs (Patterns 02 and 03) keep it alive. The nervous system breathes between heartbeats. When the next frame arrives, it's a fresh environmental signal that the system reacts to. This creates a real-time "will they survive?" tension perfect for livestreaming.
The CRI is computed by the LisPy VM (the same policy engine that makes allocation decisions) from N state variables. It provides a unified health signal that other systems reference.
// CRI computation (LisPy program)
(begin
(define resource_stress (+ (* (- 1 (/ o2_days 30)) 25)
(* (- 1 (/ h2o_days 30)) 25)
(* (- 1 (/ food_days 30)) 25)))
(define power_stress (* (- 1 (/ power 500)) 15))
(define crew_stress (* (- 1 (/ crew_health 100)) 10))
(set! colony_risk_index (min 100 (max 0
(+ resource_stress power_stress crew_stress))))
)
// Risk-weighted probability
function riskRoll(baseProb) {
const multiplier = 1 + (colonyRiskIndex / 50);
return Math.random() < (baseProb * multiplier);
// At CRI 0: multiplier = 1.0× (nominal)
// At CRI 50: multiplier = 2.0× (double failure rate)
// At CRI 100: multiplier = 3.0× (triple: cascading failure)
}
Stressed systems fail more. This isn't a bug; it's physics. A colony running on fumes has less margin for everything. The CRI creates a positive feedback loop (bad → worse → catastrophic) that the reflex arcs (Pattern 03) must counteract. The competition between CRI-driven cascade and reflex-driven recovery IS the game.
Each task template has a trigger function that reads the echo frame. The trigger returns true when conditions justify the task, then riskRoll() applies CRI-weighted probability.
// Task template with echo-driven trigger
{
trigger: (state, echo) => {
// Only fire during dust conditions with declining solar
const dusty = marsWeather.dustTau > 0.4;
const solarDeclining = echo.inertia?.power_velocity < -5;
return dusty && solarDeclining && riskRoll(0.15);
},
gen: (state) => ({
id: 'solar_tracking_fault',
title: 'Solar Array Misalignment',
body: 'Dust accumulation causing tracking error...',
data: `Misalignment: ${misalign}° | Power loss: ~${powerLoss}%`,
approve: { label: 'Manual recalibrate', effect: () => { ... } },
deny: { label: 'Ignore', effect: () => { ... } },
alt: { label: 'Deploy cleaning robot', effect: () => { ... } }
})
}
The trigger reads echo.inertia.power_velocity β a value that only exists because Pattern 02 computed it. Tasks emerge from the interaction of patterns, not from any single system. This makes them feel real because they ARE causally linked to what's happening.
A minimal Lisp interpreter (S-expressions) runs inside the main system. It reads system state through environment variables. It writes allocation decisions back. The policy program can be swapped at runtime β no restart, no recompile.
// LisPy program that governs colony resource allocation
(begin
(define o2_urgent (< o2_days 5))
(define h2o_urgent (< h2o_days 5))
(define power_critical (< power 100))
(if o2_urgent
(begin
(set! isru_alloc 0.80)
(set! greenhouse_alloc 0.10)
(set! heating_alloc 0.10))
(if power_critical
(begin
(set! isru_alloc 0.30)
(set! greenhouse_alloc 0.20)
(set! heating_alloc 0.50))
(begin
(set! isru_alloc 0.40)
(set! greenhouse_alloc 0.35)
(set! heating_alloc 0.25))))
)
The policy is data you can diff, version, and A/B test. Swapping from a survivalist governor to a balanced one is changing a text string, not rewriting code. Users can write their own governors without touching the engine. This is the separation of mechanism (the engine) from policy (the governor).
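To show how small the mechanism can be, here is a toy S-expression evaluator just large enough to run governor-style programs like the one above. This is a hedged sketch, not the actual LisPy VM: it supports only begin/define/set!/if and a few operators, and every name is illustrative.

```javascript
// Toy tokenizer: pad parens, split on whitespace
function tokenize(src) {
  return src.replace(/\(/g, ' ( ').replace(/\)/g, ' ) ').trim().split(/\s+/);
}

// Toy reader: tokens -> nested arrays of numbers and symbol strings
function parse(tokens) {
  const t = tokens.shift();
  if (t === '(') {
    const list = [];
    while (tokens[0] !== ')') list.push(parse(tokens));
    tokens.shift(); // drop ')'
    return list;
  }
  const n = Number(t);
  return Number.isNaN(n) ? t : n;
}

// Toy evaluator: the environment object IS the shared state
function evaluate(x, env) {
  if (typeof x === 'number') return x;
  if (typeof x === 'string') return env[x]; // symbols read system state
  const [op, ...args] = x;
  if (op === 'begin') return args.map(a => evaluate(a, env)).pop();
  if (op === 'define' || op === 'set!') return (env[args[0]] = evaluate(args[1], env));
  if (op === 'if') return evaluate(args[0], env) ? evaluate(args[1], env) : evaluate(args[2], env);
  const fns = { '+': (a, b) => a + b, '-': (a, b) => a - b, '*': (a, b) => a * b,
                '/': (a, b) => a / b, '<': (a, b) => a < b, '>': (a, b) => a > b,
                'min': Math.min, 'max': Math.max };
  return fns[op](...args.map(a => evaluate(a, env)));
}

// The policy is just a string: swap it at runtime, no recompile
const env = { o2_days: 3, isru_alloc: 0.4 };
evaluate(parse(tokenize('(if (< o2_days 5) (set! isru_alloc 0.8) (set! isru_alloc 0.4))')), env);
console.log(env.isru_alloc); // 0.8: the governor shifted allocation
```

Swapping governors means handing a different string to `evaluate`; the engine (the interpreter) never changes.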
function riskRoll(baseProb) {
const multiplier = 1 + (colonyRiskIndex / 50);
return Math.random() < (baseProb * multiplier);
}
// CRI 0   → multiplier 1.0× → nominal failure rates
// CRI 25  → multiplier 1.5× → 50% more failures
// CRI 50  → multiplier 2.0× → double failure rate
// CRI 75  → multiplier 2.5× → cascade territory
// CRI 100 → multiplier 3.0× → everything breaks
This creates cascading failure, the most realistic and dangerous property of complex systems. One failure raises CRI, which raises the probability of the next failure, which raises CRI further. The only escape is proactive management (reflexes, allocation shifts, emergency protocols) that breaks the cycle.
// Auto-save every AUTOSAVE_INTERVAL sols
function autoSaveCheck() {
if (state.sol > 0 && state.sol % AUTOSAVE_INTERVAL === 0 && state.alive) {
const cartridge = serializeCartridge();
// Trim for storage efficiency
cartridge.echoHistory = cartridge.echoHistory.slice(-30);
localStorage.setItem(AUTOSAVE_KEY, JSON.stringify(cartridge));
}
}
// Recovery on page load
function checkAutoSave() {
const saved = localStorage.getItem(AUTOSAVE_KEY);
if (!saved) return;
const cartridge = JSON.parse(saved);
if (!cartridge.state.alive) { localStorage.removeItem(AUTOSAVE_KEY); return; }
// Show recovery prompt: "Resume from Sol X?"
showRecoveryUI(cartridge);
}
Combined with Pattern 05 (Sim Cartridge), this gives you three tiers of persistence: (1) auto-save to localStorage (automatic, bounded), (2) manual export to .json file (explicit, portable), (3) upload to SimHub leaderboard (public, competitive).
You have a simulation and a physical system. They need to interoperate. Traditionally you build a "bridge": an adapter layer that translates between virtual and physical APIs. This creates a bottleneck, a translation layer, and a fundamental asymmetry: one side is "real" and the other is "simulated."
Eliminate the bridge by making both sides speak the same protocol:
// A physical Rappter connects to the sim:
channel.postMessage({cmd: 'register_wallet',
payload: {address: 'mars1_physical_rappter', owner: 'Wildhaven (Physical)'}});
// Pushes a LisPy program that blends real sensor data + sim state:
channel.postMessage({cmd: 'push_lispy', payload: {code: `(begin
(define real_temp (sensor "temp_c")) ;; physical hardware
(define sim_temp interior_temp_k) ;; virtual colony
(if (< (min real_temp sim_temp) 268)
(set! heating_alloc 0.50)
(set! heating_alloc 0.20)))`}});
// Tips a virtual agent from the physical world:
channel.postMessage({cmd: 'tip_agent',
payload: {agent: 'OPT-01', amount: 500, memo: 'from the real world'}});
// Same messages. Same chain. Same VM. The atoms don't matter.
The portal isn't a bridge you build. It's a wall you stop building. When both sides speak the same language, the distinction between virtual and physical is just metadata on the wallet address. The chain doesn't care if the sender has atoms or not.
This creates a fifth nervous-system layer, the Marketplace, that spans both worlds. The treasury governor (LisPy) distributes tokens to virtual agents AND physical machines through the same protocol. Governance doesn't stop at the sim boundary.
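The receiving side of the protocol can be sketched as a single dispatch function that serves virtual and physical senders alike. The command names mirror the messages above; the state containers (`wallets`, `ledger`, `governorCode`) are illustrative assumptions, and in the browser this handler would be wired to the channel's onmessage.

```javascript
// Illustrative state containers on the sim side
const wallets = {};
const ledger = [];
let governorCode = null;

// One handler, no bridge: whether the sender has atoms is invisible here
function handleMessage({ cmd, payload }) {
  if (cmd === 'register_wallet') wallets[payload.address] = { owner: payload.owner };
  else if (cmd === 'push_lispy') governorCode = payload.code; // hot-swap the policy
  else if (cmd === 'tip_agent') ledger.push(payload);         // same chain, any sender
}

// In the browser: channel.onmessage = e => handleMessage(e.data);
```

Nothing in the handler branches on virtual versus physical; that distinction simply doesn't exist at the protocol level.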
AI agents master fixed environments and stop learning. Benchmarks saturate. Training loops plateau. The agent is "done" but the real world keeps changing.
The environment grows in fidelity through additive versioning. Each version adds hazard types from real-world data without removing or contradicting existing frames. Every strategy is scored across 100 Monte Carlo runs through ALL versions sequentially. Damage accumulates. The gauntlet has no ceiling.
v1 (Foundation): dust storms, equipment failure
→ Best strategy: immortal (100% survival)
v2 (Robot Killers): + perchlorate, abrasion, radiation, battery
→ Same strategy: 0% survival at Sol 501 (dies Sol 315)
→ Agent must write new tools for 6 hazard types
v3 (future): + crew psychology, communication delays
→ Robot-only exploit eliminated. Humans required.
→ Agent must handle empathy as a variable.
v∞: reality keeps producing data → the snowball keeps rolling
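The additive-versioning invariant can be sketched as a registry where each version only appends hazards and scoring always runs against the union. The registry contents come from the version list above; the function name is illustrative.

```javascript
// Each version strictly appends; nothing is ever removed or contradicted
const versions = [
  { v: 1, hazards: ['dust_storm', 'equipment_failure'] },
  { v: 2, hazards: ['perchlorate', 'abrasion', 'radiation', 'battery'] }
];

// The gauntlet at version N is the union of all hazards up to N
function gauntletHazards(upToVersion) {
  return versions.filter(x => x.v <= upToVersion).flatMap(x => x.hazards);
}

console.log(gauntletHazards(2).length); // 6: the hazard types a v1-tuned strategy never saw
```

Because old frames stay valid, a strategy's score history remains comparable across versions; only the ceiling moves.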
The sim isn't getting harder. It's getting more real. Hard has a ceiling. Real doesn't. The agent's training environment converges on the actual world. At the limit, the sim and reality are the same thing β and the agent can handle both.
The sim is just the vehicle. The patterns are the roads.
Build your own roads. Drive your own vehicle.
github.com/kody-w/mars-barn-opus