AIScientists-Dev/WorldSeed

A world engine where your LLM agents backstab over tea

WorldSeed runs multi-agent simulations from YAML configs: the same engine drives research labs, layoff dramas, or teahouse espionage.

874 stars · Python · Agents · Domain Apps
Velocity (7d): +16 ★ / day · Trend: steady

What it does

WorldSeed is a tick-based simulation engine: you declare a world in YAML (entities, rules, perception filters, actions), and on each tick agents perceive, act, and the consequences cascade. The engine resolves deterministic outcomes itself; uncertain ones go to an LLM “Dungeon Master” that returns structured effects, not prose. You can watch from above, whisper to agents, or step into a character.
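The tick cycle described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not WorldSeed's actual API: `Effect`, `World`, `deterministic_rules`, `stub_dm`, and `run_tick` are hypothetical names, and the stub stands in for the LLM call.

```python
from dataclasses import dataclass, field

# All names below are hypothetical; this is not WorldSeed's API.

@dataclass
class Effect:
    """A structured state change: set entity.attr to value."""
    entity: str
    attr: str
    value: object

@dataclass
class World:
    state: dict = field(default_factory=dict)  # entity -> {attr: value}
    tick: int = 0

    def apply(self, effect: Effect) -> None:
        self.state.setdefault(effect.entity, {})[effect.attr] = effect.value

def deterministic_rules(world, agent, action):
    """Return effects for actions the rules can settle, or None if uncertain."""
    if action["type"] == "move":
        return [Effect(agent, "room", action["to"])]
    return None  # not covered by a rule: hand off to the judge

def stub_dm(world, agent, action):
    """Stand-in for the LLM judge: returns structured effects, not prose."""
    return [Effect(action["target"], "suspicion", "raised")]

def run_tick(world, policies, dm=stub_dm):
    # Phase 1: each agent chooses an action from its own attributes
    # (a real engine would pass a filtered view of the world instead).
    actions = {name: policy(world.state.get(name, {}))
               for name, policy in policies.items()}
    # Phase 2: resolve each action, rules first, judge as fallback.
    for name, action in actions.items():
        effects = deterministic_rules(world, name, action)
        if effects is None:
            effects = dm(world, name, action)
        for effect in effects:
            world.apply(effect)
    world.tick += 1

world = World(state={"alice": {"room": "hall"}, "bob": {"room": "hall"}})
policies = {
    "alice": lambda view: {"type": "move", "to": "teahouse"},
    "bob": lambda view: {"type": "accuse", "target": "alice"},
}
run_tick(world, policies)
```

Keeping the judge's output as `Effect` objects rather than free text means both resolution paths feed the same `apply` step, which is the property the description emphasizes.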

The interesting bit

The asymmetric information layer is the real hook. Perception rules mean three agents in the same room can hold three different pictures of reality, with no hardcoded domain logic, just YAML filters. The “AI Layoffs” demo turns this into social horror: departing employees must “distill” their expertise into an AI Skill, choosing whether to leave a backdoor. The engine doesn’t know what a layoff is; it just runs the rules.
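To make the perception idea concrete, here is a rough Python sketch of attribute-level filtering. The rule shape (`sees_attrs`), the entity fields, and the `perceive` function are assumptions made for illustration; the project's real YAML schema may look quite different.

```python
# Hypothetical sketch; field names and rule shape are assumptions.

# Shared ground truth: every entity and all of its attributes.
WORLD = {
    "mei":   {"location": "teahouse", "mood": "calm",  "allegiance": "rival firm"},
    "lao":   {"location": "teahouse", "mood": "tense", "allegiance": "house"},
    "guest": {"location": "teahouse", "mood": "bored", "allegiance": "none"},
    "cook":  {"location": "kitchen",  "mood": "busy",  "allegiance": "house"},
}

# Per-observer rules, as they might look after parsing a config file.
PERCEPTION = {
    "mei":   {"sees_attrs": {"location", "mood", "allegiance"}},
    "lao":   {"sees_attrs": {"location", "mood"}},
    "guest": {"sees_attrs": {"location"}},
}

def perceive(world, rules, observer):
    """Build the observer's view: same-room entities, permitted attributes only."""
    here = world[observer]["location"]
    allowed = rules[observer]["sees_attrs"]
    view = {}
    for name, attrs in world.items():
        if attrs["location"] != here:
            continue  # entities in other rooms are invisible
        if name == observer:
            view[name] = dict(attrs)  # an agent knows everything about itself
        else:
            view[name] = {k: v for k, v in attrs.items() if k in allowed}
    return view

views = {name: perceive(WORLD, PERCEPTION, name) for name in PERCEPTION}
```

In this toy setup all three teahouse occupants share a room, yet only `mei` learns anyone's allegiance and `guest` sees nothing beyond who is present.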

Key highlights

  • Scene-agnostic: same engine runs autoresearch, tool labs, espionage, or your own YAML world
  • Deterministic DSL + LLM DM hybrid: rules resolve fast, edge cases get judged by any LiteLLM-supported model
  • Full audit trail: every tick logged, runs replayable; autoresearch demo produced 72 peer-reviewed papers with verifiable commits and citations in one 11-hour run
  • Three interaction modes: observe, intervene (private whispers), or inhabit an agent
  • Auto-generated UI: room cards, character portraits, narrator voices from config
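
The audit-trail highlight (every tick logged, runs replayable) can be illustrated with an append-only log of structured effects. This is only a sketch of the general technique: the record layout and the `log_tick` and `replay` helpers are hypothetical, not the project's actual log format.

```python
import json

# Hypothetical log format: one JSON record per tick, holding
# [entity, attr, value] triples. Not WorldSeed's real schema.

def log_tick(log, tick, effects):
    """Append one tick's effects to the log as a JSON line."""
    log.append(json.dumps({"tick": tick, "effects": effects}, sort_keys=True))

def replay(log, upto=None):
    """Rebuild world state by re-applying logged effects in order.

    If `upto` is given, stop after that tick, which allows inspecting
    the world as it stood at any earlier point in the run.
    """
    state = {}
    for line in log:
        record = json.loads(line)
        if upto is not None and record["tick"] > upto:
            break
        for entity, attr, value in record["effects"]:
            state.setdefault(entity, {})[attr] = value
    return state

log = []
log_tick(log, 1, [["alice", "room", "teahouse"]])
log_tick(log, 2, [["alice", "mood", "wary"], ["bob", "room", "cellar"]])

final_state = replay(log)
state_at_tick_1 = replay(log, upto=1)
```

Because the log stores effects rather than prose, replaying it needs no LLM calls, so a run that involved model judgments can still be reconstructed step by step.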

Caveats

  • Requires Python 3.11+, Node 18+, and uv; setup is multi-step, not pip-install-and-go
  • The project’s “24.7% val_loss reduction” claim comes from a single run, with no comparison baseline or reproducibility details
  • Role drift and other “emergent” behaviors are described anecdotally; no systematic evaluation shown

Verdict

Grab this if you’re building social simulations, automated research cohorts, or narrative games where information asymmetry matters. Skip it if you need a polished no-code tool: this is YAML-first, engine-second, and still very much for developers willing to hand-craft perception filters.
