Orchestrating Multi-Clawbot Simulations at Scale
Thinking through how to spin up 25 independently-configured OpenClaw instances for research, run simulations, and tear everything down cleanly.
Why This Matters
Artificial Societies showed that simulating human behavior at scale is not just a research curiosity; it’s a product. Their approach, modelling how groups of people interact, react, and influence each other using AI agents grounded in behavioral science, maps directly to what we explored with rettiwt.xyz. Rettiwt was our earlier attempt at this: a sandbox where AI personas with unique traits, expertise levels, and social influence metrics generate tweets, form alliances, engage in debates, and demonstrate emergent sentiment-driven behavior. A synthetic real-time focus group.
The jump from rettiwt to clawbots is the jump from application-level simulation to infrastructure-level orchestration. Rettiwt ran everything inside one application, one database, one process. That works until you need each agent to be a truly independent system with its own browser context, its own MCP toolchain, its own state. Once agents need to act on the web rather than just simulate conversation, you need isolated runtimes.
The other motivator: shared secrets without shared config. Each clawbot needs access to the same API keys, platform tokens, and MCP server configurations, but each bot has its own unique persona parameters, seed data, and behavioral directives. Setting up MCP config once at the infrastructure level and having all 25 bots inherit those access credentials within their namespace, while keeping their individual configs separate, is a problem that maps cleanly to Kubernetes Secrets + per-release Helm values. One Secret for the shared tokens, one ConfigMap per bot for the unique stuff.
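To make the shared layer concrete, here is a minimal sketch of what that one Secret could look like. The key names, values, and namespace are illustrative assumptions, not OpenClaw's actual config surface:

```yaml
# Shared layer: one Secret per experiment namespace. All 25 bot pods
# mount this, so rotating a token is a single update.
# Key names and values are placeholders.
apiVersion: v1
kind: Secret
metadata:
  name: clawbot-shared
  namespace: exp-2026-02-07-a1b2c3
type: Opaque
stringData:
  MCP_SERVER_URL: "https://mcp.internal.example"   # assumed MCP endpoint key
  PLATFORM_API_TOKEN: "replace-me"                 # shared platform token
  MODEL_API_KEY: "replace-me"                      # shared model/API key
```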
The Scenario
I want to run 25 OpenClaw (moltbot) instances simultaneously, each seeded with its own configuration parameters, observe the simulation, maybe tweak one or two mid-run, and then tear down the entire thing when the research is done. Ephemeral by design. What’s the right tool for this?
The Candidates
Ansible: good at pushing config to machines, but it’s a configuration tool, not an orchestration tool. Teardown means writing a separate destroy playbook and hoping state hasn’t drifted. Wrong abstraction for ephemeral workloads.
Full Kubernetes: the operational overhead of maintaining a cluster for something that spins up, runs, and gets deleted. More time fighting the cluster than running simulations.
Terraform alone: perfect lifecycle (apply / destroy), but it operates at the infrastructure layer. Seeding each bot with unique application config still needs something on top.
k3s + Helmfile: this is the one I keep coming back to. Lightweight Kubernetes without the ceremony, plus Helmfile for “deploy 25 things with different configs” as a declarative file instead of a bash loop.
The Shared Config Problem
This is the constraint that actually shapes the architecture. Every clawbot needs the same MCP server configurations, API keys, and platform tokens. But every bot has different persona parameters, seed data, and behavioral directives. I don’t want to copy-paste credentials into 25 config files. I want to set up MCP access once and have every bot inherit it.
This maps to a two-layer config model:
Shared layer: one Kubernetes Secret per namespace holding the MCP server endpoints, API keys, and platform tokens. Every bot pod mounts it. Change a token once, and every bot picks it up.
Individual layer: one ConfigMap per bot, generated from its Helm values. Bot ID, seed, model parameters, persona traits, behavioral directives. This is what makes each clawbot unique.
The separation matters because mid-simulation I might need to rotate an API key (shared layer, one update) or tweak one bot’s parameters (individual layer, one helmfile apply). Neither change should touch the other 24.
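Concretely, each bot pod could pull both layers in with envFrom. A sketch, assuming env-var based config; the image name, ConfigMap keys, and bot numbering are hypothetical:

```yaml
# Individual layer: one ConfigMap per bot, rendered from its Helm values.
apiVersion: v1
kind: ConfigMap
metadata:
  name: clawbot-07-config
data:
  BOT_ID: "07"
  SEED: "1337"                       # assumed seed/persona keys
  PERSONA: "contrarian-economist"
---
# The bot container layers shared + individual config at startup.
apiVersion: v1
kind: Pod
metadata:
  name: clawbot-07
spec:
  containers:
    - name: clawbot
      image: openclaw/moltbot:latest   # hypothetical image name
      envFrom:
        - secretRef:
            name: clawbot-shared       # shared layer (same for all 25)
        - configMapRef:
            name: clawbot-07-config    # individual layer (unique per bot)
```

Rotating a key touches only the Secret; tweaking bot 07 touches only its ConfigMap. Neither change disturbs the other 24.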
A Rough Sketch
I haven’t built this yet, but roughly it might look like:
Terraform (optional) provisions the server and bootstraps k3s via cloud-init. If I’m using a long-lived 40GB box, I skip this and just run k3s directly.
One namespace per experiment: named with an experiment ID or git SHA (exp-2026-02-07-a1b2c3). Multiple experiments can coexist. Teardown is kubectl delete namespace.
Helmfile deploys 25 releases of the same Helm chart, each with a different values file. One bots.yaml holds all 25 parameter sets, and Helmfile renders them into individual releases (see the sketch after this list).
Jobs or Deployments: if the simulation has a defined end, Jobs are cleaner (Kubernetes tracks completion, and a meta-Job can wait for all 25 and export results). If bots need to stay alive continuously, Deployments.
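A sketch of what that Helmfile layer could look like, assuming a local charts/clawbot chart, an EXP environment variable for the namespace, and a bots: list in bots.yaml. Helmfile renders .gotmpl files as Go templates, so the 25 releases come from a loop rather than copy-paste:

```yaml
# helmfile.yaml.gotmpl -- one release per entry in bots.yaml.
# Chart path, value names, and the EXP env var are assumptions.
environments:
  default:
    values:
      - bots.yaml          # holds the `bots:` list

---
releases:
{{ range .Values.bots }}
  - name: clawbot-{{ .id }}
    namespace: {{ requiredEnv "EXP" }}   # e.g. exp-2026-02-07-run1
    chart: ./charts/clawbot
    values:
      - botId: {{ .id | quote }}
        seed: {{ .seed }}
        persona: {{ .persona | quote }}
{{ end }}
```

```yaml
# bots.yaml -- all 25 parameter sets in one file (three shown,
# entries are invented examples).
bots:
  - { id: "01", seed: 101, persona: "optimistic-founder" }
  - { id: "02", seed: 102, persona: "skeptical-analyst" }
  - { id: "03", seed: 103, persona: "contrarian-economist" }
  # ...22 more entries
```

helmfile apply then diffs and converges all 25 releases; editing one bot's entry re-renders only that release, which is exactly the mid-run tweak described above.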
The workflow I’m imagining is three commands:
```sh
make up   EXP=exp-2026-02-07-run1   # deploy 25 bots
make logs EXP=exp-2026-02-07-run1   # pull logs and artifacts
make down EXP=exp-2026-02-07-run1   # delete everything
```
Research should feel like running a test suite.
What I Don’t Know Yet
- How much RAM does one OpenClaw instance actually need? 25 bots on a 40GB box gives ~1.5GB each after k3s overhead. Is that enough? I haven’t profiled it.
- Does each bot need a full headless Chrome? If so, memory math changes significantly. A headless browser per bot could eat 500MB+ on its own.
- Has anyone run MCP servers inside k3s? The MCP protocol assumes a local process model. Containerizing it might surface weird assumptions about filesystem paths, socket locations, or process lifecycle.
- Stateful or stateless? If bots need browser profiles or session cookies that survive restarts, that’s PVCs and StatefulSets. If state only matters during the run, emptyDir is fine. I don’t know which yet (a sketch of the emptyDir option follows this list).
- Do the bots need to talk to each other? Independent bots get namespace isolation for free. If they need to coordinate or share context, that’s a different networking topology I haven’t thought through.
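For the resource and state questions, a per-bot Job sketch at least shows where the knobs would live. All numbers are unprofiled guesses, and the image and state path are hypothetical:

```yaml
# One bot as a Job: completion-tracked, ~1.5GB ceiling, run-scoped state.
apiVersion: batch/v1
kind: Job
metadata:
  name: clawbot-07
spec:
  backoffLimit: 1
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: clawbot
          image: openclaw/moltbot:latest    # hypothetical image
          resources:
            requests:
              memory: "1Gi"
              cpu: "250m"
            limits:
              memory: "1536Mi"              # the 40GB / 25-bot budget from above
          volumeMounts:
            - name: scratch
              mountPath: /data              # assumed state path
      volumes:
        - name: scratch
          emptyDir: {}                      # dies with the pod; swap for a PVC if state must survive
```

If profiling shows a headless Chrome per bot blowing past the 1.5GB ceiling, the honest fix is fewer bots per box, not tighter limits.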
References
k3s: Lightweight Kubernetes
Minimal Kubernetes distribution that runs on a single binary. Good fit for ephemeral workloads on modest hardware.
Helmfile
Declarative spec for deploying multiple Helm releases. The missing piece for 'many instances, one command'.
Helm
Package manager for Kubernetes. Parameterized charts make it straightforward to deploy N instances with different configs.
Terraform
Declarative infrastructure-as-code. Excels at spinning up and destroying cloud resources predictably.
Ansible
Agentless configuration management. Push-based model; good for imperative provisioning but with a heavier teardown story.
Artificial Societies (societies.io)
YC W25 startup simulating human societies with AI agents using behavioral science.
Artificial Societies Evaluation (Litepaper)
Technical litepaper covering how Artificial Societies evaluates and benchmarks AI-driven human simulations at scale.
rettiwt.xyz (tecmie)
Our AI social media simulation sandbox: personas generate tweets, form alliances, and exhibit emergent behavior.