Orchestrating Multi-Clawbot Simulations at Scale
Thinking through how to spin up 25 independently-configured OpenClaw instances for research, run simulations, and tear everything down cleanly.
Why This Matters
Artificial Societies showed that simulating human behavior at scale is not just a research curiosity; it’s a product. Their approach, modelling how groups of people interact, react, and influence each other using AI agents grounded in behavioral science, maps directly to what we explored with rettiwt.xyz. Rettiwt was our earlier attempt at this: a sandbox where AI personas with unique traits, expertise levels, and social influence metrics generate tweets, form alliances, engage in debates, and demonstrate emergent sentiment-driven behavior. A synthetic real-time focus group.
The jump from rettiwt to clawbots is the jump from application-level simulation to infrastructure-level orchestration. Rettiwt ran everything inside one application, one database, one process. That works until you need each agent to be a truly independent system with its own browser context, its own MCP toolchain, its own state. Once agents need to act on the web rather than just simulate conversation, you need isolated runtimes.
The other motivator: shared secrets without shared config. Each clawbot needs access to the same API keys, platform tokens, and MCP server configurations, but each bot has its own unique persona parameters, seed data, and behavioral directives. Setting up MCP config once at the infrastructure level and having all 25 bots inherit those access credentials within their namespace, while keeping their individual configs separate, is a problem that maps cleanly to Kubernetes Secrets + per-release Helm values. One Secret for the shared tokens, one ConfigMap per bot for the unique stuff.
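To make the shared layer concrete, here is a minimal sketch of what that one Secret could look like. The key names, values, and namespace are illustrative assumptions, not OpenClaw's actual config surface:

```yaml
# Shared layer: one Secret per experiment namespace. All 25 bot pods
# mount this, so rotating a token is a single update.
# Key names and values are placeholders.
apiVersion: v1
kind: Secret
metadata:
  name: clawbot-shared
  namespace: exp-2026-02-07-a1b2c3
type: Opaque
stringData:
  MCP_SERVER_URL: "https://mcp.internal.example"   # assumed MCP endpoint key
  PLATFORM_API_TOKEN: "replace-me"                 # shared platform token
  MODEL_API_KEY: "replace-me"                      # shared model/API key
```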
The Scenario
I want to run 25 OpenClaw (moltbot) instances simultaneously, each seeded with its own configuration parameters, observe the simulation, maybe tweak one or two mid-run, and then tear down the entire thing when the research is done. Ephemeral by design. What’s the right tool for this?
The Candidates
Ansible: good at pushing config to machines, but it’s a configuration tool, not an orchestration tool. Teardown means writing a separate destroy playbook and hoping state hasn’t drifted. Wrong abstraction for ephemeral workloads.
Full Kubernetes: the operational overhead of maintaining a cluster for something that spins up, runs, and gets deleted. More time fighting the cluster than running simulations.
Terraform alone: perfect lifecycle (apply / destroy), but it operates at the infrastructure layer. Seeding each bot with unique application config still needs something on top.
k3s + Helmfile: this is the one I keep coming back to. Lightweight Kubernetes without the ceremony, plus Helmfile for “deploy 25 things with different configs” as a declarative file instead of a bash loop.
The Shared Config Problem
This is the constraint that actually shapes the architecture. Every clawbot needs the same MCP server configurations, API keys, and platform tokens. But every bot has different persona parameters, seed data, and behavioral directives. I don’t want to copy-paste credentials into 25 config files. I want to set up MCP access once and have every bot inherit it.
This maps to a two-layer config model:
Shared layer: one Kubernetes Secret per namespace holding the MCP server endpoints, API keys, and platform tokens. Every bot pod mounts it. Change a token once, and every bot picks it up.
Individual layer: one ConfigMap per bot, generated from its Helm values. Bot ID, seed, model parameters, persona traits, behavioral directives. This is what makes each clawbot unique.
The separation matters because mid-simulation I might need to rotate an API key (shared layer, one update) or tweak one bot’s parameters (individual layer, one helmfile apply). Neither change should touch the other 24.
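Concretely, each bot pod could pull both layers in with envFrom. A sketch, assuming env-var based config; the image name, ConfigMap keys, and bot numbering are hypothetical:

```yaml
# Individual layer: one ConfigMap per bot, rendered from its Helm values.
apiVersion: v1
kind: ConfigMap
metadata:
  name: clawbot-07-config
data:
  BOT_ID: "07"
  SEED: "1337"                       # assumed seed/persona keys
  PERSONA: "contrarian-economist"
---
# The bot container layers shared + individual config at startup.
apiVersion: v1
kind: Pod
metadata:
  name: clawbot-07
spec:
  containers:
    - name: clawbot
      image: openclaw/moltbot:latest   # hypothetical image name
      envFrom:
        - secretRef:
            name: clawbot-shared       # shared layer (same for all 25)
        - configMapRef:
            name: clawbot-07-config    # individual layer (unique per bot)
```

Rotating a key touches only the Secret; tweaking bot 07 touches only its ConfigMap. Neither change disturbs the other 24.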
A Rough Sketch
I haven’t built this yet, but roughly it might look like:
Terraform (optional) provisions the server and bootstraps k3s via cloud-init. If I’m using a long-lived 40GB box, I skip this and just run k3s directly.
One namespace per experiment: named with an experiment ID or git SHA (exp-2026-02-07-a1b2c3). Multiple experiments can coexist. Teardown is kubectl delete namespace.
Helmfile deploys 25 releases of the same Helm chart, each with a different values file. One bots.yaml holds all 25 parameter sets, and Helmfile renders them into individual releases (see the sketch after this list).
Jobs or Deployments: if the simulation has a defined end, Jobs are cleaner (Kubernetes tracks completion, and a meta-Job can wait for all 25 and export results). If bots need to stay alive continuously, Deployments.
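A sketch of what that Helmfile layer could look like, assuming a local charts/clawbot chart, an EXP environment variable for the namespace, and a bots: list in bots.yaml. Helmfile renders .gotmpl files as Go templates, so the 25 releases come from a loop rather than copy-paste:

```yaml
# helmfile.yaml.gotmpl -- one release per entry in bots.yaml.
# Chart path, value names, and the EXP env var are assumptions.
environments:
  default:
    values:
      - bots.yaml          # holds the `bots:` list

---
releases:
{{ range .Values.bots }}
  - name: clawbot-{{ .id }}
    namespace: {{ requiredEnv "EXP" }}   # e.g. exp-2026-02-07-run1
    chart: ./charts/clawbot
    values:
      - botId: {{ .id | quote }}
        seed: {{ .seed }}
        persona: {{ .persona | quote }}
{{ end }}
```

```yaml
# bots.yaml -- all 25 parameter sets in one file (three shown,
# entries are invented examples).
bots:
  - { id: "01", seed: 101, persona: "optimistic-founder" }
  - { id: "02", seed: 102, persona: "skeptical-analyst" }
  - { id: "03", seed: 103, persona: "contrarian-economist" }
  # ...22 more entries
```

helmfile apply then diffs and converges all 25 releases; editing one bot's entry re-renders only that release, which is exactly the mid-run tweak described above.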
The workflow I’m imagining is three commands:
```sh
make up   EXP=exp-2026-02-07-run1   # deploy 25 bots
make logs EXP=exp-2026-02-07-run1   # pull logs and artifacts
make down EXP=exp-2026-02-07-run1   # delete everything
```
Research should feel like running a test suite.
What I Don’t Know Yet
- How much RAM does one OpenClaw instance actually need? 25 bots on a 40GB box gives ~1.5GB each after k3s overhead. Is that enough? I haven’t profiled it.
- Does each bot need a full headless Chrome? If so, memory math changes significantly. A headless browser per bot could eat 500MB+ on its own.
- Has anyone run MCP servers inside k3s? The MCP protocol assumes a local process model. Containerizing it might surface weird assumptions about filesystem paths, socket locations, or process lifecycle.
- Stateful or stateless? If bots need browser profiles or session cookies that survive restarts, that’s PVCs and StatefulSets. If state only matters during the run, emptyDir is fine. I don’t know which yet (a sketch of the emptyDir option follows this list).
- Do the bots need to talk to each other? Independent bots get namespace isolation for free. If they need to coordinate or share context, that’s a different networking topology I haven’t thought through.
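For the resource and state questions, a per-bot Job sketch at least shows where the knobs would live. All numbers are unprofiled guesses, and the image and state path are hypothetical:

```yaml
# One bot as a Job: completion-tracked, ~1.5GB ceiling, run-scoped state.
apiVersion: batch/v1
kind: Job
metadata:
  name: clawbot-07
spec:
  backoffLimit: 1
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: clawbot
          image: openclaw/moltbot:latest    # hypothetical image
          resources:
            requests:
              memory: "1Gi"
              cpu: "250m"
            limits:
              memory: "1536Mi"              # the 40GB / 25-bot budget from above
          volumeMounts:
            - name: scratch
              mountPath: /data              # assumed state path
      volumes:
        - name: scratch
          emptyDir: {}                      # dies with the pod; swap for a PVC if state must survive
```

If profiling shows a headless Chrome per bot blowing past the 1.5GB ceiling, the honest fix is fewer bots per box, not tighter limits.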
References
k3s: Lightweight Kubernetes
Minimal Kubernetes distribution that runs on a single binary. Good fit for ephemeral workloads on modest hardware.
Helmfile
Declarative spec for deploying multiple Helm releases. The missing piece for 'many instances, one command'.
Helm
Package manager for Kubernetes. Parameterized charts make it straightforward to deploy N instances with different configs.
Terraform
Declarative infrastructure-as-code. Excels at spinning up and destroying cloud resources predictably.
Ansible
Agentless configuration management. Push-based model; good for imperative provisioning but with a heavier teardown story.
Artificial Societies (societies.io)
YC W25 startup simulating human societies with AI agents using behavioral science.
Artificial Societies Evaluation (Litepaper)
Technical litepaper covering how Artificial Societies evaluates and benchmarks AI-driven human simulations at scale.
rettiwt.xyz (tecmie)
Our AI social media simulation sandbox: personas generate tweets, form alliances, and exhibit emergent behavior.