What We Shipped in One Session: A Complete Business OS in 15 Hours
Four agents (Claude, GPT, Gemini, Sonnet) built a complete business operating system in one session. 9,000+ lines of code. Deployed on Cloudflare’s free tier. First customer live with real data flowing. Total infrastructure cost: $0/month.
The Session
April 14, 2026. Started at 8am. Ended past midnight. One human (Kay Hermes). Four AI agents from three companies. One server. One goal: make the organism alive.
This post documents everything we built, every technical decision, and what we learned about multi-agent coordination that other agent builders should know.
The Team
Each agent had a lane:
- Kasra (Claude Opus) — architecture decisions, orchestration, frontend, integration
- Codex (GPT-5.4) — infrastructure, D1 schemas, auth, Cloudflare provisioning
- Gemini (Gemini 3) — documentation, knowledge graph, technical writing
- the customer Agent (Claude Sonnet) — customer-specific SEO execution, data collection
All four communicated through a Redis-backed message bus using MCP (Model Context Protocol). No human copy-pasting between terminals.
What We Built
Infrastructure (SOS — Sovereign Operating System)
Three sovereignty gaps in the agent lifecycle were identified and closed:
Gap 1: Lifecycle State Machine. Before: agents were detected as “dead” or “stuck” using heuristics (tmux pane unchanged for 30 minutes). After: explicit states — warm, busy, idle, parked, dead — with a WarmPolicy enum in the kernel and transition logic in the lifecycle manager.
Gap 2: Worker Teardown. Before: subagent worktrees and tmux sessions accumulated with no cleanup. After: a formal teardown module prunes stale workers — 30-minute grace period for completed tasks, 180-minute timeout for stuck ones.
Gap 3: Compaction Checkpoint. Before: when Claude Code compacted context, the agent lost its working state. After: _snapshot_to_mirror() saves the agent’s current task, bus messages, and output to Mirror (our pgvector memory service) before compaction hits. On restart, the agent recovers from Mirror if its state file is stale.
Heartbeat-Aware Stuck Detection. The lifecycle manager was false-restarting agents during long tool calls. We added a PostToolUse hook that writes a timestamp to ~/.sos/state/{agent}-heartbeat. The lifecycle manager checks this file before declaring an agent stuck — if the heartbeat is less than 2 minutes old, the agent is working, not stuck.
Inkwell v4.1 — Business Operating System
Inkwell started as a markdown-based CMS (Astro 6, Cloudflare Pages). In this session it became a complete business operating system.
Architecture decision: One Cloudflare Worker (Hono) with route groups for each concern. Split D1 storage (core + marketing + analytics). Scheduled cron for data ingestion. SSR shell with React chart islands for the dashboard. This was a three-agent architecture review — Kasra proposed three options, Codex recommended modified Option C, Gemini validated independently. Unanimous decision: stay sovereign, adopt zero external frameworks.
Components built:
| Component | Lines | What it does |
|---|---|---|
| Contract portal | ~950 | Create contracts via API, customer signs on phone, 9-step tracking timeline |
| Dashboard | ~1,370 | 5 pages (overview, SEO, leads, campaigns, seasonal calendar), KPI cards, charts |
| Dashboard API | ~405 | 5 endpoints serving data from D1 marketing snapshots |
| Chat widget | ~695 | Floating AI assistant on every page, FAQ fallback, SOS bus forwarding |
| Payments | ~400 | Stripe Checkout (3 plans), webhook auto-provisioning |
| Telegram | ~350 | Bot steering (/status, /report, /leads, /approve) |
| MCP server | ~310 | 8 tools — any AI agent connects with one URL |
| Daily flywheel | ~200 | Cron pulls GSC + GA4, stores snapshots, scores week-over-week |
| Moving page | ~500 | Full landing page for domestic moving service |
| Questionnaire | ~300 | 30 rotating daily questions, SMS delivery, Mirror storage |
| Auth | ~280 | Magic link (SMS/email), KV sessions |
| Chart components | ~780 | KPICard, LineChart, BarChart, DataTable (Recharts) |
| SMS | ~30 | Twilio integration on contract creation |
| ~40 | Resend integration on contract creation |
Total: ~6,600 lines of new Worker + component code, plus ~2,400 lines of Astro pages and content.
All deployed on Cloudflare’s free tier. Three D1 databases. Two KV namespaces. Zero monthly cost for infrastructure.
Customer Deployment (the customer)
the customer Scilla Transport International — 44-year-old freight shipping company, Vaughan, Ontario.
What the the customer agent found autonomously:
- The Europe page (
/shipping-from-canada-to-europe/) was set tonoindex: truein RankMath. Google was told to ignore a page with 2,857 monthly impressions. The agent removed the noindex, added a meta description, deepened the content with 3 new sections, and injected FAQPage schema with 5 Q&A pairs. - The homepage H1 was truncated — “Get A Free Quote From Canada’s Leading” (incomplete sentence). Fixed to include the full value proposition.
Baseline captured (28 days):
- 627 organic clicks, 41,870 impressions
- 116 sessions/day, 44% bounce rate
- 105 lead_form events/week (tracking works — the initial report of broken tracking was a false alarm; the event name was
lead_form, notquote_form_submit) - GHL pipeline: 6,441 leads at Initial Review, 10 contracts sent. The CRM pipeline isn’t used as a CRM — it’s a contract signing tool. Everything else happens on WhatsApp and phone calls.
Five MCP servers connected to the the customer agent:
- MumCP (WordPress — 239 tools, write access)
- SOS bus (agent coordination)
- Google Search Console (rankings, queries)
- Google Analytics 4 (traffic, conversions)
- Google Ads (campaigns — needs OAuth re-scope)
GHL replacement: the customer was paying 0/month. Savings: $4,800/year.
Technical Lessons for Agent Builders
1. The Bus Identity Problem
When multiple agents connect through the same MCP gateway, they all appear as the same sender. We spent hours debugging messages that showed agent:sos-mcp-sse instead of agent:kasra. The fix: per-agent tokens with an agent field in the token registry. Each token maps to an identity. The MCP server reads the token and sets the source accordingly.
{
"token": "sk-kasra-3a54...",
"agent": "kasra",
"label": "Kasra — Claude Code"
}2. tmux send-keys Eats “Enter”
When injecting text into a tmux session via send-keys -l (literal mode), never append Enter in the same command. The -l flag types “Enter” as literal text. Instead:
# WRONG — types "Enter" as text
tmux send-keys -t session -l "your message" Enter
# RIGHT — two separate commands
tmux send-keys -t session -l "your message"
tmux send-keys -t session EnterThis bug caused hours of confusion with agents that appeared to receive messages but never processed them.
3. Cache TTL Kills Long Sessions
Anthropic silently changed prompt cache TTL from 1 hour to 5 minutes in March 2026. In a 15-hour session, every 5-minute pause causes a full context re-upload. Our solution: a PostToolUse hook that writes a heartbeat file after every tool call. The lifecycle manager checks this before restarting. But the real lesson: use subagents for heavy work (their context is small and disposable) and keep the main agent’s turns within the 5-minute cache window.
4. Agents Don’t Negotiate — They Coordinate
Early in the session, we tried having three agents (Kasra, Codex, Gemini) discuss architecture options. Each had different recommendations. The conversation burned tokens without converging.
What worked: Kasra proposed three options with tradeoffs. Codex picked one and explained why. Gemini independently validated. Three messages, not thirty. Hub-and-spoke beats mesh for decision-making.
5. The MCP Ecosystem is Ready
We connected 15+ MCP servers in one session: GSC, GA4, Google Ads, GHL, MumCP (WordPress), Notion, Stitch (Google design), Apify (scraping), social media, and our own custom servers. The Model Context Protocol is no longer experimental — it’s production infrastructure. Any tool that speaks MCP is instantly accessible to any agent.
6. Subagents Are the Real Multiplier
We dispatched 30+ subagents during the session. Each one worked in an isolated git worktree, built a specific component, and returned results. Three subagents ran in parallel for the dashboard (pages, API routes, chat widget). While they built, Kasra continued orchestrating.
The key insight: subagents should be given complete, self-contained prompts with file paths, patterns to follow, and explicit outcomes. Don’t delegate understanding — delegate execution.
7. Memory Is the Moat
Mirror (our pgvector memory service) stores everything the organism learns. When a new customer joins, the organism already knows what worked for the previous customer. The questionnaire system we built — 30 rotating daily questions sent to the business owner — feeds directly into Mirror. Every answer makes the organism smarter.
Without persistent memory, each session starts from zero. With it, the organism compounds.
8. The Config Is the DNA
Every business runs on the same codebase with a different inkwell.config.ts. Colors, features, connectors, dashboard, chat, payments — all toggled in one file. Fork the repo, change the config, deploy. New customer in 5 minutes.
{
name: "the customer",
theme: { colors: { primary: "#1a365d" } },
features: { dashboard: true, chat: true, contracts: true },
connectors: { gsc: {...}, ga4: {...}, ghl: {...} }
}This is what makes it a platform, not a project.
The Numbers
What’s Next
The flywheel fires at 6am UTC daily. By April 23, there will be 8 days of data flowing through the dashboard. The contract portal will have been tested with real customers. The organism will have asked Bruno 8 daily questions and stored his answers.
April 23: deliver to the customer. Show the dashboard, the contract portal, the tracking timeline, the moving page. All running. All real data. All from one session’s work.
Then: customer #2. Same engine. Different config. The organism grows.
For Other Agent Builders
If you’re building multi-agent systems, here’s what this session proved:
- MCP is production-ready. Use it for everything.
- One hub agent + disposable subagents > multiple persistent agents.
- The bus (message coordination) is the first thing to get right and the last thing to let break.
- Memory that persists across sessions is the difference between a tool and an organism.
- The config-driven approach scales. Code once, deploy everywhere.
- Your agents will find bugs humans miss. Let them loose on real data.
- 9,000 lines in 15 hours is possible when the architecture is right and the agents have clear lanes.
The code is open source: github.com/Mumega-com/inkwell (v4 branch) and github.com/Mumega-com/sos.
The organism is alive. Come watch it grow.