Co-location Assumptions as the Dominant Failure Class in Sovereign Agent Relocation: An Empirical Bring-up Report
Abstract
A companion to our architectural account of sovereign cognitive substrates, this paper reports the empirical bring-up of an autonomous agent loop — perceive, decide, act — from the provider's co-located infrastructure onto a single early-adopter enterprise node that the customer owns and controls. The agent had run reliably for weeks in the provider environment. Relocating it surfaced a consistent and, we argue, generalizable failure class: implicit co-location assumptions. We taxonomize six observed forms — transport, credential, path, service, runtime, and idle-latent co-location — and show that the last is structurally dangerous because it is masked by the very property that makes the provider environment a convenient test bed: traffic. A busy host never exercises the idle paths that a fresh sovereign node spends most of its early life in. We report one strongly positive result — fully sovereign cognition, in which the agent perceived, reasoned, and reached a concrete business decision using exclusively the customer's own cloud credentials, scoped identity, and isolation boundary — and one instructive negative result: the agent could decide but not act, because actuation assumed sibling services that, on a sovereign node, must be co-deployed rather than co-located. We argue this negative result delineates the minimal sovereign unit: not an agent, but an organism.
1. Introduction
In a prior report we described the sovereign cognitive substrate: a complete multi-agent organism deployed onto infrastructure the customer owns, with the provider reduced to a consensual, revocable membrane (mumega-200.108). That paper gave the architecture and the trust inversion. This paper gives the field data — what actually happened when we moved a working autonomous loop out of our own infrastructure and onto a customer’s machine.
The agent in question is a control loop of the familiar perceive–decide–act form, with a memory and a scoped identity. It had run for weeks inside the provider environment without incident. We expected relocation to be a packaging exercise. It was not. Almost every fault we hit shared a single root: the loop, and the tooling around it, silently assumed it was running where it had always run.
We call these co-location assumptions, and we argue they are the dominant failure class in sovereign relocation — more so than model behavior, more so than network, more so than the security model that receives most of the design attention. The security model is the part you design deliberately; co-location assumptions are the part you inherit without noticing.
2. Method
We used a single enterprise node: a customer-owned virtual machine, itself a container, with constrained memory and no swap, already running the customer’s own agent process. The provider reached the node only through an interactive shell session granted with the customer’s consent. Every manual step was logged for later codification into a reproducible installer; in that sense, the bring-up is the specification.
The agent was deployed natively (language runtime virtual environment plus a user-level service supervisor), not in a container, after it became clear the node was already a container and a nested runtime was both wasteful and contraindicated by the memory budget. The model backend was reached through the customer’s own cloud application-default credentials; no provider-held key was present in the deployment at any point. We instrumented service stability (restart counts), resident memory, and the agent’s own cycle logs, and we recorded any reliance on a provider-side path, secret, or service as a defect.
3. Findings: a taxonomy of co-location assumptions
3.1 Transport co-location
The agent’s wake mechanism polled a local inter-process channel — an assumption valid only when the agent shares a host with the message broker. On a remote node there is no such local channel; the mechanism cannot reach the bus at all. The correct form is a network-authenticated poll over the shared kernel’s HTTPS surface, carrying the node’s own token. The on-host mechanism is not a degraded remote mechanism; it is a different mechanism that happens to look identical until you move it.
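A minimal sketch of the network-authenticated form, using only the Python standard library. The route `/nodes/<id>/inbox`, the bearer-token header scheme, and the function name are assumptions made for illustration; the source does not specify the kernel’s actual HTTPS surface.

```python
import urllib.request


def build_poll_request(base_url: str, node_id: str, node_token: str) -> urllib.request.Request:
    """Build an authenticated poll request for this node's pending messages.

    The URL layout and the bearer-token scheme are hypothetical.
    """
    # Refuse plaintext transport: off-host, the token crosses a real network.
    if not base_url.startswith("https://"):
        raise ValueError("bus polling must use HTTPS off-host")
    url = f"{base_url.rstrip('/')}/nodes/{node_id}/inbox"  # hypothetical route
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {node_token}"},  # the node's own token
        method="GET",
    )
```

The request would then be sent with `urllib.request.urlopen(req, timeout=...)`. Keeping construction separate from sending makes the transport assumption explicit and testable without a live bus.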
3.2 Credential co-location
The provider-side instance of the “per-customer” agent loaded the provider’s entire secret store; the customer scope merely filtered what it acted upon. Relocated unchanged, this would have placed the full provider secret set onto the customer’s box — a categorical sovereignty violation. The sovereign form carries only a minimal, node-scoped credential set. The lesson is that scoping action is not scoping capability; an agent that can act narrowly but read everything is not isolated.
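The distinction between scoping action and scoping capability can be sketched as an allowlist applied at load time, so that out-of-scope secrets never enter the process at all. The secret names and the function below are hypothetical; the actual credential set used on the node is not given in this report.

```python
from typing import Dict, FrozenSet, Mapping

# Hypothetical names for the minimal, node-scoped credential set.
NODE_SECRET_ALLOWLIST: FrozenSet[str] = frozenset({"NODE_TOKEN", "BUS_URL"})


def load_node_secrets(
    store: Mapping[str, str],
    allowlist: FrozenSet[str] = NODE_SECRET_ALLOWLIST,
) -> Dict[str, str]:
    """Copy only allowlisted secrets out of the store; fail loudly if any is missing."""
    missing = sorted(set(allowlist) - set(store))
    if missing:
        raise KeyError(f"missing node-scoped secrets: {missing}")
    # Anything not on the allowlist is never read into the agent's memory.
    return {name: store[name] for name in sorted(allowlist)}
```

On a sovereign node the store itself should contain only the node-scoped set; the allowlist is a second line of defense, not a substitute for shipping fewer secrets.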
3.3 Path co-location
Bootstrap logic referenced absolute paths under the provider’s home directory. These fail (or, worse, silently no-op) off-host. Beyond the immediate breakage, hard-coded provider paths are a coupling smell: they indicate the component expects the provider’s filesystem to be present, which on a sovereign node it is not.
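One way to remove this coupling, sketched under assumptions: resolve locations from the running user’s environment and fail loudly when a required file is absent, rather than silently doing nothing. The environment variable `AGENT_STATE_DIR` and the default directory layout are hypothetical.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def agent_state_dir(env: Optional[Mapping[str, str]] = None) -> Path:
    """Resolve the agent's state directory without any provider-specific path."""
    env = os.environ if env is None else env
    override = env.get("AGENT_STATE_DIR")  # hypothetical variable name
    if override:
        return Path(override)
    # Fall back to a location under whichever user runs the agent.
    return Path.home() / ".local" / "state" / "agent"


def require_file(path: Path) -> Path:
    """Raise instead of silently skipping when a bootstrap file is missing."""
    if not path.is_file():
        raise FileNotFoundError(f"required bootstrap file missing: {path}")
    return path
```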
3.4 Service co-location
The agent’s actuation stage posted decisions to sibling services at loopback addresses — a task store and a memory service that, on the provider host, are simply there. On the sovereign node they are absent. This produced the central negative result (§4.2): the agent decided correctly and then failed to act, not for any reason internal to the decision, but because the world it expected to act into was not co-present.
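A preflight check of the kind that would have surfaced this before the first decision, as a hedged sketch. The service names and ports in the usage note are placeholders; the report does not state where the task store and memory service listen.

```python
import socket
from typing import Dict, List, Tuple


def service_reachable(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, and timed-out connections all count as absent.
        return False


def missing_services(required: Dict[str, Tuple[str, int]]) -> List[str]:
    """Names of required sibling services that are not reachable."""
    return sorted(
        name
        for name, (host, port) in required.items()
        if not service_reachable(host, port)
    )
```

Called at startup with something like `missing_services({"task_store": ("127.0.0.1", 8001)})` (placeholder port), a non-empty result lets the agent report that actuation is unavailable instead of deciding and then failing to act.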
3.5 Runtime co-location
The first packaging assumed a container runtime would be available and appropriate. The node was itself a container with a tight memory budget; nesting a runtime was both redundant and harmful. The assumption “deployment means a container” is itself a co-location assumption: it inherits the provider’s deployment conventions rather than the node’s.
3.6 Idle-latent co-location (the structurally dangerous one)
The agent crash-looped on the sovereign node and ran flawlessly on the provider host. The cause: the loop’s event-channel read timed out during idle, and the loop treated the timeout as fatal. The provider host is never idle — its shared channel carries continuous traffic from many agents — so the idle path was never exercised there. A fresh sovereign node spends most of its early life idle. The fault was therefore invisible in precisely the environment used to validate the agent, and present in precisely the environment of deployment.
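The corrected behavior can be sketched with a standard-library queue standing in for the event channel: a read timeout is counted as an idle cycle and the loop continues. The function name, the `max_cycles` bound, and the use of `queue.Queue` are illustrative assumptions, not the agent’s actual channel implementation.

```python
import queue
from typing import Callable


def run_loop(
    events: "queue.Queue[str]",
    handle: Callable[[str], None],
    max_cycles: int,
    read_timeout: float = 0.05,
) -> int:
    """Run up to max_cycles reads; return how many of them were idle."""
    idle_cycles = 0
    for _ in range(max_cycles):
        try:
            event = events.get(timeout=read_timeout)
        except queue.Empty:
            # Idle is a normal state on a quiet node, not a fatal error.
            idle_cycles += 1
            continue
        handle(event)
    return idle_cycles
```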
This is the finding we wish to generalize. A busy, multi-tenant provider host is a convenient test bed and a biased one: it systematically under-exercises the quiescent code paths that dominate a new sovereign node’s runtime. Validation on the provider host has a blind spot whose shape is exactly the sovereign deployment. Idle-path fault injection should be a required gate before any agent is declared portable.
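An idle-path fault-injection gate might look like the following sketch: every read is forced to time out, and the gate passes only if the loop step survives a run of such cycles. The step functions and the injected reader are hypothetical stand-ins for the agent’s real loop body.

```python
from typing import Callable

Reader = Callable[[], str]


def always_idle_read() -> str:
    """Injected fault: the channel never delivers an event."""
    raise TimeoutError("injected idle: no event arrived")


def survives_idle(step: Callable[[Reader], None], cycles: int = 100) -> bool:
    """Gate: True only if `step` tolerates `cycles` consecutive idle reads."""
    try:
        for _ in range(cycles):
            step(always_idle_read)
    except Exception:
        return False
    return True


def fragile_step(read: Reader) -> None:
    read()  # the timeout propagates and would kill the loop


def tolerant_step(read: Reader) -> None:
    try:
        read()
    except TimeoutError:
        return  # idle cycle: nothing to do
```

Here `survives_idle(tolerant_step)` passes and `survives_idle(fragile_step)` fails, which is the distinction a busy provider host never tests.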
4. Results
4.1 Positive: sovereign cognition
After correcting the transport, credential, path, and idle-latent assumptions and adding the memory headroom (swap) the node lacked, the agent achieved fully sovereign cognition. Using only the customer’s cloud credentials, on the customer’s machine, under the customer’s project, with no provider key in the path, it perceived the operational state, reasoned over it, and produced a concrete, correctly scoped business decision. Tenant isolation held under test: out-of-scope identifiers were coerced back to the home tenant. Resident memory was modest and bounded by an explicit cap, and the service was stable across the idle window that had previously crashed it.
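The tenant-coercion behavior can be illustrated with a small sketch. The function name and the returned flag are assumptions; the report states only that out-of-scope identifiers were coerced back to the home tenant.

```python
from typing import Optional, Tuple


def coerce_tenant(requested: Optional[str], home_tenant: str) -> Tuple[str, bool]:
    """Return (effective_tenant, was_coerced).

    The effective tenant is always the home tenant; the flag records
    whether an out-of-scope identifier was requested, so it can be logged.
    """
    was_coerced = requested is not None and requested != home_tenant
    return home_tenant, was_coerced
```

Returning the flag rather than silently overriding keeps the isolation test observable: a coerced request is evidence that something upstream asked for a tenant it should not have.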
4.2 Negative: decide without act, and what it delineates
The agent could not execute its decision: actuation targeted absent sibling services (§3.4). We report this as a result rather than a defect because it cleanly delineates the minimal sovereign unit. An autonomous agent — a loop with a model and a memory — is portable as a cognitive component, but it is not a deployable sovereign product. What is deployable is the organism: the bus, the memory service, the publishing layer, and the agent atop them, co-deployed natively, with the agent’s outputs flowing to a governance surface the customer already controls. Sovereignty is a property of the organism, not of the agent.
5. Discussion
Three points generalize beyond the single node.
First, co-location assumptions, not the security model, are the dominant relocation failure class. The security boundary is designed deliberately and tends to be correct; the co-location assumptions are inherited silently and tend to be wrong. Relocation budgets should be allocated accordingly.
Second, provider-host validation is biased against the sovereign case. The bias is not random noise; it is structured precisely as the difference between a busy shared host and a quiet owned node — most acutely in idle-path behavior. This argues for an explicit “quiet-node” test profile.
Third, the agent/organism distinction is operational, not philosophical. The decide-without-act result is the boundary made visible. Productizing sovereignty means shipping the organism; shipping the agent alone reproduces the very co-location it was meant to escape, one loopback address at a time.
6. Conclusion
Moving a working autonomous loop onto ground the customer owns is not a packaging task; it is the systematic discovery of every convenience the loop had quietly depended on. We observed six forms of co-location assumption, isolated one — idle-latent — as structurally hidden by the standard test environment, and obtained both a strong positive result (sovereign cognition on the customer’s own cloud) and an instructive negative one (decide without act) that delineates the organism as the minimal sovereign unit. The architecture of sovereign substrates (mumega-200.108) survives contact with a real node; the bring-up’s contribution is the map of what contact actually costs.