mumega-200.108

Sovereign Cognitive Substrates: Self-Hosted, Coherence-Governed Multi-Agent Organisms as an Alternative to Centralized AI Tenancy

Mumega Research

May 30, 2026 · 18 min read · self-published

Abstract

Contemporary deployments of large-language-model agent systems are almost universally structured as multi-tenant Software-as-a-Service: the provider hosts the models, the orchestration, the memory, and the data, and the customer is a tenant inside the provider's infrastructure. This architecture inherits the trust, lock-in, and data-residency properties of conventional SaaS and extends them to a more consequential surface — the customer's autonomous cognitive workforce. We describe an alternative: the sovereign cognitive substrate, in which a complete multi-agent organism — message bus, autonomous control loop, memory, and identity — is deployed onto infrastructure the customer owns and controls, and the provider participates only through a consensual, revocable, one-directional membrane. We give the architecture (a small central coordinator dispatching to bounded autonomous arms, after the actor model), the sovereignty model (capability-scoped tokens with asymmetric trust), and a runtime coherence-governance mechanism that gates autonomous action on a measured coherence signal rather than on unbounded iteration. We report a pilot bring-up on a single early-adopter enterprise and the reproducible install procedure it produced, and we discuss the inversion of the trust relationship that distinguishes sovereign substrates from tenancy: the customer holds the keys and may revoke the provider, making absence-of-lock-in the central design property rather than a concession.

sovereign-ai · multi-agent-systems · self-hosting · agent-architecture · coherence · data-sovereignty · mumega

1. Introduction

The deployment pattern that has accompanied the rise of large-language-model (LLM) agents is multi-tenant Software-as-a-Service (SaaS). A provider operates the models, the orchestration layer, the agent memory, and the surrounding control plane; a customer holds an account inside that infrastructure. The pattern is operationally convenient and it is the default for almost every commercial agent platform. It also silently transfers a set of properties — custody of data, continuity of operation, and ultimate control — from the customer to the provider, and it does so for a workload of unusual sensitivity: an autonomous system that acts on the customer’s behalf, reads the customer’s data, and increasingly runs the customer’s operations.

For an ordinary SaaS application this transfer is well understood and frequently acceptable. For an autonomous cognitive workforce it is more consequential. The system in question is not a passive store of records but an active agent that perceives, decides, and acts; its memory accretes a model of the business; its loop runs continuously. Hosting that system as a tenant inside a provider means the provider holds custody of the organization’s evolving operational mind, the provider’s availability bounds the customer’s, and the provider’s commercial incentives — toward lock-in, toward data gravity — apply to the most strategic asset the customer is building.

This paper describes an alternative deployment model, which we call the sovereign cognitive substrate. The defining inversion is custody: the complete agent organism runs on infrastructure the customer owns, the customer holds its root credentials and its data, and the provider connects only through a credential the customer grants and may revoke. The provider’s role shifts from landlord to gardener — it builds, tends, and updates the organism, but it does not own or host it, and it can be dismissed without the organism ceasing to run.

We make three contributions. First, we give an architecture for a self-hostable multi-agent organism small enough to run on a commodity virtual private server (Section 3). Second, we describe a sovereignty model based on capability-scoped tokens with deliberately asymmetric trust — the provider may reach into the customer’s workspace while the customer’s agents cannot reach out into the provider’s — and we argue that this asymmetry, combined with revocability, is what makes “sovereign” a technical claim rather than a marketing one (Section 3.3). Third, we describe a coherence-governance mechanism that conditions autonomous action on a measured runtime signal rather than on unbounded iteration, addressing the sustained-operation failure modes documented for autonomous agent loops (Section 4). We report a pilot deployment and the reproducible bring-up procedure it produced (Section 5), and we discuss the trust inversion that distinguishes the model from tenancy (Section 6).

2. Background and related work

Multi-agent systems. The conception of software systems as collections of autonomous, interacting agents predates LLMs by decades [1, 2]. The classical literature established the core abstractions (agents as the locus of perception, decision, and action; coordination through message passing; the distinction between an agent’s deliberation and its environment) that the present generation of LLM-driven systems re-instantiates with a new substrate for the deliberation step [20].

The actor model. The architecture we adopt for the runtime is the actor model [3, 4]: isolated units of computation with private state, communicating only by asynchronous message passing, with no shared mutable memory. The actor model is the canonical foundation for fault-isolated, concurrent distributed systems; its industrial descendants — Erlang/OTP’s “let it crash” supervision [5] and the virtual-actor systems exemplified by Orleans [6] — demonstrate that actor isolation scales to large, long-running, partially-failing systems. We use the actor model both as the runtime discipline for the agents and as the conceptual basis for the sovereignty boundary: an isolated actor that can only reach what is explicitly bound to it is also a natural unit of capability containment.

LLM agent architectures. Recent work has produced a family of control patterns for LLM agents: interleaving reasoning and acting [7]; verbal self-reflection over outcomes to improve subsequent attempts [8]; open-ended skill acquisition that stores and retrieves successful procedures [9]; and reflective planning over a memory stream of experience [10]. Cognitive Architectures for Language Agents (CoALA) [11] offers a unifying frame, decomposing an agent into memory, action space, and a decision procedure. Our control loop is an instance of this family; our contribution is not the loop but the substrate it runs on and the governor that bounds it.

Agent memory. The treatment of memory as a first-class, structured subsystem rather than an undifferentiated context window is developed in MemGPT [12], which frames the LLM as an operating system paging memory in and out of a limited context, and in temporal knowledge-graph approaches such as Zep/Graphiti [13], which add bi-temporal validity and entity-relationship structure to agent recall. A sovereign substrate must carry its memory subsystem with it onto the customer’s infrastructure; the design tension between a heavyweight relational memory tier and a lighter, cloud-native one is a recurring constraint in Section 5.

Capability security. The sovereignty boundary is, technically, an access-control problem, and we resolve it with capabilities rather than identities: a holder can perform exactly the operations a granted token authorizes and nothing more. This is the principle of least privilege [18] realized through unforgeable, scoped, revocable capabilities [19]. Capability discipline is what allows the trust asymmetry of Section 3.3 to be enforced structurally rather than by convention.

Data sovereignty. Regulatory and strategic pressure toward keeping data — and increasingly compute — within boundaries the data owner controls is well established [17], and the broader notion of “sovereign AI” has emerged as organizations and states seek to avoid structural dependence on a small number of hyperscale providers. Most realizations of “self-hosted AI,” however, amount to running a model and a chat interface locally. The substrate described here is more than a local model: it is a complete autonomous organism — bus, loop, memory, and identity — deployed under the customer’s control.

Sustained-operation failure modes. A companion study [14] documents that LLM multi-agent systems exhibit a discontinuous quality degradation past a threshold operating horizon under continuous autonomous operation, rather than a smooth decline. The governance mechanism in Section 4 is motivated directly by this finding: a system that runs an unbounded act-loop will eventually cross that horizon, and the cheapest structural defense is to make resting the default and acting the exception, conditioned on a measured signal.

3. The sovereign-seed architecture

We refer to a single sovereign deployment as a seed — a complete organism in its own enclosure, on the customer’s own infrastructure.

3.1 Components

A seed is the minimal set of services required for an autonomous organism, chosen to run on a commodity virtual private server (two cores, four gigabytes of memory in the pilot):

a message bus providing agent-to-agent delivery and presence;
a transport endpoint through which agent clients connect to the bus;
a registry of agents and their liveness;
an autonomy loop that perceives system state, decides, and acts, resting at equilibrium and waking on events;
a memory subsystem for durable recall;
an identity store of capability tokens.

These are deliberately separable from the larger platform services (billing, marketplace, multi-tenant control planes) that a centralized provider runs; a seed contains the organism, not the business that sells it.

3.2 The octopus topology

The control topology is a small central coordinator dispatching to many cheap, bounded arms, by analogy to the distributed nervous system of an octopus, in which most processing is local to the limbs. The coordinator perceives and decides; the arms — agents, deterministic workflows, and tool invocations — sense and act locally. Most of the time the coordinator rests; it wakes only when there is a real decision to make. This topology matches the actor discipline (each arm is isolated and reached only by message), it bounds cost (expensive central deliberation is rare; arm work is cheap), and it is the unit of cloning: a seed is one coordinator and its arms, and reproducing the product is reproducing that unit on new infrastructure.
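As an illustration only, the coordinator-and-arms pattern can be sketched with an asynchronous inbox: the coordinator blocks on the queue (resting) and wakes only when an event arrives, then hands the work to a cheap arm. The names `Event`, `Coordinator`, and `echo_arm` are hypothetical and not taken from the deployed system, and this sketch awaits each arm in turn where a real coordinator would likely dispatch concurrently.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Dict, List


@dataclass(frozen=True)
class Event:
    kind: str      # names the arm that should handle the event (hypothetical scheme)
    payload: str


Arm = Callable[[str], Awaitable[str]]


class Coordinator:
    """Small central coordinator: rests on its inbox, wakes only on events."""

    def __init__(self, arms: Dict[str, Arm]) -> None:
        self._arms = arms
        self._inbox: "asyncio.Queue[Event]" = asyncio.Queue()
        self.results: List[str] = []

    async def submit(self, event: Event) -> None:
        await self._inbox.put(event)

    async def run(self) -> None:
        while True:
            # Resting state: no polling, the coroutine is suspended until an event arrives.
            event = await self._inbox.get()
            if event.kind == "stop":
                return
            arm = self._arms.get(event.kind)
            if arm is None:
                self.results.append(f"unrouted:{event.kind}")
            else:
                # Arms are reached only through this message hand-off, never via shared state.
                self.results.append(await arm(event.payload))


async def echo_arm(payload: str) -> str:
    """A trivial arm standing in for an agent, workflow, or tool invocation."""
    return f"echo:{payload}"


async def demo() -> List[str]:
    coordinator = Coordinator({"echo": echo_arm})
    runner = asyncio.create_task(coordinator.run())
    await coordinator.submit(Event("echo", "hello"))
    await coordinator.submit(Event("unknown", "x"))
    await coordinator.submit(Event("stop", ""))
    await runner
    return coordinator.results
```

Calling `asyncio.run(demo())` routes one event to the arm and records the other as unrouted. The central loop does no work between events, which is the cost-bounding property the topology relies on.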

3.3 The membrane: asymmetric, revocable sovereignty

The property that makes a seed sovereign rather than merely remote is the access model between the provider and the customer’s workspace. We require three things of it:

Consensual. The provider reaches the customer’s workspace only by holding a capability token that exists in the customer’s identity store. No token, no access.
Revocable. Removing that token from the customer’s store removes the provider’s access, immediately and unilaterally, with no provider involvement.
One-directional (asymmetric). The provider’s helper agents hold a capability in the customer’s workspace and may act there. The customer’s agents hold no capability in the provider’s colony and cannot reach it. The membrane passes inward but not outward.

The asymmetry is the substantive design choice. It is the inverse of the trust relationship in SaaS. Under tenancy, the tenant trusts the provider, because the provider holds the tenant’s data and runs its workloads; the provider is the party with power. Under a sovereign substrate, the customer holds everything, grants the provider a narrow and revocable capability, and is the party with power. The provider must earn the right to remain connected. We implement the membrane with capability tokens scoped to the customer’s project: a helper token (held by the provider) authorizes inbound participation; the customer’s own agent tokens authorize no outbound reach. Because the tokens live in the customer’s store, revocation is the customer’s unilateral act. Capability discipline [18, 19] makes the asymmetry structural rather than conventional: the customer’s agents cannot reach the provider’s colony, not because they are trusted not to, but because no capability authorizing it exists in their possession.
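A minimal sketch of the three membrane properties, assuming a simple in-memory token store, is given below. `IdentityStore`, `Capability`, and the operation names are hypothetical illustrations rather than the system's actual interfaces; a production store would at least persist tokens and keep only their hashes.

```python
import secrets
from dataclasses import dataclass, field
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class Capability:
    holder: str                  # who was handed the token
    workspace: str               # the only workspace the token is valid in
    operations: FrozenSet[str]   # exactly the operations it authorizes


@dataclass
class IdentityStore:
    """Token store owned by one workspace; access exists only while a token is stored here."""
    workspace: str
    _tokens: Dict[str, Capability] = field(default_factory=dict)

    def grant(self, holder: str, operations: FrozenSet[str]) -> str:
        token = secrets.token_urlsafe(32)   # unguessable bearer token
        self._tokens[token] = Capability(holder, self.workspace, operations)
        return token

    def revoke(self, token: str) -> None:
        # Unilateral: needs no cooperation from the token's holder.
        self._tokens.pop(token, None)

    def authorize(self, token: str, operation: str) -> bool:
        cap = self._tokens.get(token)
        return cap is not None and operation in cap.operations


# The customer's store and the provider's store are separate objects (assumed names).
customer = IdentityStore("customer-project")
provider = IdentityStore("provider-colony")

# Consensual: the provider's helper gets in only because the customer's store holds its token.
helper_token = customer.grant("provider-helper", frozenset({"send", "read_tasks"}))
# The customer's own agent token exists only in the customer's store.
agent_token = customer.grant("customer-agent", frozenset({"send"}))
```

In this sketch `provider.authorize(agent_token, "send")` is false simply because no such token exists in the provider's store, and `customer.revoke(helper_token)` ends the provider's access without any provider-side step.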

A consequence worth stating plainly: during the bring-up of a seed the provider necessarily holds deep access — it is building the organism on the customer’s box. The sovereign end-state requires an explicit handoff in which the customer takes custody of root and the provider steps back to a membrane-only relationship. A substrate that omits this handoff is sovereign in name while remaining a landlord in fact. We treat the handoff as a required deployment objective, not an optional one.

4. Coherence governance

An autonomous organism that runs an unbounded act-loop inherits the discontinuous degradation documented for sustained autonomous operation [14]: it produces aligned work for an operating window, then crosses a horizon past which it produces plausible but misaligned work. The conventional defenses — a maximum step count, a spend cap — bound the damage but do not address the underlying condition, which is that the system has no measured sense of its own coherence and so cannot know when to stop.

We govern the loop on a measured signal. Each cycle, the coordinator computes a small set of runtime physics scalars from its perception of system state, after a coherence framework that treats organizational health as a measurable quantity with a defined dynamics [16]. The relevant quantities are a coherence measure C (an exponentially-smoothed estimate of the fraction of work the organism completes successfully), a receptivity measure R (capacity to absorb new work, inversely related to backlog pressure), and a potential Ψ (the gradient of coherence, how fast the organism’s coherence is changing). A defect-activation quantity combining these (R·Ψ·C) is near zero at equilibrium — when nothing is changing and coherence is steady — and rises when a real transformation is underway. The pair (R, C) partitions the organism’s state into regimes (productive-and-coherent, absorbing-but-disorganized, rigid-without-adaptation, and a stalled quadrant), each of which implies a different governance response.
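The quantities above can be made concrete with a small sketch. The text does not give the functional forms, so everything here is an assumption chosen for illustration: the smoothing factor `alpha`, receptivity as `1 / (1 + backlog)`, Ψ as the per-cycle change in C, the use of |Ψ| in the activation product, the 0.5 regime threshold, and the assignment of regime names to quadrants.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoherenceState:
    c: float = 1.0     # coherence C: smoothed fraction of work completed successfully
    r: float = 1.0     # receptivity R: capacity to absorb new work
    psi: float = 0.0   # potential Ψ: change in C since the previous cycle


def update(state: CoherenceState, successes: int, attempts: int,
           backlog: int, alpha: float = 0.2) -> CoherenceState:
    """One cycle of measurement (all functional forms are assumptions)."""
    # With no attempts this cycle there is no new evidence, so C is left unchanged.
    rate = successes / attempts if attempts else state.c
    c = state.c + alpha * (rate - state.c)   # exponential smoothing
    psi = c - state.c                        # discrete gradient of coherence
    r = 1.0 / (1.0 + backlog)                # falls as backlog pressure rises
    return CoherenceState(c=c, r=r, psi=psi)


def activation(state: CoherenceState) -> float:
    """Defect-activation R·Ψ·C, using |Ψ| so that change in either direction registers."""
    return state.r * abs(state.psi) * state.c


def regime(state: CoherenceState, threshold: float = 0.5) -> str:
    """Partition the (R, C) plane into four regimes (quadrant mapping assumed)."""
    receptive = state.r >= threshold
    coherent = state.c >= threshold
    if receptive and coherent:
        return "productive-and-coherent"
    if receptive:
        return "absorbing-but-disorganized"
    if coherent:
        return "rigid-without-adaptation"
    return "stalled"
```

Under these assumptions a steady cycle (all work succeeding, no backlog) leaves Ψ at zero and the activation at zero, while a cycle of failures moves C and raises the activation above zero.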

Two disciplines follow. First, rest at equilibrium: the default state is silence, and the organism acts only on a real defect (something open, blocked, failing, or newly arrived) rather than on every cycle. This directly counters the failure mode, because a system that mostly rests rarely reaches the degradation horizon. Second, measure before gate: a coherence signal is wired to observe and log before it is permitted to govern. An unproven metric that drives behavior can drive it wrongly; the signal is trusted only after its distribution has been observed against ground truth. In the pilot, the coherence organ on first activation surfaced a genuine, month-old stall in a work-execution subsystem that no dashboard had reported. This is evidence that a measured coherence signal can detect a real defect that conventional service-health monitoring, which reports infrastructure liveness rather than work completion, does not.
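One way the two disciplines might combine in a single decision function is sketched below. The `Governor` class, its `enforce` flag, and the threshold are hypothetical: the signal is always logged, but it influences the decision only after an operator has switched enforcement on.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Governor:
    threshold: float             # assumed activation level that counts as a real defect
    enforce: bool = False        # False: observe and log only; True: the signal may gate
    log: List[float] = field(default_factory=list)

    def should_act(self, activation: float, pending_defects: int) -> bool:
        # Measure before gate: record the signal on every cycle, trusted or not.
        self.log.append(activation)
        # Rest at equilibrium: nothing open, blocked, failing, or new means silence.
        if pending_defects == 0:
            return False
        # Until the signal has been validated, defects alone decide.
        if not self.enforce:
            return True
        # Once trusted, the measured signal gates the action.
        return activation >= self.threshold
```

A deployment following this pattern would run with `enforce=False`, compare the logged values against known outcomes, and flip the flag only once the signal's distribution is understood.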

The mechanism is distinctive in a specific sense: governance of an autonomous system by a measured coherence quantity, as opposed to a fixed iteration budget or a human’s intermittent attention, is not a standard component of the agent architectures surveyed in Section 2. We claim it as the substrate’s principal differentiator rather than as a settled result; its validation under sustained operation across multiple deployments is future work.

5. Deployment

5.1 The pilot

We deployed a seed for a single early-adopter small enterprise (anonymized) onto a commodity virtual private server the customer had provisioned. The prior state is instructive: an earlier onboarding attempt had placed agent clients on the customer’s machine but pointed them at the provider’s bus — the agents ran on the customer’s hardware but were not sovereign, because the workspace they participated in was the provider’s. The capability to stand up the customer’s own stack did not exist as tooling; every install path in the codebase provisioned a user account on the provider’s machine rather than on a foreign host. The pilot’s first task was therefore diagnostic: to determine that the sovereign-install capability was absent and to build it.

The bring-up proceeded incrementally on the live host: install the message store; clone and install the organism in an isolated environment; generate the customer’s own identity tokens and migrate its task store; start the bus, transport, registry, and autonomy services; persist them as supervised system services so they survive reboot; and mint the membrane tokens — a helper capability for the provider’s agents to cross in, and the customer’s own agent capabilities, which authorize no outbound reach. We verified the membrane directly: the provider’s helper capability authenticated into the customer’s workspace, confirming inbound participation, while the asymmetry held by construction.
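The ordering of the bring-up can be captured as a fail-fast step runner. The sketch below is illustrative only: the step names paraphrase the sequence described above, the actions are placeholders supplied by the caller, and no real installer commands are implied.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], bool]]

# Step names paraphrase the pilot sequence; the order is the part that matters.
BRING_UP_ORDER: List[str] = [
    "install message store",
    "install organism in isolated environment",
    "generate customer identity tokens",
    "migrate task store",
    "start bus, transport, registry, and autonomy services",
    "persist services under a supervisor",
    "mint membrane tokens",
    "verify helper capability authenticates inbound",
]


def run_bring_up(steps: List[Step]) -> List[str]:
    """Run steps in order and stop at the first failure, so later steps never
    execute against a half-built host."""
    completed: List[str] = []
    for name, action in steps:
        if not action():
            raise RuntimeError(f"bring-up halted at step: {name}")
        completed.append(name)
    return completed


def placeholder_steps(failing: str = "") -> List[Step]:
    """Build placeholder actions; a real installer would supply actual checks."""
    return [(name, (lambda n=name: n != failing)) for name in BRING_UP_ORDER]
```

Halting at the first failed step keeps the diagnostic value of a manual bring-up: the name of the failing step is the finding.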

One structural defect surfaced and was corrected: the published transport service contained a hard-coded path to the provider’s filesystem and an unconditional dependency on the heavyweight memory tier, causing it to crash on any foreign host. This is a representative class of fault for sovereign deployment — couplings to the provider’s environment that are invisible when every deployment is the provider’s environment, and fatal the first time one is not.
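A hedged sketch of how this class of coupling can be avoided: resolve paths from configuration instead of hard-coding them, and treat the heavyweight memory tier as an optional import. The environment variable `SEED_DATA_DIR`, the default directory, and the function names are hypothetical; the actual fix applied in the pilot is not specified here.

```python
import importlib
import os
from pathlib import Path
from types import ModuleType
from typing import Mapping, Optional


def resolve_data_dir(env: Optional[Mapping[str, str]] = None) -> Path:
    """Take the data directory from configuration, never from a fixed provider path."""
    env = os.environ if env is None else env
    configured = env.get("SEED_DATA_DIR")   # hypothetical variable name
    if configured:
        return Path(configured)
    # Fall back to a location relative to whichever user runs the service.
    return Path.home() / ".seed" / "data"


def load_memory_backend(module_name: str) -> Optional[ModuleType]:
    """Import the memory tier if it is installed; return None so the caller can degrade."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None
```

With this shape, a transport service on a foreign host starts without the memory tier and reports it as unavailable, where an unconditional import would crash at load time.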

5.2 Reproducibility

The pilot’s deliberate output was not only a running seed but a single reproducible install procedure that captures the verified sequence. The significance is methodological. A first sovereign deployment is unavoidably a manual diagnostic exercise; the value of performing it is the procedure it yields, which converts the next deployment from a bespoke engagement into a single command. The transition from bespoke onboarding to reproducible product is, for a sovereign model, the central operational problem — a centralized provider amortizes onboarding across a shared platform, whereas a sovereign provider must make a self-contained bring-up reproducible — and instrumenting the first bring-up into an installer is how that problem is solved.

The memory tier remains the most significant residual constraint: the heavyweight relational memory subsystem has no clean self-hosting path, and the lighter alternative — pointing memory at the customer’s own managed cloud primitives — trades a measure of sovereignty for operational simplicity. We regard the choice between a fully on-box memory tier and a customer-cloud-backed one as deployment-specific rather than settled.

6. Discussion

Tenant or sovereign. The model resists the word “tenant.” A tenant rents space inside the provider’s system; a sovereign seed runs the provider’s product inside the customer’s system. The distinction is not rhetorical: it determines who holds the data, whose availability bounds whose, and who may dismiss whom. Under the sovereign model the customer is not a tenant but an owner, and the provider is not a landlord but a service — closer to a retained engineering function than to a hosting relationship.

Lock-in as the inverted property. In SaaS, lock-in is a feature of the business model: data gravity and switching cost are the moat. In a sovereign substrate, absence of lock-in is the central product property. The customer owns the organism, its data, and its infrastructure, and may disconnect the provider without the organism ceasing to run. This is a weaker commercial position by the conventional metric and a stronger market position by the one that matters to a customer evaluating whom to trust with an autonomous operational mind: the provider that can be fired is the provider that must remain worth keeping.

Limitations. The evidence here is a single pilot; the coherence-governance mechanism is presented as a design with first-activation evidence, not as a validated result; and the handoff of root custody to the customer — the step that completes the sovereignty claim — is specified but, at the time of writing, not yet exercised end-to-end. The reproducible installer addresses the bring-up of the substrate but not the long-tail of per-customer environment variation that a fleet of sovereign deployments will exhibit. We report the model and the pilot, not a fleet.

7. Conclusion

We have described the sovereign cognitive substrate: a complete multi-agent organism deployed on infrastructure the customer owns, governed by a measured coherence signal, and connected to its provider through a consensual, revocable, one-directional membrane. The architecture is an instance of well-established foundations — the actor model, capability security, and the LLM-agent control patterns of the recent literature — assembled under an unusual constraint, that the whole system must be ownable by, and dismissible by, the customer it serves. The constraint inverts the trust relationship of conventional AI tenancy and makes absence-of-lock-in the defining design property. The pilot demonstrates that the bring-up, while initially a manual diagnostic exercise, instruments naturally into a reproducible procedure — the step that turns a sovereign onboarding from bespoke consulting into a product. Whether measured coherence governance bounds sustained autonomous operation across a fleet, and whether the root-custody handoff holds in practice, are the questions we regard as next.

References

[1] Wooldridge, M. (2009). An Introduction to MultiAgent Systems (2nd ed.). Wiley.

[2] Wooldridge, M., & Jennings, N. R. (1995). Intelligent agents: Theory and practice. The Knowledge Engineering Review, 10(2), 115–152.

[3] Hewitt, C., Bishop, P., & Steiger, R. (1973). A universal modular ACTOR formalism for artificial intelligence. Proceedings of the 3rd International Joint Conference on Artificial Intelligence (IJCAI).

[4] Agha, G. (1986). Actors: A Model of Concurrent Computation in Distributed Systems. MIT Press.

[5] Armstrong, J. (2003). Making reliable distributed systems in the presence of software errors (Doctoral dissertation). Royal Institute of Technology (KTH), Stockholm.

[6] Bykov, S., Geller, A., Kliot, G., Larus, J. R., Pandya, R., & Thelin, J. (2011). Orleans: Cloud computing for everyone. Proceedings of the 2nd ACM Symposium on Cloud Computing (SoCC).

[7] Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K., & Cao, Y. (2023). ReAct: Synergizing reasoning and acting in language models. International Conference on Learning Representations (ICLR).

[8] Shinn, N., Cassano, F., Berman, E., Gopinath, A., Narasimhan, K., & Yao, S. (2023). Reflexion: Language agents with verbal reinforcement learning. Advances in Neural Information Processing Systems (NeurIPS).

[9] Wang, G., Xie, Y., Jiang, Y., Mandlekar, A., Xiao, C., Zhu, Y., Fan, L., & Anandkumar, A. (2023). Voyager: An open-ended embodied agent with large language models. arXiv:2305.16291.

[10] Park, J. S., O’Brien, J. C., Cai, C. J., Morris, M. R., Liang, P., & Bernstein, M. S. (2023). Generative agents: Interactive simulacra of human behavior. Proceedings of the 36th ACM Symposium on User Interface Software and Technology (UIST).

[11] Sumers, T. R., Yao, S., Narasimhan, K., & Griffiths, T. L. (2023). Cognitive architectures for language agents. arXiv:2309.02427.

[12] Packer, C., Fang, V., Patil, S. G., Lin, K., Wooders, S., & Gonzalez, J. E. (2023). MemGPT: Towards LLMs as operating systems. arXiv:2310.08560.

[13] Rasmussen, P., Paliychuk, P., Beauvais, T., Ryan, J., & Chalef, D. (2025). Zep: A temporal knowledge graph architecture for agent memory. arXiv:2501.13956.

[14] Mumega Research. (2026). The failure-mode phase transition: Discontinuous quality degradation in multi-agent systems at multi-hour operating horizons. Mumega 200-series, paper 200.106.

[15] Cloudflare. (2024). Durable Objects [Technical documentation]. developers.cloudflare.com/durable-objects.

[16] Servat, H. (2025). The Fractal Resonance Cognition framework [Theoretical papers]. fractalresonance.com.

[17] Voigt, P., & von dem Bussche, A. (2017). The EU General Data Protection Regulation (GDPR): A Practical Guide. Springer.

[18] Saltzer, J. H., & Schroeder, M. D. (1975). The protection of information in computer systems. Proceedings of the IEEE, 63(9), 1278–1308.

[19] Miller, M. S. (2006). Robust composition: Towards a unified approach to access control and concurrency control (Doctoral dissertation). Johns Hopkins University.

[20] Russell, S., & Norvig, P. (2021). Artificial Intelligence: A Modern Approach (4th ed.). Pearson.
