Architecture Overview¶
Duplicated Docs and Diagrams
Unify with distributed.md.
Colony is a cache-aware multi-agent framework for reasoning over extremely long context (billion-token scale) without RAG. Instead of retrieving fragments, Colony keeps the entire context "live" through paged distributed KV caching across a GPU cluster, allowing agents to perform deep, iterative reasoning over the full input.
Core Design Principles¶
-
Explicit context over implicit context. LLMs struggle with knowledge buried in training data. Deep reasoning requires explicating implicit context into explicit, in-context material.
-
The LLM is the planner, not the framework. Agent control flow and all decisions are driven by a reasoning LLM given sufficient context, not hardcoded logic. The framework provides context, asks "what next?", executes, and feeds back results.
-
Policy-based design. Every cognitive process is a pluggable policy with well-defined interfaces and default implementations. Policies compose hierarchically.
-
All state on the blackboard. No out-of-band state in instance variables. State changes become observable events via blackboard notifications, enabling the memory-as-observer pattern.
-
Cache-awareness is emergent. Cache efficiency is not a property of individual agents but a collective property of the multi-agent system as a whole.
System Architecture¶
Add More Details and Shown Design Options
This digram does not convey how flexible the architecture is. For example, the memory system is not constrained to the specific hierarchy shown here -- users can design arbitrary memory hierarchies with custom scopes and maintenance policies. The VCM is not constrained to specific page sources. The agent system supports both VCM-bound agents and unbound (floating) agents that operate on blackboard state without direct page bindings.
graph TB
subgraph Cluster["GPU Cluster"]
VCM["Virtual Context Memory<br/>(VCM)"]
vLLM["vLLM Replicas"]
end
subgraph AgentSystem["Agent System"]
Agent["Agent<br/>(base.Agent)"]
AP["ActionPolicy<br/>(CacheAwareActionPolicy)"]
Cap["AgentCapabilities"]
Hooks["Hook System<br/>(AOP)"]
end
subgraph Memory["Memory System"]
BB["EnhancedBlackboard<br/>(Redis-backed)"]
WM["Working Memory"]
STM["Short-Term Memory"]
LTM["Long-Term Memory"]
end
subgraph Games["Game Engine"]
GP["GameProtocolCapability"]
Roles["Proposer / Skeptic /<br/>Grounder / Arbiter"]
end
Agent --> AP
Agent --> Cap
Agent --> Hooks
AP --> Cap
Cap --> BB
Cap --> VCM
VCM --> vLLM
Agent --> BB
WM --> STM
STM --> LTM
BB --- WM
BB --- STM
BB --- LTM
GP --> BB
GP --> Roles
Subsystems¶
Distributed Architecture¶
Colony is natively distributed. This section covers Colony's serving framework, request routing, multi-tenancy, autoscaling, and the heterogeneous LLM cluster.
Virtual Context Memory¶
The VCM manages context pages like an OS manages virtual memory with page tables, page faults, and cache-aware scheduling. It operates at the cluster level (across GPU nodes), unlike vLLM which is node-level. Extended VCM combines immutable read-only input pages with read-write blackboard output.
Agent System¶
Agents are autonomous computational entities with lifecycle states, capabilities, and action policies. The framework supports VCM-bound agents (loaded/unloaded with pages), unbound agents, service agents, and supervisor agents. AgentCapability provides the extension point -- each capability is an AOP aspect, and the ActionPolicy acts as the aspect weaver.
Agent Memory System¶
A unified memory architecture where all state lives in blackboards. Memory is organized hierarchically -- sensory, working, short-term, and long-term (episodic, semantic, procedural) -- with each level implemented as a MemoryCapability managing a blackboard scope. Agents reason about their memory, not just with it.
Action Policies¶
The decision-making core. The LLM selects actions through a two-phase process (choose action, then parameterize), executes partial plans in a Model-Predictive Control loop, and revises as conditions change. CacheAwareActionPolicy is the primary implementation, coordinating planning, execution, and replanning.
Blackboard¶
EnhancedBlackboard is the single source of truth for all agent state. Redis-backed, event-driven, with optimistic concurrency control. Policy-based design (access, eviction, validation) without inheritance hierarchies.
Hook System¶
Aspect-oriented programming for cross-cutting concerns. The @hookable decorator marks interception points; Before, After, and Around hooks attach via Pointcut expressions. Used for token tracking, rate limiting, checkpointing, and memory capture without polluting core logic.
Game Patterns¶
A framework for structured multi-agent deliberation. Four game types -- hypothesis, bidding/contract, negotiation, consensus -- with defined roles and an Agent Communication Language. Games serve as error correction mechanisms: hypothesis games combat hallucination, contract nets combat laziness, objective guards combat goal drift.
Training¶
Coming soon.
Key Classes¶
| Class | Module | Role |
|---|---|---|
Agent |
polymathera.colony.agents.base |
Base agent with lifecycle, capabilities, blackboard access |
ActionPolicy |
polymathera.colony.agents.base |
Abstract base for action selection and planning |
CacheAwareActionPolicy |
polymathera.colony.agents.patterns.actions.policies |
Primary policy: MPC-style planning with cache awareness |
AgentCapability |
polymathera.colony.agents.base |
Extension point for agent functionality (AOP aspect) |
EnhancedBlackboard |
polymathera.colony.agents.blackboard |
Redis-backed shared state with events and OCC |
MemoryCapability |
polymathera.colony.agents.patterns.memory |
Manages one memory scope with ingestion, retrieval, maintenance |
GameProtocolCapability |
polymathera.colony.agents.patterns.games |
Structured multi-agent deliberation protocol |
AgentHookRegistry |
polymathera.colony.agents.patterns.hooks |
Per-agent hook registration and dispatch |