Colony

Polymathera's no-RAG, cache-aware multi-agent framework for extremely long, dense contexts (1B+ tokens).

Colony is a framework for building tightly coupled, self-evolving, self-improving, self-aware multi-agent systems (agent colonies) that reason over extremely long contexts without retrieval-augmented generation (RAG). Instead of fragmenting context into chunks and retrieving snippets, Colony keeps the entire context "live" across a cluster of one or more LLMs. A cluster-level virtual memory system manages the LLM KV caches in the same way an operating system presents (almost unlimited) virtual memory on top of finite physical memory.
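To make the operating-system analogy concrete, here is a minimal sketch of demand paging with least-recently-used eviction. It is a toy illustration only, not Colony's implementation or API: `ToyPageCache`, `access`, and the string "pages" are hypothetical names, and a real system would page KV-cache blocks across GPU nodes rather than strings in a dictionary.

```python
from collections import OrderedDict


class ToyPageCache:
    """Toy LRU cache standing in for a fixed budget of resident pages (hypothetical)."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.resident = OrderedDict()  # page_id -> contents, oldest first
        self.faults = 0

    def access(self, page_id, backing_store):
        """Return a page's contents, loading it on a miss (a "page fault")."""
        if page_id in self.resident:
            self.resident.move_to_end(page_id)  # mark as most recently used
            return self.resident[page_id]
        self.faults += 1
        if len(self.resident) >= self.capacity:
            self.resident.popitem(last=False)  # evict the least recently used page
        self.resident[page_id] = backing_store[page_id]
        return self.resident[page_id]


# Four pages of "context", but room for only two resident pages at a time.
backing_store = {f"page-{i}": f"contents of page {i}" for i in range(4)}
cache = ToyPageCache(capacity=2)
for page_id in ["page-0", "page-1", "page-0", "page-2", "page-1"]:
    cache.access(page_id, backing_store)

print(cache.faults)          # 4
print(list(cache.resident))  # ['page-2', 'page-1']
```

The caller can address every page even though only two fit at once. The cost of that illusion is the fault count, which depends heavily on access order.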

Colony's Vision

Colony's goal is to be the most efficient country of geniuses in a datacenter — the ideal substrate for civilization-building AI.

Use Cases

Colony is designed to extract or synthesize novel insights from a large, established body of knowledge, where cost is dominated by the size of the input context rather than the expected output length. Examples include incrementally editing a large code monorepo, detecting systemic vulnerabilities in a billion-line codebase, reverse-engineering advanced proprietary designs from public knowledge, and discovering novel connections or plausible conjectures across thousands of scientific papers.

Pre-Alpha Early Access

Colony is still in pre-alpha early access. The API is not stable and the framework is under active development. We welcome feedback and contributions, but be aware that breaking changes may occur.

Who should use Colony?

Colony is designed for engineers building complex multi-agent systems that require reasoning over extremely long contexts. It is not a general-purpose agent framework or a consumer product. If you are looking for a simple agent orchestration tool or a way to add tool use to an LLM, Colony may not be the right fit. It runs on a Ray cluster (local or in the cloud) and can be resource-intensive and expensive to operate.

Key Ideas

  • NoRAG: Colony keeps the full context live and accessible, not filtered through retrieval. Colony manages all kinds of context (code, text, data) through distributed KV cache paging, not vector search.

  • Cache-Aware Agents: Agents are aware of what's in GPU memory (at the cluster level) and consciously plan their work to maximize cache reuse.

  • Agents All the Way Down: General intelligence emerges from the right composition of agent capabilities and multi-agent patterns. Every cognitive process (attention, memory, planning, confidence tracking) is a pluggable policy with a default implementation.

  • Distributed Reasoning Patterns: Multi-agent game protocols (hypothesis games, contract nets, negotiation) combat specific LLM failure modes like hallucination, laziness, and goal drift.
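As a rough illustration of the cache-aware planning idea above, the sketch below greedily orders tasks so that each one reuses as many already-resident pages as possible. Everything here is an assumption made for illustration: `plan_cache_aware`, the task names, and the file-named "pages" are hypothetical, and the rule that a task's pages simply replace the resident set is a deliberate simplification, not how Colony schedules work.

```python
def plan_cache_aware(tasks, resident):
    """Greedily order tasks to maximize overlap with the resident pages.

    tasks: dict mapping a task name to the set of page ids it needs.
    resident: set of page ids assumed to be cached already.
    """
    remaining = dict(tasks)
    resident = set(resident)
    order = []
    while remaining:
        # Largest overlap wins; iterating in sorted order breaks ties by name.
        best = max(sorted(remaining), key=lambda name: len(remaining[name] & resident))
        order.append(best)
        # Simplification: after a task runs, its pages become the resident set.
        resident = set(remaining.pop(best))
    return order


tasks = {
    "audit-auth": {"auth.py", "session.py"},
    "audit-db": {"db.py", "models.py"},
    "audit-login": {"auth.py", "login.py"},
    "audit-schema": {"models.py", "schema.py"},
}
order = plan_cache_aware(tasks, resident={"db.py", "models.py"})
print(order)  # ['audit-db', 'audit-schema', 'audit-auth', 'audit-login']
```

Tasks that share pages end up adjacent in the plan (the two touching `models.py`, then the two touching `auth.py`), so fewer pages need to be reloaded between consecutive tasks.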

Getting Started

pip install polymathera-colony

See the Installation guide and Quick Start tutorial.

Architecture at a Glance

[Architecture diagram, summarized as text]

  • Agent System: Agents 1 through N. Each agent has Capabilities (Memory, Games, Confidence, ...), an ActionPolicy (MPC), and a Hook System (AOP).

  • Each agent is composed of capabilities wired together through its ActionPolicy: Memory · Attention · Grounding · Confidence · Reflection · Planning · Games (Hypothesis, Contract Net, Negotiation, Consensus).

  • Operations shown between the agents and the layers below: read / write / query / mmap and infer_with_suffix / page_graph_ops.

  • Blackboard (Redis): Shared state & event pub/sub · Optimistic concurrency (OCC) · Agent coordination. Memory scopes: Working · STM · LTM · Episodic · Semantic · Procedural.

  • External Sources: Git repos · documents · KBs · APIs.

  • Virtual Context Memory (VCM): Page Table · Page Attention Graph · Cache Scheduling · Page Faults. Features: Nonuniform pages · Soft/hard affinity · Advisory/mandatory groups · Prefetching. Amortized cost: O(N²) → O(N log N) as page graph stabilizes.

  • LLM Cluster (GPU Nodes): LLM Nodes 1 through N, each with its own KV Cache.

  • Context Sources (mapped as pages via mmap / munmap): Git Repos · Knowledge Bases · Blackboard Data · Custom.

  • Infrastructure: Ray Cluster (actors, autoscaling) · Redis (state, pub/sub).
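The architecture overview lists optimistic concurrency (OCC) as a Blackboard feature. The general technique can be sketched with a versioned, in-memory store: a write succeeds only if the entry's version is still the one the writer read. This is a single-process toy under stated assumptions, not Colony's Blackboard API; `ToyBlackboard`, `write_if_unchanged`, and `update_with_retry` are hypothetical names.

```python
class ToyBlackboard:
    """In-memory key-value store with per-key version numbers (hypothetical)."""

    def __init__(self):
        self._entries = {}  # key -> (version, value)

    def read(self, key):
        """Return (version, value); a missing key reads as version 0 and None."""
        return self._entries.get(key, (0, None))

    def write_if_unchanged(self, key, expected_version, value):
        """Write only if nobody else has written since expected_version was read."""
        current_version, _ = self.read(key)
        if current_version != expected_version:
            return False  # conflict: the caller should re-read and retry
        self._entries[key] = (current_version + 1, value)
        return True


def update_with_retry(board, key, update, max_attempts=5):
    """Read, compute, and conditionally write, retrying when a conflict occurs."""
    for _ in range(max_attempts):
        version, value = board.read(key)
        if board.write_if_unchanged(key, version, update(value)):
            return True
    return False


board = ToyBlackboard()
version, _ = board.read("findings")  # both agents observe version 0

first_ok = board.write_if_unchanged(
    "findings", version, ["agent-1: possible SQL injection"])
second_ok = board.write_if_unchanged(
    "findings", version, ["agent-2: unchecked input"])  # stale version, rejected
retried_ok = update_with_retry(
    board, "findings", lambda current: (current or []) + ["agent-2: unchecked input"])

print(first_ok, second_ok, retried_ok)  # True False True
print(board.read("findings")[0])        # 2
```

No agent holds a lock while it thinks; conflicts are detected at write time and resolved by retrying. A shared store would need the version check and the write to happen atomically. Redis offers WATCH/MULTI/EXEC transactions for that purpose, though whether Colony uses them is not stated here.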

Documentation

Section            Description
Getting Started    Installation and initial setup instructions
Examples Gallery   Collection of example use cases and applications
Philosophy         Why Colony exists and what makes it different
Architecture       Technical architecture of each subsystem
Design Insights    Deep dives into novel design decisions
Guides             Practical how-to guides
API Reference      Detailed API documentation
Contributing       How to contribute to Colony