Quickstart
This guide walks you through installing Engram, registering your first tool, and testing intelligent routing. By the end, you'll have seen the key features and know where to explore further.
1. Install Engram
Run the one-line installer:
# Linux / macOS / WSL2
curl -fsSL https://get.engram.dev/install | bash
Windows users: Install WSL2 first, then run the command above inside your WSL2 terminal. Native Windows is supported via the engram.bat wrapper for local development, but production deployments should use WSL2, Docker, or a Linux server.
After it finishes, reload your shell:
source ~/.bashrc # or source ~/.zshrc
2. Start the Gateway
Engram runs a lightweight backend API that powers the CLI, SDK, and all tool integrations. Start it with a single command:
engram run
You'll see the animated ENGRAM banner, the gateway URL, and an interactive REPL prompt. The backend starts automatically in the background — no separate server process to manage.
███████╗███╗ ██╗ ██████╗ ██████╗ █████╗ ███╗ ███╗
██╔════╝████╗ ██║██╔════╝ ██╔══██╗██╔══██╗████╗ ████║
█████╗ ██╔██╗ ██║██║ ███╗██████╔╝███████║██╔████╔██║
██╔══╝ ██║╚██╗██║██║ ██║██╔══██╗██╔══██║██║╚██╔╝██║
███████╗██║ ╚████║╚██████╔╝██║ ██║██║ ██║██║ ╚═╝ ██║
╚══════╝╚═╝ ╚═══╝ ╚═════╝ ╚═╝ ╚═╝╚═╝ ╚═╝╚═╝ ╚═╝
Connect any AI agent to any tool
from your terminal.
Gateway: http://127.0.0.1:8000
API docs: http://127.0.0.1:8000/docs
$ engram
The banner animates on launch — a white-to-blue sweep followed by blue-to-white — and settles into
the REPL. From here, every Engram subcommand is available without re-typing engram as a
prefix. Type help to see all available commands, or exit to shut the
gateway down.
Tip: If you prefer a visual debugging dashboard instead of the REPL, start with engram run --debug to launch the Textual-based TUI with live trace panels, event monitors, task tracking, and routing visualizations. This is particularly useful for watching multi-agent orchestration in real time.
3. Register Your First Tool
Engram connects agents to any API, CLI tool, or service. The quickest way to register a tool is by pointing at an OpenAPI spec:
engram register openapi https://petstore.swagger.io/v2/swagger.json
The system:
1. Fetches and validates the OpenAPI specification
2. Extracts endpoints, parameters, and response schemas
3. Auto-generates dual MCP and CLI representations
4. Aligns fields through the semantic ontology (protocols.owl)
5. Stores the tool in the registry for immediate agent discoverability
You'll see schema mismatch resolution happening in real time, followed by a registration summary:
ℹ Info: 3 schema mismatches resolved via ontology alignment
╭──── [*] Registration Summary ─────────────────────────────╮
│ Successfully registered: Petstore API │
│ ID: 7a3f2b1c-... │
│ Test Command: engram run --tool Petstore API --inspect │
╰────────────────────────────────────────────────────────────╯
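Steps 1 and 2 of this pipeline amount to walking the spec's paths section and flattening each operation. A minimal, illustrative sketch follows; the function and field names are invented for this example and are not Engram's actual API:

```python
def extract_endpoints(spec: dict) -> list[dict]:
    """Walk the OpenAPI `paths` section and flatten each operation."""
    ops = []
    for path, methods in spec.get("paths", {}).items():
        for method, op in methods.items():
            ops.append({
                "path": path,
                "method": method.upper(),
                "operation_id": op.get("operationId"),
                "params": [p["name"] for p in op.get("parameters", [])],
            })
    return ops

# Minimal spec fragment in the shape of the Petstore example
spec = {
    "paths": {
        "/pet/{petId}": {
            "get": {
                "operationId": "getPetById",
                "parameters": [{"name": "petId", "in": "path"}],
            }
        }
    }
}

endpoints = extract_endpoints(spec)
```

A real registration pipeline would also validate the spec and pull response schemas, but the shape of the traversal is the same.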
You're not limited to OpenAPI specs. Engram's universal onboarding accepts multiple source formats:
| Source Type | Command | What it does |
|---|---|---|
| OpenAPI / Swagger URL | engram register openapi <url> | Fetches spec, generates dual MCP+CLI schemas |
| Local OpenAPI file | engram register openapi ./spec.yaml | Same as above, from a local file |
| Partial documentation | engram register openapi "<text>" --partial | Extracts tool structure from freeform docs |
| Shell command | engram register command docker | Parses --help text, synthesizes semantic wrapper |
| Interactive wizard | engram register tool | Step-by-step manual registration with prompts |
The interactive wizard (engram register tool) walks you through each field: name,
description, base URL, path, HTTP method, and parameters. It's the best option when you don't have a
spec and want full control over the tool definition.
4. Verify Your Tools
List everything that's been registered:
engram tools list
The output is a Rich table showing each tool's name, backend (MCP, CLI, or Dual), semantic type, success rate, and description:
Engram Tool Catalog
┏━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ S ┃ Tool / Extension ┃ Backend ┃ Semantic Type┃ Success ┃ Description ┃
┡━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ * │ Petstore API │ MCP │ Universal │ 100.0% │ Swagger's sample pet store… │
│ * │ docker │ CLI │ Container │ 98.5% │ Docker container management… │
│ > │ Slack │ Dual │ Messaging │ 99.0% │ Pre-optimized Slack integ… │
└───┴────────────────────┴──────────┴──────────────┴─────────┴──────────────────────────────┘
Showing 3 active tools. Use --popular to see pre-optimized integrations.
The * marker indicates your custom-registered tools (hero tools). The >
marker indicates pre-optimized tools from the built-in catalog. Add --popular to
include pre-optimized wrappers for common apps like Slack, GitHub, and Stripe. Use
--filter <query> for fuzzy search by name, description, or tag.
5. Test Intelligent Routing
Ask Engram to route a natural-language task to the best tool and backend:
engram route test "send an email to the team"
The router evaluates all registered tools, scores them on five dimensions, then selects the optimal backend:
╭──── ▶ Optimal Routing Decision ──────────────────╮
│ Chosen Tool: Slack │
│ Backend: MCP │
│ Confidence: 87.3% │
│ Predicted Latency: 245ms │
│ Estimated Cost: 12.5 tokens │
│ │
│ Reasoning: Highest composite score for messaging │
│ tasks. MCP backend selected for structured │
│ reliability over CLI speed. │
╰──────────────────────────────────────────────────╯
Alternative Backends Comparison
┏━━━━━━━━━━━┳━━━━━━━┳━━━━━━┳━━━━━━━━━┳━━━━━━━━━┓
┃ Backend ┃ Score ┃ Sim. ┃ Latency ┃ Success ┃
┡━━━━━━━━━━━╇━━━━━━━╇━━━━━━╇━━━━━━━━━╇━━━━━━━━━┩
│ * MCP │ 0.87 │ 0.82 │ 245ms │ 99.0% │
│ CLI │ 0.71 │ 0.78 │ 120ms │ 95.2% │
└───────────┴───────┴──────┴─────────┴─────────┘
The scoring algorithm weights five factors:
- Semantic Similarity (55%) — sentence-embedding match between your task description and tool descriptions
- Success Rate (20%) — historical success rate for this tool/backend combination
- Latency (15%) — average execution time from past runs
- Token Cost (7%) — efficiency of the backend in token consumption
- Context Overhead (3%) — how much prompt engineering the backend requires
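A composite score under these weights is simply a weighted sum of the five normalized factors. The factor values below are invented for illustration; only the weights come from the list above:

```python
# Weights from the routing documentation above (must sum to 1.0).
WEIGHTS = {
    "semantic_similarity": 0.55,
    "success_rate": 0.20,
    "latency": 0.15,
    "token_cost": 0.07,
    "context_overhead": 0.03,
}

def composite_score(factors: dict) -> float:
    """Each factor is normalized to [0, 1], where higher is better
    (so a slower backend contributes a *lower* latency factor)."""
    return sum(WEIGHTS[name] * value for name, value in factors.items())

# Hypothetical normalized factors for an MCP backend
mcp = composite_score({
    "semantic_similarity": 0.82,
    "success_rate": 0.99,
    "latency": 0.75,
    "token_cost": 0.80,
    "context_overhead": 0.90,
})
```

Because semantic similarity dominates at 55%, a backend with a strong description match can win even when a rival is faster, which is exactly the MCP-over-CLI trade-off shown in the comparison table above.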
Tip: Force a specific backend for debugging with --force-mcp or --force-cli. This is useful when testing how the same task performs across different execution paths.
6. Try Key Features
Now that you have tools registered and routing working, explore these capabilities:
Check self-healing status
engram heal status
The reconciliation engine continuously monitors for schema drifts and field mismatches between your registered tools and the actual APIs they connect to. This command shows any detected drifts, their confidence scores, and whether they've been auto-repaired or need manual review.
Semantic Drift Analysis
╭──────────────────┬──────────────┬──────────────┬───────┬─────────────╮
│ Source Protocol │ Field Drift │ Ontology Match│ Conf. │ Status │
├──────────────────┼──────────────┼──────────────┼───────┼─────────────┤
│ N/A │ No active │ - │ - │ HEALTHY │
│ │ drifts │ │ │ │
╰──────────────────┴──────────────┴──────────────┴───────┴─────────────╯
When drifts are detected, the engine evaluates each one and either auto-repairs (confidence ≥ 70%) or
flags it for manual review. Use heal status --verbose for full telemetry excerpts, or
heal now to trigger an immediate repair loop.
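The triage rule described above, auto-repair at confidence ≥ 0.70 or else flag for review, can be sketched in a few lines. The drift records and field names here are hypothetical:

```python
AUTO_REPAIR_THRESHOLD = 0.70  # threshold stated in the docs above

def triage(drifts: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split detected drifts into auto-repairable vs manual-review buckets."""
    auto, manual = [], []
    for drift in drifts:
        bucket = auto if drift["confidence"] >= AUTO_REPAIR_THRESHOLD else manual
        bucket.append(drift)
    return auto, manual

# Hypothetical drift reports from a reconciliation pass
auto, manual = triage([
    {"field": "pet_id -> petId", "confidence": 0.91},
    {"field": "status -> state", "confidence": 0.55},
])
```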
Inspect execution traces
engram trace list
Every tool execution is traced with semantic detail — routing reasoning, ontology alignment, healing steps, and performance metrics. The trace table shows timestamp, trace ID, tool, backend, success/failure, and token cost.
Use engram trace detail . (dot for "latest") to drill into the most recent execution
with a full semantic inspection tree:
engram trace detail .
This shows:
- An AI-generated natural-language summary of the routing and healing decisions
- The execution path with tool selection, routing choice, latency, and backend used
- Performance weights with semantic similarity, composite score, and token efficiency
- Self-healing steps (if any drift was detected and repaired)
- Ontological alignment details with synthesized field mappings
View your identity
engram auth whoami
Displays your EAT (Engram Authorization Token) identity, structured permissions per tool, and
semantic scopes derived from the security.owl ontology. The output is a Rich tree
showing:
- Your identity (the sub claim from the JWT)
- Structured permissions per tool (what you can do with each registered tool)
- Semantic scopes with ontology context (e.g., execute:tool-invocation → "Can invoke cross-protocol tool translations")
Translate between protocols
engram protocol translate --from mcp --to cli
Performs real-time translation between MCP, CLI, A2A, and ACP protocols using the semantic ontology
as a canonical bridge. Without a --payload flag, it uses a demonstration payload. The
output shows three panels side by side:
- Source panel — the original payload in the source protocol format
- Canonical Bridge panel — the intermediate ontology-normalized representation
- Target panel — the translated payload in the target protocol format
This is invaluable for debugging cross-protocol integrations and understanding how Engram normalizes data between agent frameworks.
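Conceptually, the canonical bridge is a two-hop remapping: source fields are normalized into the ontology's vocabulary, then projected into the target protocol's vocabulary. The toy sketch below uses invented field mappings; Engram's ontology-driven alignment is far richer than a lookup table:

```python
# Hypothetical field mappings: MCP -> canonical, canonical -> CLI.
MCP_TO_CANONICAL = {"tool_name": "action", "arguments": "inputs"}
CANONICAL_TO_CLI = {"action": "command", "inputs": "args"}

def remap(payload: dict, mapping: dict) -> dict:
    """Rename payload keys via the mapping; unmapped keys pass through."""
    return {mapping.get(k, k): v for k, v in payload.items()}

mcp_payload = {"tool_name": "send_message", "arguments": {"channel": "#eng"}}
canonical = remap(mcp_payload, MCP_TO_CANONICAL)   # the "bridge" panel
cli_payload = remap(canonical, CANONICAL_TO_CLI)   # the "target" panel
```

The advantage of a canonical intermediate is quadratic savings: N protocols need N mappings to and from the bridge, not N × N pairwise translators.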
Quick Reference
| Command | Description |
|---|---|
| engram run | Start the gateway and interactive REPL |
| engram run --debug | Start with the TUI debugging dashboard |
| engram register openapi <url> | Register a tool from an OpenAPI spec |
| engram register command <cmd> | Register a shell command as a tool |
| engram register tool | Interactive manual tool registration |
| engram tools list | List all registered tools |
| engram tools list --popular | Include pre-optimized integrations |
| engram tools search <query> | Fuzzy search across all tools |
| engram route test "<desc>" | Test intelligent routing for a task |
| engram route list | Show all tools with routing stats |
| engram heal status | Check self-healing reconciliation status |
| engram heal now | Trigger immediate repair loop |
| engram trace list | List recent semantic execution traces |
| engram trace detail . | Inspect the latest trace in detail |
| engram evolve status | View ML tool improvement dashboard |
| engram evolve apply <id> | Apply a proposed tool refinement |
| engram protocol translate | Translate between agent protocols |
| engram protocol handoff simulate | Simulate multi-agent handoff |
| engram sync list | List active sync connections |
| engram sync status | Live event stream monitoring |
| engram auth login | Authenticate and retrieve an EAT |
| engram auth whoami | Show current identity and scopes |
| engram config show | Display current configuration |
| engram config set <key> <value> | Update a configuration value |
| engram info | Show system information and status |
Next Steps
- Installation — Detailed installation guide with prerequisites, manual setup, and troubleshooting
- CLI Reference — Master every command, subcommand, and flag
- Configuration — Customize routing weights, ontology paths, and provider settings
- Universal Onboarding — Deep dive into connecting any API or CLI tool
- Self-Healing Engine — Understand how OWL ontologies and ML keep your tools aligned
- SDK & Python Library — Integrate Engram programmatically into your applications
Installation
This page covers every way to install Engram — from the one-line installer to a full manual setup. Pick the path that matches your environment and comfort level.
Quick Install
Linux / macOS / WSL2
curl -fsSL https://get.engram.dev/install | bash
This single command handles everything: directory creation, dependency installation, CLI setup, and
PATH configuration. Within 60–90 seconds, the engram command is available globally.
Windows
Native Windows is supported for local development via the engram.bat self-healing entry
point. For production deployments, use WSL2, Docker, or a Linux server.
Option A — WSL2 (Recommended)
If you have WSL2 installed, open your WSL2 terminal and run the Linux installer above. This gives you the full Unix experience with no compatibility caveats.
Option B — Native Windows
git clone https://github.com/kwstx/engram_translator.git
cd engram_translator
.\engram.bat
The .bat entry point auto-creates a Python virtual environment, installs all
dependencies from requirements.txt, validates imports, and launches the CLI. It's
functionally identical to the Unix ./engram script.
Note: The native Windows path works for development and testing. Some optional features (systemd services, Unix signal handling) are not available on Windows. Use WSL2 or Docker for production workloads.
What the Installer Does
The one-line installer performs these steps automatically:
- Creates ~/.engram/ — The configuration directory where config.yaml, encrypted credentials, and the swarm memory database live
- Clones the repository — Pulls the latest code from main into the installation directory
- Creates a Python virtual environment — Isolates Engram's dependencies from your system Python
- Installs dependencies — Runs pip install -r requirements.txt to install all core and optional packages
- Initializes configuration — Writes a default config.yaml with sensible defaults
- Creates CLI wrapper — Installs an engram shell script to ~/bin/ or ~/.local/bin/
- Updates PATH — Adds the bin directory to your shell profile (.bashrc, ~/.zshrc, or .profile)
- Optionally starts the background service — On Linux, creates a systemd user service; on macOS, creates a launchd plist
By the end, you can run engram run from any directory to start the gateway and
interactive REPL.
After Installation
Reload your shell and start the gateway:
source ~/.bashrc # or: source ~/.zshrc
engram run # Start the gateway and interactive REPL
The gateway binds to http://127.0.0.1:8000 by default. The REPL drops you into an
interactive session where every Engram subcommand is available without typing engram as
a prefix.
To configure individual settings later, use the dedicated commands:
engram config set backend_preference mcp # Set default routing backend
engram config set model_provider openai # Set AI model provider
engram auth login # Authenticate and get an EAT
engram info # Check system status
Tip: Run engram info after installation to verify that the config path, API URL, backend preference, and authentication status are all correct.
Prerequisites
The only prerequisites are Git and Python 3.11+. The installer and self-healing entry points handle everything else.
| Requirement | Purpose | Notes |
|---|---|---|
| Python 3.11+ | Gateway, CLI, SDK, and all core services | Required |
| Git | Cloning the repository and version management | Required |
| pip | Python package management | Installed automatically if missing |
| Node.js 18+ | Playground frontend and browser automation | Optional |
| PostgreSQL 15+ | Production database | Optional — SQLite used by default for local dev |
| Redis 7+ | Event streams, semantic caching, rate limiting | Optional — auto-disabled when not available |
| Docker | Containerized deployment | Optional — for production/staging setups |
Important: You do not need to install PostgreSQL or Redis for local development. Engram automatically detects the runtime environment and falls back to SQLite and in-memory alternatives when Docker infrastructure isn't available. The smart fallback logic in app/core/config.py checks for the presence of /.dockerenv and KUBERNETES_PORT to decide which backend to use.
Just make sure git and python3 (or python on Windows) are on
your PATH:
git --version # Should print git version 2.x+
python3 --version # Should print Python 3.11+
Manual Installation
If you prefer full control over the installation process — or you're setting up a development environment — follow these steps.
Step 1: Clone the Repository
git clone https://github.com/kwstx/engram_translator.git
cd engram_translator
Step 2: Create Virtual Environment
python3 -m venv venv
source venv/bin/activate # Linux/macOS
# or: .\venv\Scripts\activate # Windows PowerShell
Step 3: Install Python Dependencies
pip install --upgrade pip
pip install -r requirements.txt
The requirements.txt includes all core and optional dependencies:
| Category | Key Packages |
|---|---|
| Web framework | fastapi, uvicorn[standard], httpx |
| Database | sqlalchemy[asyncio], sqlmodel, asyncpg, aiosqlite, alembic |
| Semantic layer | rdflib, owlready2, pyswip, pyDatalog |
| ML / Embeddings | scikit-learn, sentence-transformers, torch, joblib |
| CLI / TUI | typer[all], rich, textual |
| Auth / Security | python-jose[cryptography], passlib[bcrypt], keyring, cryptography |
| MCP | mcp (Model Context Protocol SDK) |
| Monitoring | prometheus-fastapi-instrumentator, sentry-sdk, structlog |
| Task queue | celery (for evolution pipeline) |
| Config | pydantic-settings, pyyaml |
Tip: For a minimal installation without ML features (sentence-transformers, torch, celery), install only the core dependencies listed in pyproject.toml. This reduces the installation footprint from ~2GB to ~200MB, suitable for constrained environments or CI pipelines.
Step 4: Initialize Configuration
mkdir -p ~/.engram
engram init # Or: python -m app.cli init
This creates the configuration directory at ~/.engram/ and writes the default
config.yaml:
# ~/.engram/config.yaml
api_url: http://127.0.0.1:8000
backend_preference: mcp
model_provider: openai
verbose: false
Each field is explained in the Configuration guide. You can also set values via the CLI:
engram config set api_url http://my-server:8000
engram config set backend_preference cli
Step 5: Add API Keys
Provider credentials can be configured in three ways:
Option A — Environment variables (.env file)
Create a .env file at the project root:
# .env
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...
SLACK_API_TOKEN=xoxb-...
This is the simplest approach for local development and Docker deployments.
Option B — System keyring
engram auth login --token <your-eat-token>
EAT tokens are stored in the system keyring (macOS Keychain, Windows Credential Locker, or Linux
Secret Service) for maximum security. Falls back to config.yaml if the keyring is
unavailable.
Option C — TUI service connection
Launch the TUI dashboard (engram run --debug) and use the service connection screens to
input API keys for each provider. Credentials are encrypted with Fernet symmetric encryption and
stored in ~/.engram/config.enc.
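Fernet symmetric encryption comes from the cryptography package, which is already in requirements.txt. The sketch below shows the round trip; the credential value is invented, and in a real deployment the key would come from PROVIDER_CREDENTIALS_ENCRYPTION_KEY rather than being generated ad hoc:

```python
from cryptography.fernet import Fernet

# Generate a throwaway key for this demo; Engram persists its key instead.
key = Fernet.generate_key()
f = Fernet(key)

# Encrypt a (fake) credential, then decrypt it back.
token = f.encrypt(b"sk-ant-example-credential")
plaintext = f.decrypt(token)
```

Fernet tokens are authenticated as well as encrypted, so a tampered config.enc fails to decrypt rather than yielding garbage.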
Step 6: Start the Gateway
engram run
Or start the backend directly with uvicorn for more control:
uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
The --reload flag enables hot-reloading during development. The FastAPI application
initializes the orchestration services, database connections, API routers, and background workers
automatically via the lifespan handler.
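The lifespan-handler pattern mentioned above pairs startup and shutdown work around a single yield. Here is a stdlib-only sketch of the pattern (no FastAPI import; the event names are invented):

```python
from contextlib import asynccontextmanager
import asyncio

events = []

@asynccontextmanager
async def lifespan(app):
    events.append("startup: init services")    # e.g. DB pools, workers
    yield                                      # application serves requests here
    events.append("shutdown: close services")  # cleanup when the server exits

async def serve():
    async with lifespan(app=None):
        events.append("handling requests")

asyncio.run(serve())
```

FastAPI accepts such a context manager via its lifespan parameter, guaranteeing the teardown half runs even on errors during shutdown.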
Step 7: Verify the Installation
engram info # Check CLI configuration and connection status
engram tools list # Verify tool registry is accessible
Successful output from engram info shows:
╭──── System Information ──────────────────╮
│ Config Path ~/.engram/config.yaml │
│ API URL http://127.0.0.1:8000 │
│ Backend mcp │
│ Auth Status Authenticated │
│ EAT Token ****abc1 │
╰──────────────────────────────────────────╯
Quick-Reference: Manual Install (Condensed)
For those who just want the commands:
# Clone & enter
git clone https://github.com/kwstx/engram_translator.git
cd engram_translator
# Create venv
python3 -m venv venv
source venv/bin/activate
# Install everything
pip install --upgrade pip
pip install -r requirements.txt
# Configure
mkdir -p ~/.engram
engram init
engram config set model_provider openai
# Start
engram run
Self-Healing Entry Points
Engram ships with self-healing bootstrap scripts that automatically manage the virtual environment and dependencies on every launch:
| Platform | Entry Point | What it does |
|---|---|---|
| Linux / macOS | ./engram | Checks venv, validates imports, installs missing deps, then launches CLI |
| Windows | .\engram.bat | Same as above for Windows Command Prompt and PowerShell |
These scripts eliminate "it worked on my machine" problems. The startup sequence is:
- Check for virtual environment — Creates one if missing
- Fast import test — Attempts to import critical modules (fastapi, rich, typer, rdflib)
- Auto-repair — If any import fails, runs pip install -r requirements.txt automatically
- Launch — Passes all arguments to the Engram CLI
If a dependency breaks after an update — say a new version of rdflib introduces an
incompatibility — the entry point detects the failure via the import test and automatically
re-synchronizes the environment before launching. No manual pip install required.
Runtime Environment Detection
Engram's configuration layer includes smart fallback logic that adapts to your runtime environment automatically:
# Simplified logic from app/core/config.py
if not os.path.exists("/.dockerenv") and not os.environ.get("KUBERNETES_PORT"):
# Running locally — use SQLite instead of PostgreSQL
if "db:5432" in DATABASE_URL or POSTGRES_SERVER == "db":
DATABASE_URL = "sqlite+aiosqlite:///./engram.db"
# Disable Redis if the default Docker hostname is configured
if REDIS_HOST == "redis":
REDIS_ENABLED = False
This means:
| Environment | Database | Redis | Behavior |
|---|---|---|---|
| Docker Compose | PostgreSQL (via db:5432) |
Redis (via redis:6379) |
Full production stack |
| Kubernetes | PostgreSQL (managed RDS) | Redis (managed ElastiCache) | Full production stack |
| Local development | SQLite (./engram.db) |
Disabled (in-memory fallback) | Zero-dependency startup |
You never need to install PostgreSQL or Redis for local development. The smart fallback makes
engram run work immediately after cloning.
Troubleshooting
| Problem | Solution |
|---|---|
| engram: command not found | Reload your shell (source ~/.bashrc) or add ~/bin to your PATH. On Windows, restart your terminal. |
| Connection Error: Could not connect | Start the backend first: engram run or uvicorn app.main:app. The gateway must be running for CLI commands to work. |
| API key errors | Run engram auth login to authenticate, or check engram config show for the current API URL. |
| Import errors on launch | The self-healing entry point should fix this automatically. If not, run pip install -r requirements.txt manually in the venv. |
| Database migration errors | Run alembic upgrade head to apply pending migrations. This is needed after updates that change the database schema. |
| Redis connection refused | Redis is optional for local dev. Engram auto-disables it when unavailable. No action needed. |
| ModuleNotFoundError: No module named 'app' | Make sure you're running from the project root directory (translator_middleware/), not a subdirectory. |
| Port 8000 already in use | Another service is using that port. Either stop it, or start Engram on a different port: engram run --port 8001. |
| jwt.exceptions.DecodeError | Your EAT token is malformed. Run engram auth login to get a fresh token. |
| Slow first startup | The initial launch downloads ML models (~400MB for sentence-transformers) and initializes the database. Subsequent starts are fast. |
What's Next
- Quickstart — Register your first tool and test routing (5 minutes)
- Docker & Kubernetes Setup — Deploy with containers for production
- CLI Reference — Master every command and flag
- Configuration — Customize every setting
Docker & Kubernetes Setup
Engram ships with production-ready Docker Compose and Kubernetes configurations. Docker Compose is ideal for single-server deployments and development environments. Kubernetes (via the included manifests) is designed for multi-node production workloads with horizontal scaling.
Deployment Options
| Deployment | Who it's for | What you get |
|---|---|---|
| Docker Compose | Single-server, dev, staging | Full stack with one command — app, Postgres, Redis, Prometheus, Grafana |
| Docker Compose (Staging) | Pre-production validation | Stripped-down stack with production-like settings |
| Kubernetes | Production at scale | Declarative manifests with autoscaling, health checks, and observability |
Docker Compose Quick Start
# Clone the repo
git clone https://github.com/kwstx/engram_translator.git
cd engram_translator
# Copy environment template
cp .env.example .env
# Edit .env with your API keys and secrets
# Start the full stack
docker compose up -d
This starts six services:
| Service | Port | Purpose |
|---|---|---|
| app | 8000 | Engram gateway API (FastAPI + Uvicorn) |
| frontend | 3000 | Playground UI (Vite + React) |
| db | 5432 | PostgreSQL 16 (persistent data) |
| redis | 6379 | Event streams, semantic caching, rate limiting |
| prometheus | 9090 | Metrics collection |
| grafana | 3001 | Dashboards and alerting |
The app service hot-reloads on code changes (source volumes are mounted in development mode). For production, remove the volume mount and use the pre-built image.
Verify the Stack
# Check all services are running
docker compose ps
# Test the gateway
curl http://localhost:8000/health
# View application logs
docker compose logs -f app
# Access the API docs
open http://localhost:8000/docs # or xdg-open on Linux
Environment Variables
The .env file at the project root is loaded by both Docker Compose and the application.
Key variables:
| Variable | Default | Description |
|---|---|---|
| POSTGRES_USER | admin | PostgreSQL username |
| POSTGRES_PASSWORD | password | PostgreSQL password (change in production!) |
| POSTGRES_DB | translator_db | Database name |
| REDIS_HOST | redis | Redis hostname (use redis for Docker, localhost for local dev) |
| REDIS_PORT | 6379 | Redis port |
| REDIS_PASSWORD | — | Redis password (optional, recommended for production) |
| MODEL_PROVIDER | openai | Default LLM provider for semantic operations |
| ANTHROPIC_API_KEY | — | API key for Claude-powered semantic reasoning |
| PERPLEXITY_API_KEY | — | API key for Perplexity search agent |
| SLACK_API_TOKEN | — | Slack OAuth token for messaging integration |
| SENTRY_DSN | — | Optional Sentry error tracking endpoint |
| HTTPS_ONLY | false | Force HTTPS redirect (set true in production) |
| RATE_LIMIT_DEFAULT | 100/minute | Default API rate limit |
| RATE_LIMIT_ENABLED | true | Enable/disable rate limiting |
| AUTH_JWT_SECRET | — | JWT signing secret (auto-generated if not set) |
| PROVIDER_CREDENTIALS_ENCRYPTION_KEY | — | Fernet key for encrypting stored credentials |
| LOG_LEVEL | INFO | Application log level |
| ENVIRONMENT | development | Environment name (development, staging, production) |
Warning: Never commit .env to version control. The .gitignore excludes it by default. For Kubernetes deployments, use Secrets objects instead.
Example .env for Production
# .env (production)
ENVIRONMENT=production
HTTPS_ONLY=true
POSTGRES_USER=engram_prod
POSTGRES_PASSWORD=<strong-random-password>
POSTGRES_DB=engram_production
REDIS_HOST=redis
REDIS_PASSWORD=<redis-password>
AUTH_JWT_SECRET=<64-char-random-string>
PROVIDER_CREDENTIALS_ENCRYPTION_KEY=<fernet-key>
ANTHROPIC_API_KEY=sk-ant-...
SENTRY_DSN=https://...@sentry.io/...
RATE_LIMIT_DEFAULT=50/minute
LOG_LEVEL=WARNING
Staging Configuration
docker compose -f docker-compose.staging.yml up -d
The staging compose file is a stripped-down variant optimized for pre-production validation. Compared to the full development stack:
| Feature | Development | Staging |
|---|---|---|
| Source volume mounts | ✅ (hot-reload) | ❌ (pre-built image) |
| Grafana | ✅ | ❌ |
| Prometheus | ✅ | ✅ (minimal retention) |
| Resource limits | None | CPU/memory constrained |
| Environment | development | staging |
| Log level | INFO | WARNING |
This is useful for validating deployment procedures, testing database migrations, and catching issues that only appear in container environments.
Kubernetes Deployment
Kubernetes manifests live in monitoring/k8s/. They include Deployments, Services,
ConfigMaps, Secrets, and optional HorizontalPodAutoscalers for the gateway, worker, and scheduler
components.
Apply All Manifests
# Apply all manifests
kubectl apply -f monitoring/k8s/
# Check status
kubectl get pods -l app=engram
kubectl logs -f deployment/engram-gateway
Architecture
The Kubernetes deployment separates concerns into distinct workloads:
| Component | Replicas | Purpose |
|---|---|---|
| engram-gateway | 2+ (HPA) | FastAPI API server — handles all HTTP traffic |
| engram-worker | 1+ | Background task worker — processes queued tasks |
| engram-scheduler | 1 | Workflow scheduler — triggers scheduled workflows |
| engram-listener | 1 | Event listener — processes Redis Stream events |
PostgreSQL and Redis should be provisioned as managed services (RDS, ElastiCache, Cloud SQL, Memorystore) in production rather than running as pods. The manifests include environment variable references to external service endpoints.
Secrets
# Create secrets from .env file
kubectl create secret generic engram-secrets \
--from-literal=POSTGRES_PASSWORD=<password> \
--from-literal=AUTH_JWT_SECRET=<secret> \
--from-literal=ANTHROPIC_API_KEY=<key> \
--from-literal=PROVIDER_CREDENTIALS_ENCRYPTION_KEY=<fernet-key>
Health Checks
The gateway exposes health check endpoints that Kubernetes uses for liveness and readiness probes:
livenessProbe:
httpGet:
path: /health
port: 8000
initialDelaySeconds: 10
periodSeconds: 30
readinessProbe:
httpGet:
path: /health
port: 8000
initialDelaySeconds: 5
periodSeconds: 10
Horizontal Pod Autoscaler
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: engram-gateway-hpa
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: engram-gateway
minReplicas: 2
maxReplicas: 10
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
Monitoring Stack
Engram exposes Prometheus metrics at /metrics via
prometheus-fastapi-instrumentator. The included prometheus.yml and Grafana
provisioning files give you dashboards for:
- Request metrics — Rate, latency (p50/p95/p99), and error rate per endpoint
- Tool routing — Backend selection distribution, routing confidence scores, cache hit rates
- Self-healing — Drift detection frequency, auto-repair success rate, manual review queue depth
- Circuit breaker — Trip count, cooldown events, per-destination failure tracking
- Task queue — Queue depth, processing latency, lease expiration, retry count
- Swarm Memory — Fact count, query latency, conflict resolution frequency
- Event stream — Redis Streams lag, consumer group health, event processing rate
Access Grafana
open http://localhost:3001 # Default credentials: admin / admin
Pre-built dashboards are auto-provisioned from monitoring/grafana/dashboards/. No manual
configuration needed — just log in and the dashboards are ready.
Prometheus Configuration
The included monitoring/prometheus.yml targets the Engram gateway:
scrape_configs:
- job_name: 'engram'
scrape_interval: 15s
static_configs:
- targets: ['app:8000']
Alerting
Configure Grafana alerts for critical conditions:
| Alert | Condition | Action |
|---|---|---|
| High error rate | > 5% 5xx responses in 5 minutes | Page on-call |
| Circuit breaker tripped | Any destination circuit opens | Notify team |
| Task queue backup | Queue depth > 100 for 10 minutes | Scale workers |
| Drift detection spike | > 10 drifts detected in 1 hour | Review tool registrations |
Persistent State
| Event | Database | Redis | Tool Registry | Host Config |
|---|---|---|---|---|
| docker compose restart | ✅ Persists (volume) | ✅ Persists (in-memory) | ✅ Persists | ✅ Persists |
| docker compose down | ✅ Persists (named volume) | ❌ Lost | ✅ Persists | ✅ Persists |
| docker compose down -v | ❌ Lost | ❌ Lost | ❌ Lost | ✅ Preserved |
| Container image update | ✅ Persists | Depends | ✅ Persists | ✅ Persists |
Important: The postgres_data named volume preserves your database across docker compose down. To truly reset everything, add -v to remove volumes. The ~/.engram/ directory on the host is never affected by Docker operations.
Troubleshooting
| Symptom | Cause | Fix |
|---|---|---|
| App can't connect to database | Docker internal DNS not ready | Ensure db service is healthy: docker compose ps. Check depends_on ordering. |
| Redis connection refused | Redis hasn't started yet | Check depends_on ordering, or restart: docker compose restart redis |
| Port already in use | Another service on 8000/5432/6379 | Change port mappings in docker-compose.yml |
| Grafana shows no data | Prometheus not scraping | Check monitoring/prometheus.yml targets match service names |
| Slow startup on first run | Pulling images, installing deps | Subsequent starts are fast (cached layers). First pull is ~2GB. |
| Database migration errors | Schema out of sync | Run docker compose exec app alembic upgrade head |
| asyncpg connection errors | SSL mode incompatibility | Engram auto-strips sslmode for asyncpg. Check DATABASE_URL format. |
What's Next
- Configuration — Customize every setting for your deployment
- EAT Identity & Security — Harden authentication for production
- Observability & Tracing — Set up monitoring and alerting
- Updating & Uninstalling — Keep your deployment current
Updating & Uninstalling
How to update Engram to the latest version, roll back if something breaks, and cleanly remove everything when you're done.
Updating
Standard Update (Git + pip)
cd /path/to/engram_translator
git pull origin main
pip install -r requirements.txt
alembic upgrade head # Apply any new database migrations
If you used the one-line installer, pull the latest code into the installation directory and the self-healing entry point will automatically detect and install new dependencies on the next launch.
What Happens During an Update
1. git pull — Fetches the latest code from the main branch. This includes new features, bug fixes, ontology updates, and ML model improvements.
2. pip install -r requirements.txt — Picks up new or changed dependencies. Existing packages are skipped if they're already at the correct version.
3. alembic upgrade head — Applies database schema migrations. This is safe to run even if no migrations are pending — it's a no-op in that case.
4. Self-healing entry point — On the next launch via ./engram or .\engram.bat, the entry point re-validates all imports and reinstalls if anything is missing.
Post-Update Validation
engram info # Verify configuration is intact
engram tools list # Confirm tool registry is accessible
engram heal status # Check for any new drifts after update
If engram info shows the correct API URL and authentication status, you're good. If
engram heal status shows new drifts, this is expected — the update may have refined the
ontology mappings. Review them or run engram heal now to auto-repair.
Rollback
cd /path/to/engram_translator
# List recent commits
git log --oneline -10
# Roll back to a specific commit
git checkout <commit-hash>
pip install -r requirements.txt
alembic downgrade -1 # Revert the last migration if needed
Warning: Rolling back database migrations can cause data loss if the newer migration added columns that already contain data. Always back up your database before downgrading:
# PostgreSQL
pg_dump translator_db > backup.sql
# SQLite
cp ./engram.db ./engram.db.backup
Rollback Strategies
| Scenario | Strategy |
|---|---|
| Code bug (no schema changes) | git checkout <hash> — no migration needed |
| Bad migration | alembic downgrade -1 + git checkout <hash> |
| Full reset | git checkout <hash> + restore database backup |
Updating Docker Deployments
cd /path/to/engram_translator
git pull origin main
docker compose build app
docker compose up -d app # Zero-downtime rolling restart
The up -d command only restarts services that have changed. Since the app
image was rebuilt, only the gateway container restarts. Database and Redis continue running.
Kubernetes
# Update the image tag
kubectl set image deployment/engram-gateway engram=engram:latest
# Watch the rollout
kubectl rollout status deployment/engram-gateway
# If something goes wrong, roll back
kubectl rollout undo deployment/engram-gateway
For production, use specific image tags (not latest) and update them in your manifests:
spec:
containers:
- name: engram
image: engram:v1.2.3 # Pin to specific version
Uninstalling
Remove the CLI
# Remove the global command
rm -f ~/bin/engram
rm -f ~/.local/bin/engram
Remove the Codebase
rm -rf /path/to/engram_translator
Remove Configuration and Data (Optional)
rm -rf ~/.engram
Note: Keep ~/.engram/ if you plan to reinstall later. It contains your config.yaml, EAT tokens, encrypted credentials, and swarm memory database. Removing it is a full reset.
Remove the Background Service
# Linux (systemd)
sudo systemctl stop engram && sudo systemctl disable engram
sudo rm /etc/systemd/system/engram.service
sudo systemctl daemon-reload
# macOS (launchd)
launchctl remove com.useengram.daemon
rm ~/Library/LaunchAgents/com.useengram.daemon.plist
Remove Docker Resources
cd /path/to/engram_translator
docker compose down -v # Stop all services and remove volumes
docker rmi engram:latest # Remove the image
Remove the Keyring Entry
# Python
python -c "import keyring; keyring.delete_password('engram', 'eat_token')"
Complete Uninstall Checklist
| Item | Command | What it removes |
|---|---|---|
| CLI wrapper | rm ~/bin/engram | The engram command |
| Codebase | rm -rf /path/to/engram_translator | All source code |
| Configuration | rm -rf ~/.engram | Config, tokens, swarm memory |
| Background service | systemctl disable engram | Auto-start daemon |
| Docker resources | docker compose down -v | Containers, volumes, networks |
| Keyring entry | keyring.delete_password(...) | Stored EAT token |
What's Next
- Installation — Reinstall from scratch
- Docker & Kubernetes Setup — Deploy with containers
CLI Reference
Engram's CLI is a full terminal interface — not a web UI. It features an interactive REPL with all
subcommands available inline, Rich-powered output with tables, panels, trees, and progress spinners,
an animated banner, --json machine-readable output mode, and a visual TUI debugging
dashboard. Built for people who live in the terminal.
Running the CLI
# Start the gateway + interactive REPL (default)
engram run
# Start with the visual TUI debugging dashboard
engram run --debug
# Bind to a custom host/port
engram run --host 0.0.0.0 --port 9000
When you run engram run, the CLI:
- Suppresses all startup noise (import logs, uvicorn output)
- Starts the FastAPI backend in a background daemon thread
- Waits for the server to be ready (up to 60 seconds)
- Clears the screen and plays the animated ENGRAM banner
- Prints the gateway URL and API docs link
- Drops into the interactive REPL
The REPL prompt is:
$ engram
From here, type any Engram subcommand without the engram prefix:
$ engram tools list
$ engram route test "send a message"
$ engram heal status
$ engram auth whoami
REPL Built-in Commands
| Command | Description |
|---|---|
| help | Display all available commands in a formatted table |
| clear | Clear the screen and reprint the ENGRAM banner |
| exit / quit / q | Shut down the gateway and exit |
| Any other input | Delegated to the Typer CLI via subprocess |
Tip: The REPL uses Rich's Console.input() for styled prompts. On terminals that support it, you get full color and Unicode rendering.
Debug TUI Mode
engram run --debug
This launches the full Textual-based TUI dashboard (tui/app.py) instead of the REPL. The
TUI provides:
| Panel | Location | What it shows |
|---|---|---|
| Connections | Top-left | Live connection events to external services |
| Agent Execution | Top-right | Agent step events during orchestration |
| Tool Usage | Middle-left | Tool invocation events with payloads |
| Responses | Middle-right | Response events from tools and agents |
| Translation | Center | Three-panel view: Engram Task → Tool Request → Tool Response |
| System Status | Right sidebar | FastAPI engine, discovery service, task worker status |
| Task Tracker | Right sidebar | Current task, progress with per-step agent tracking |
| Connected Services | Right sidebar | Status of each provider (Claude, Slack, etc.) |
| Log View | Bottom | Timestamped log stream of all events |
| Command Input | Bottom bar | Input field for tasks and /commands |
The TUI requires authentication — it shows an inline login form on first launch. Credentials are
encrypted with Fernet and stored in ~/.engram/config.enc.
Global Options
These flags apply to every engram command:
| Flag | Type | Default | Description |
|---|---|---|---|
| --json | bool | false | Output in machine-readable JSON format |
| --config | Path | ~/.engram/config.yaml | Path to a custom config file |
| --help | — | — | Show help text for any command |
JSON Output Mode
When you pass --json, all Rich-formatted output (tables, panels, trees) is replaced with
structured JSON. This makes every command scriptable and pipeable:
# Human-readable (default)
engram tools list
# Machine-readable
engram --json tools list | jq '.[] | select(.backend == "MCP")'
# Use in scripts
TOOL_COUNT=$(engram --json tools list | jq 'length')
Exit codes follow standard conventions:
- 0 — Success
- 1 — Error (authentication failure, API error, invalid input)
Core Commands
engram init
Initialize the Engram configuration and directory structure.
engram init
Creates ~/.engram/ and writes the default config.yaml. Safe to run multiple
times — it overwrites with defaults.
Output:
╭──── Initialization Success ──────────────────────╮
│ Initialized Engram directory at ~/.engram │
│ Config saved to ~/.engram/config.yaml │
╰──────────────────────────────────────────────────╯
engram info
Display current CLI configuration and system status.
engram info
Shows the config file path, API URL, backend preference, authentication status, and a masked EAT token preview.
Output:
System Information
┏━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Key ┃ Value ┃
┡━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Config Path │ ~/.engram/config.yaml │
│ API URL │ http://127.0.0.1:8000 │
│ Backend │ mcp │
│ Auth Status │ Authenticated │
│ EAT Token │ ****abc1 │
└─────────────┴───────────────────────────────┘
engram run
Start the Engram Protocol Bridge — interactive CLI mode.
engram run [OPTIONS]
| Option | Type | Default | Description |
|---|---|---|---|
| --host | str | 127.0.0.1 | Host to bind the backend |
| --port | int | 8000 | Port to run the backend |
| --debug | bool | false | Start TUI dashboard instead of REPL |
Auth Subgroup
Manage authentication and EAT (Engram Authorization Tokens).
engram auth login
Authenticate with the Engram backend to retrieve an EAT.
engram auth login [OPTIONS]
| Option | Type | Default | Description |
|---|---|---|---|
| --token, -t | str | — | Directly input an EAT token (skips browser flow) |
| --browser / --no-browser | bool | true | Open login page in browser |
Behavior:
1. If --token is provided, saves it directly to the keyring
2. Otherwise, opens the login URL in your browser
3. Prompts you to paste your EAT token (hidden input)
4. Saves the token and displays your identity via auth whoami
# Interactive login (opens browser)
engram auth login
# Direct token input
engram auth login --token eyJhbGciOiJIUzI1NiJ9...
# No browser (manual URL copy)
engram auth login --no-browser
engram auth whoami
Display current identity and a semantic permission summary.
engram auth whoami
Decodes the EAT JWT (without verifying the signature, since the CLI is a trusted client) and renders a Rich tree showing:
╭──── Current Session Identity ────────────────────╮
│ 👤 Identity: user@company.com │
│ ├── 🔐 Permissions (EAT Structured) │
│ │ ├── slack │
│ │ │ ├── send_message │
│ │ │ └── list_channels │
│ │ └── docker │
│ │ ├── run │
│ │ └── ps │
│ ├── 🔬 Semantic Scopes (Ontology-based) │
│ │ ├── execute:tool-invocation │
│ │ │ └── Can invoke cross-protocol translations│
│ │ └── read:ontology-metadata │
│ │ └── Can query tool catalogs │
╰──────────────────────────────────────────────────╯
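Decoding a JWT payload without signature verification is a standard trick for trusted-client display purposes like this. A minimal sketch of the idea (the claim names here are made up for illustration, not Engram's actual EAT schema):

```python
import base64
import json

def decode_jwt_payload(token: str) -> dict:
    """Decode a JWT's payload segment without verifying the signature."""
    payload_b64 = token.split(".")[1]
    # Restore the base64url padding that JWT encoding strips
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))

# Build a toy token (header.payload.signature) to demonstrate
def _b64url(obj: dict) -> str:
    return base64.urlsafe_b64encode(json.dumps(obj).encode()).rstrip(b"=").decode()

claims = {"sub": "user@company.com", "scopes": ["execute:tool-invocation"]}
token = f"{_b64url({'alg': 'HS256'})}.{_b64url(claims)}.sig"
print(decode_jwt_payload(token)["sub"])  # user@company.com
```

Because nothing is verified, this approach is only appropriate for local display; the backend still validates the signature on every API call.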
engram auth scope
Explore and visualize the semantic scopes assigned to your EAT.
engram auth scope
Renders a table mapping each semantic scope to its ontology context and capability:
Semantic Access Scopes
╔═══════════════════════════╦══════════════╦═══════════════════════╗
║ Scope Identifier ║ Ontology Ctx ║ Capability ║
╠═══════════════════════════╬══════════════╬═══════════════════════╣
║ execute:tool-invocation ║ Global ║ Translation Execution ║
║ read:ontology-metadata ║ Global ║ Metadata Query ║
╚═══════════════════════════╩══════════════╩═══════════════════════╝
engram auth status
Check current authentication status with expiration details.
engram auth status
engram auth token-set
Manually set the Engram Authorization Token.
engram auth token-set <token>
Config Subgroup
View and modify CLI configuration.
engram config show
Display the current configuration as a key-value table.
engram config show
engram config set
Set a configuration value.
engram config set <key> <value>
Supported keys match the EngramConfig model fields:
| Key | Type | Default | Description |
|---|---|---|---|
| api_url | str | http://127.0.0.1:8000 | Base URL for the Engram API |
| backend_preference | enum | mcp | Default backend: mcp or cli |
| model_provider | str | openai | AI model provider name |
| verbose | bool | false | Enable verbose logging |
# Examples
engram config set api_url http://my-server:8000
engram config set backend_preference cli
engram config set model_provider anthropic
engram config set verbose true
The config is saved to ~/.engram/config.yaml in YAML format. Values are type-coerced
automatically (booleans from "true"/"false", integers from numeric
strings).
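The coercion described above can be sketched as a small helper (illustrative only; Engram's actual implementation may differ):

```python
def coerce(value: str):
    """Coerce a CLI string into bool/int/float where it clearly parses as one."""
    lowered = value.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            pass
    return value  # leave everything else as a plain string

print(coerce("true"), coerce("8000"), coerce("mcp"))
```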
Tools Subgroup
Discover and manage tools (MCP & CLI).
engram tools list
List all registered tools with backend type, semantic tags, and performance stats.
engram tools list [OPTIONS]
| Option | Type | Default | Description |
|---|---|---|---|
| --popular | bool | false | Include pre-optimized wrappers for popular apps |
| --filter, -f | str | — | Quick fuzzy filter for tool names or descriptions |
The fuzzy filter uses difflib.SequenceMatcher and scores matches across tool name,
description, and tags. Results are sorted by match score descending.
# List custom tools only
engram tools list
# Include popular pre-optimized apps
engram tools list --popular
# Fuzzy search
engram tools list --filter "weather"
engram tools list -f docker
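The difflib-based scoring described above can be sketched like this (the scoring details are an assumption; only the use of difflib.SequenceMatcher across name, description, and tags comes from the docs):

```python
from difflib import SequenceMatcher

def match_score(query: str, tool: dict) -> float:
    """Best fuzzy ratio across a tool's name, description, and tags."""
    fields = [tool["name"], tool["description"], " ".join(tool["tags"])]
    return max(SequenceMatcher(None, query.lower(), f.lower()).ratio() for f in fields)

tools = [
    {"name": "weather-api", "description": "Fetch current weather", "tags": ["forecast"]},
    {"name": "docker", "description": "Container management", "tags": ["containers"]},
]
# Sort by match score descending, as `tools list --filter` does
ranked = sorted(tools, key=lambda t: match_score("weather", t), reverse=True)
print(ranked[0]["name"])  # weather-api
```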
engram tools search
Search for tools using fuzzy matching. Shortcut for tools list --filter.
engram tools search <query> [OPTIONS]
| Option | Type | Default | Description |
|---|---|---|---|
| --popular / --no-popular | bool | true | Include popular app catalog in search |
engram tools search slack
engram tools search "file management" --no-popular
Register Subgroup
Onboard and register new APIs, CLI manifests, or direct shell commands.
engram register openapi
Universal onboarding for APIs via OpenAPI specs or partial documentation.
engram register openapi <source> [OPTIONS]
| Argument/Option | Type | Description |
|---|---|---|
| source | str | URL, local file path, or documentation text |
| --agent-id | str | Agent UUID to link the tool to (auto-detected if omitted) |
| --partial | bool | Treat source as partial documentation text |
# From URL
engram register openapi https://api.example.com/openapi.json
# From local file
engram register openapi ./specs/weather-api.yaml
# From partial documentation
engram register openapi "The weather API has a GET /current endpoint that takes a city parameter" --partial
The registration pipeline:
- Source validation — URL fetch, local file read, or partial text detection
- Schema parsing — OpenAPI spec validation and endpoint extraction
- Dual schema generation — Creates both an MCP tool definition and a CLI wrapper
- Ontology alignment — Maps fields through protocols.owl, resolves 3+ schema mismatches
- Registry storage — Tool is immediately discoverable by agents
engram register command
Onboard a local CLI tool by parsing its help text and generating a semantic wrapper.
engram register command <command> [OPTIONS]
| Argument/Option | Type | Description |
|---|---|---|
| command | str | The shell command to register (e.g., docker, kubectl, git) |
| --agent-id | str | Agent UUID to link the tool to |
engram register command docker
engram register command kubectl
engram register command ffmpeg
The system probes the shell for the command, parses its --help output, discovers
subcommands, infers argument types, and synthesizes a semantic wrapper with both MCP and CLI
schemas.
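Help-text parsing of this kind usually boils down to a couple of line-oriented regexes. A simplified sketch, using a made-up help string (Engram's real probe is certainly more thorough):

```python
import re

SAMPLE_HELP = """Usage: mytool [OPTIONS] COMMAND

Commands:
  run     Start the service
  stop    Stop the service

Options:
  --port INTEGER  Port to bind
  --verbose       Enable verbose output
"""

def parse_help(text: str):
    """Pull subcommands and flags out of --help output (simplified heuristics)."""
    commands_part, options_part = text.split("Options:")
    # Indented word followed by a run of spaces -> subcommand
    subcommands = re.findall(r"^\s{2}(\w+)\s{2,}", commands_part, re.M)
    # Indented --flag, optionally followed by an ALL-CAPS type token
    options = dict(re.findall(r"^\s{2}(--[\w-]+)(?:\s+([A-Z]+)\b)?", options_part, re.M))
    return subcommands, options

subs, opts = parse_help(SAMPLE_HELP)
print(subs, opts)
```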
engram register tool
Start an interactive session to manually register a new tool.
engram register tool
The wizard prompts for:
- Tool Name — Human-readable name
- Description — What the tool does
- Base URL — The API's base URL (e.g., https://api.weather.com)
- Path — The endpoint path (e.g., /v1/current)
- HTTP Method — GET, POST, PUT, or DELETE
- Parameters — Name, type, description, and required flag for each parameter (repeat until done)
After confirming, the tool is registered via POST /api/v1/registry/manual with a
synthetic OpenAPI schema.
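The synthetic schema assembled from the wizard answers might look roughly like this (an illustrative shape only, not Engram's exact payload):

```python
def build_openapi_stub(name, description, base_url, path, method, params):
    """Assemble a minimal OpenAPI 3 document from wizard-style answers."""
    return {
        "openapi": "3.0.0",
        "info": {"title": name, "description": description, "version": "1.0.0"},
        "servers": [{"url": base_url}],
        "paths": {path: {method.lower(): {
            "parameters": [
                {"name": p["name"], "in": "query", "required": p["required"],
                 "schema": {"type": p["type"]}}
                for p in params
            ],
            "responses": {"200": {"description": "OK"}},
        }}},
    }

spec = build_openapi_stub(
    "Weather", "Current conditions", "https://api.weather.com",
    "/v1/current", "GET",
    [{"name": "city", "type": "string", "required": True}],
)
print(spec["paths"]["/v1/current"]["get"]["parameters"][0]["name"])  # city
```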
Route Subgroup
Test and visualize performance-weighted routing decisions.
engram route test
Simulate routing for a task description and display choice reasoning.
engram route test <description> [OPTIONS]
| Argument/Option | Type | Description |
|---|---|---|
| description | str | Natural language description of the task |
| --force-mcp | bool | Force routing to MCP backend |
| --force-cli | bool | Force routing to CLI backend |
# Natural routing
engram route test "deploy the application to staging"
# Force a specific backend
engram route test "list all docker containers" --force-cli
engram route test "send a notification" --force-mcp
Output includes:
- Optimal Routing Decision panel — Chosen tool, backend, confidence, predicted latency, estimated cost, and reasoning
- Alternative Backends Comparison table — All candidates with composite score, similarity, latency, and success rate
engram route list
Display a sorted table of tools with historical performance statistics.
engram route list
Shows tool name, backend, average latency, success rate, average token cost, and sample count for all tools with routing history.
Heal Subgroup
Inspect and trigger semantic self-healing for tool drifts.
engram heal status
Query the reconciliation engine for detected semantic drifts and pending repairs.
engram heal status [OPTIONS]
| Option | Type | Default | Description |
|---|---|---|---|
| --verbose, -v | bool | false | Show full logs and detailed drift analysis |
| --fix | bool | false | Trigger immediate repair loops if drifts are found |
Output includes two tables:
- Semantic Drift Analysis — Source protocol, field drift, ontology match, confidence, and status (AUTO-REPAIR or PENDING-REVIEW)
- Persistent Semantic Mappings — Active mappings between protocols with version numbers
# Basic status
engram heal status
# Detailed with payload excerpts
engram heal status --verbose
# Check and fix in one command
engram heal status --fix
When --verbose is used, each pending drift includes a full JSON panel showing the
payload excerpt and failure type.
engram heal now
Trigger immediate semantic repair loops for all detected drifts.
engram heal now
Calls POST /api/v1/reconciliation/heal and displays progress with Rich spinners. The
engine queries the drift database, re-aligns with the semantic ontology, and synchronizes mapping
tables.
Trace Subgroup
Observability and semantic execution tracing.
engram trace list
Renders a filterable Rich table of recent semantic execution traces.
engram trace list [OPTIONS]
| Option | Type | Default | Description |
|---|---|---|---|
| --limit | int | 20 | Number of traces to show |
| --tool | str | — | Filter by tool name |
| --export | bool | false | Export as JSON for piping |
# Recent traces
engram trace list
# Filter by tool
engram trace list --tool slack
# Export for analysis
engram trace list --export | jq '.[] | select(.success == false)'
engram trace detail
Detailed inspection including semantic path, routing reasoning, and healing steps.
engram trace detail <trace_id>
| Argument | Type | Description |
|---|---|---|
| trace_id | str | Trace ID to inspect. Use . for the latest trace. |
# Inspect a specific trace
engram trace detail 7a3f2b1c
# Shortcut: inspect the latest trace
engram trace detail .
# Export full trace as JSON
engram trace detail . --export
Output includes:
- AI-generated Summary — Natural-language reasoning about the routing and healing decisions (generated by the backend LLM via POST /api/v1/traces/query)
- Semantic Trace Tree — Rich tree visualization:
  - Execution Path — Tool selection, routing choice, backend, latency, success/failure
  - Performance Weights — Semantic similarity, composite score, token efficiency
  - Self-Healing Steps — Any reconciliation steps taken during execution
  - Ontological Alignment — Context interpretation and synthesized field mappings
Evolve Subgroup
Manage self-evolving tools and ML-driven improvements.
engram evolve status
Display ML improvement progress in a dashboard-like Rich layout.
engram evolve status
Shows:
- Pipeline status (active/idle)
- Pending proposal count
- Total historical evolutions
- Last ML update timestamp
- Table of pending refinements with tool name, version change, refinement type, proposed changes, confidence score, and proposal ID

Refinement types include:
- Description Path Refinement — Improved tool descriptions based on execution history
- Parameter Schema Optimization — Tightened action schemas based on failure analysis
- New Recovery Strategy — Pattern-based automated fallback mapping
engram evolve apply
Trigger updates with confirmation prompts and show before/after diffs.
engram evolve apply <evolution_id> [OPTIONS]
| Argument/Option | Type | Description |
|---|---|---|
| evolution_id | str | ID (or prefix) of the evolution proposal to apply |
| --force, -f | bool | Apply without confirmation prompt |
# Interactive apply with diff preview
engram evolve apply 7a3f2b1c
# Skip confirmation
engram evolve apply 7a3f2b1c --force
The command shows a preview of changes (before/after diffs) and asks for confirmation before applying. Once applied, the tool registry is hot-redeployed with the new version.
Protocol Subgroup
Federated protocol management and translation.
engram protocol translate
Perform real-time translation between protocols using the system ontology as a bridge.
engram protocol translate [OPTIONS]
| Option | Type | Description |
|---|---|---|
| --from | str | Source protocol: mcp, cli, a2a, acp |
| --to | str | Target protocol: mcp, cli, a2a, acp |
| --payload, -p | str | JSON payload or path to JSON file (optional — uses demo payload if omitted) |
# Translate MCP to CLI (demo payload)
engram protocol translate --from mcp --to cli
# Translate with custom payload
engram protocol translate --from a2a --to mcp --payload '{"task": "search", "query": "AI news"}'
# Translate from file
engram protocol translate --from cli --to a2a --payload ./request.json
Output: Three side-by-side panels showing Source → Canonical Bridge (Ontology) → Target.
engram protocol handoff simulate
Simulate a seamless multi-agent task handoff, demonstrating semantic state transfer.
engram protocol handoff simulate [OPTIONS]
| Option | Type | Default | Description |
|---|---|---|---|
| --source-agent | str | CLI-Local | Name of the source agent |
| --target-agent | str | Remote-MCP | Name of the target agent |
Output: A Rich tree showing session ID, semantic readiness, bridged protocols, and transferred state (Redis-backed) with full JSON payloads for each state category.
Sync Subgroup
Manage bidirectional synchronization and event monitoring.
engram sync list
List active event listeners, pollers, and CLI watchers.
engram sync list
engram sync add
Add a new bidirectional sync or event listener to a tool.
engram sync add <tool_id> [OPTIONS]
| Option | Type | Default | Description |
|---|---|---|---|
| --direction | str | both | Sync direction: both, to_mcp, from_mcp |
| --type | str | polling | Source type: polling or cli_watch |
| --url | str | — | URL for polling (required for polling type) |
| --command | str | — | Command for CLI watch (required for cli_watch type) |
| --interval | int | 60 | Polling interval in seconds |
# Add polling sync
engram sync add <tool-uuid> --type polling --url https://api.example.com/changes --interval 30
# Add CLI watch sync
engram sync add <tool-uuid> --type cli_watch --command "docker ps --format json"
engram sync status
Show live monitoring of recent events and semantic conflict resolutions.
engram sync status
Uses Rich's Live display with auto-refresh (2 frames/second) to show a real-time event
stream. Press Ctrl+C to stop monitoring.
Interactive REPL Reference
When running inside engram run, all subcommands are available without the
engram prefix. The REPL delegates to the Typer CLI via subprocess, so every feature
works identically.
Full Command Map
┌─────────────────────────────────────────────────────────────────┐
│ Available Commands │
├───────────────────────────────┬─────────────────────────────────┤
│ Command │ Description │
├───────────────────────────────┼─────────────────────────────────┤
│ tools list │ List all registered tools │
│ tools search <query> │ Search tools by name or tag │
│ register openapi <url> │ Register from OpenAPI spec │
│ register command <cmd> │ Register a shell command │
│ register tool │ Interactive manual registration │
│ route test <tool> │ Test routing decision │
│ route list │ Show tools with routing stats │
│ trace list │ List recent execution traces │
│ trace detail <id> │ Inspect a specific trace │
│ heal status │ Self-healing status │
│ heal now │ Trigger immediate repair loop │
│ evolve status │ ML improvement dashboard │
│ evolve apply <id> │ Apply a proposed refinement │
│ protocol translate │ Translate between protocols │
│ protocol handoff simulate │ Simulate multi-agent handoff │
│ sync list │ List sync connections │
│ auth whoami │ Show identity & scopes │
│ clear │ Clear the screen │
│ exit │ Shut down the gateway │
└───────────────────────────────┴─────────────────────────────────┘
TUI Dashboard Reference
The TUI dashboard (engram run --debug) is a full Textual application with multiple
screens, interactive forms, and real-time event routing.
Screens
| Screen | Trigger | Purpose |
|---|---|---|
| Welcome | On startup (if authenticated) | Animated logo and system status |
| Auth | On startup (if not authenticated) | Login/Signup form with base URL, email, password |
| Debug | --debug flag | Live trace panels and event monitors |
| Provider Selection | Service setup (S key) | Choose an AI provider to connect |
| Service Connect | Provider button click | Enter API key for a specific provider |
Provider Connection Screens
Each provider has a dedicated connection screen with branding and instructions:
| Provider | Screen Class | Auth Type |
|---|---|---|
| OpenAI | OpenAIConnectScreen | API Key |
| Claude (Anthropic) | AnthropicConnectScreen | API Key |
| Gemini (Google) | GoogleConnectScreen | API Key |
| Llama | LlamaConnectScreen | API Key |
| Mistral | MistralConnectScreen | API Key |
| Grok | GrokConnectScreen | API Key |
| Perplexity | PerplexityConnectScreen | API Key |
| DeepSeek | DeepseekConnectScreen | API Key |
| Other | GenericServiceConnectScreen | API Key |
Event Routing
The TUI routes events from the backend to specific trace panels based on event type prefix:
| Event Type Prefix | Panel | Example |
|---|---|---|
| connection.* | Connections | Connection established, connection lost |
| agent.* | Agent Execution | Agent step started, agent step completed |
| tool.* | Tool Usage | Tool invoked, tool response received |
| response.* | Responses | Final response generated |
| translation.* | Translation panels | Engram task, tool request, tool response |
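Prefix dispatch like this is typically a one-line lookup. A minimal sketch (panel names from the table above; the fallback to the log view is an assumption):

```python
PANEL_BY_PREFIX = {
    "connection": "Connections",
    "agent": "Agent Execution",
    "tool": "Tool Usage",
    "response": "Responses",
    "translation": "Translation panels",
}

def panel_for(event_type: str) -> str:
    """Route an event to a TUI panel by the segment before the first dot."""
    prefix = event_type.split(".", 1)[0]
    return PANEL_BY_PREFIX.get(prefix, "Log View")  # unmatched events fall through

print(panel_for("tool.invoked"))  # Tool Usage
```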
Task Tracking
The TUI automatically parses orchestration events to build a live task tracker:
- Orchestration Plan detected → Sets total step count
- Handing off to [Agent] → Marks step as RUNNING
- Step N OK → Marks step as COMPLETED
The tracker shows current task text, status (IDLE → SUBMITTING → PLANNED → RUNNING → COMPLETED), per-step progress with agent names, and active connector list.
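The event-parsing logic above can be sketched as a small state machine. The exact message formats here are assumptions based on the patterns listed (e.g., that the plan line carries a step count):

```python
import re

def update_tracker(tracker: dict, line: str) -> dict:
    """Fold one orchestration log line into the tracker state (illustrative formats)."""
    if line.startswith("Orchestration Plan"):
        if m := re.search(r"(\d+)\s+steps", line):
            tracker.update(total=int(m.group(1)), status="PLANNED")
    elif m := re.match(r"Handing off to (\S+)", line):
        tracker.update(status="RUNNING", agent=m.group(1))
    elif m := re.match(r"Step (\d+) OK", line):
        tracker["completed"] = int(m.group(1))
        if tracker["completed"] == tracker.get("total"):
            tracker["status"] = "COMPLETED"
    return tracker

t = {"completed": 0, "status": "IDLE"}
for line in ["Orchestration Plan (2 steps)", "Handing off to Researcher",
             "Step 1 OK", "Handing off to Writer", "Step 2 OK"]:
    update_tracker(t, line)
print(t["status"])  # COMPLETED
```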
TUI Bridge
The app/core/tui_bridge.py module provides the event pipeline between the FastAPI
backend and the TUI:
- emit_tui_event(event) — Thread-safe event push to the shared async queue
- tui_logger_processor — Structlog processor that translates technical log events into plain-English TUI messages with emojis:
  - "Translating message" → 🔄 Translating message from MCP to CLI...
  - "Applied version delta" → ✨ MCP message upgraded: v1.0 ➡️ v1.1
  - "Translation failed" → ❌ Translation failed: <error>
  - "No translation rule found" → ⚠️ Missing map: No path found for MCP to ACP
  - "Version mismatch detected" → ⚖️ Version mismatch in MCP: Found v1.0, expected v1.1
Credential Storage (TUI)
The TUI stores credentials differently from the CLI:
| CLI | TUI |
|---|---|
| System keyring (keyring library) | Fernet-encrypted file (~/.engram/config.enc) |
| config.yaml fallback | Encryption key in ~/.engram/key (chmod 600) |
| Plain text config | Encrypted JSON with base URL, tokens, email |
Both paths are valid. The TUI's encrypted storage is designed for environments where the system keyring isn't available (headless servers, containers).
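The Fernet round-trip behind the TUI's encrypted store looks roughly like this (a sketch assuming the file layout described above; requires the `cryptography` package):

```python
import json
from cryptography.fernet import Fernet

key = Fernet.generate_key()  # the real TUI keeps this at ~/.engram/key (chmod 600)
box = Fernet(key)

creds = {"base_url": "http://127.0.0.1:8000", "email": "user@company.com"}
encrypted = box.encrypt(json.dumps(creds).encode())  # written to ~/.engram/config.enc

# Decrypting with the same key recovers the original credential JSON
restored = json.loads(box.decrypt(encrypted))
print(restored == creds)
```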
What's Next
- Configuration — Customize every setting, routing weight, and ontology path
- Universal Onboarding — Deep dive into tool registration
- Self-Healing Engine — Understand the reconciliation engine
- Observability & Tracing — Set up monitoring and alerting
Configuration
Engram's configuration system spans three layers — a YAML config file, environment variables, and a backend settings model — unified by a smart precedence chain that adapts to your runtime environment automatically.
Config File Location
~/.engram/config.yaml
Created by engram init. This file stores CLI-level preferences that
affect how the engram command behaves. It is separate from the
backend's app/core/config.py settings, which are configured via
environment variables or .env files.
Config File Format
# ~/.engram/config.yaml
api_url: http://127.0.0.1:8000
backend_preference: mcp
model_provider: openai
verbose: false
CLI Configuration Fields
| Field | Type | Default | Description |
|---|---|---|---|
| api_url | str | http://127.0.0.1:8000 | Base URL for the Engram API. Change this when connecting to a remote gateway. |
| backend_preference | enum | mcp | Default backend for tool execution: mcp (structured reliability) or cli (speed). |
| model_provider | str | openai | Default AI model provider for semantic operations. |
| verbose | bool | false | Enable verbose logging for debugging. Shows keyring warnings and detailed error context. |
Modify via CLI:
engram config set api_url http://my-server:8000
engram config set backend_preference cli
engram config set model_provider anthropic
engram config show # Verify changes
Backend Settings Reference
The backend settings are defined in app/core/config.py as a Pydantic
BaseSettings model. These are configured via environment variables
or a .env file at the project root. The backend also loads ~/.engram/config.yaml at startup as a secondary source (environment
variables take precedence).
Core Runtime
| Variable | Type | Default | Description |
|---|---|---|---|
| PROJECT_NAME | str | Agent Translator Middleware | Application name for logging and metadata |
| API_V1_STR | str | /api/v1 | API version prefix for all routes |
| ENVIRONMENT | str | development | Runtime environment: development, staging, production |
| LOG_LEVEL | str | INFO | Application log level: DEBUG, INFO, WARNING, ERROR, CRITICAL |
| MODEL_PROVIDER | str | openai | Default LLM provider |
| BASE_URL | str | http://127.0.0.1:8000 | Base URL for self-referencing |
| DEFAULT_PERSONALITY | str | optimistic | Default personality for agent responses |
Provider API Keys
| Variable | Type | Default | Description |
|---|---|---|---|
| ANTHROPIC_API_KEY | str | — | Anthropic (Claude) API key |
| PERPLEXITY_API_KEY | str | — | Perplexity API key |
| SLACK_API_TOKEN | str | — | Slack OAuth token |
| X_BEARER_TOKEN | str | — | X (Twitter) bearer token |
Routing Configuration
The routing engine uses a weighted composite score to choose the best tool and backend for each task. These weights control the relative importance of each factor:
| Variable | Type | Default | Description |
|---|---|---|---|
| `ROUTING_EMBEDDING_MODEL` | str | `all-MiniLM-L6-v2` | Sentence-transformer model for semantic matching |
| `ROUTING_STATS_WINDOW_HOURS` | int | `168` | Rolling window for performance stats (7 days) |
| `ROUTING_CACHE_TTL_SECONDS` | int | `60` | Redis cache TTL for routing decisions |
| `ROUTING_WEIGHT_SIMILARITY` | float | `0.55` | Weight for semantic similarity score |
| `ROUTING_WEIGHT_SUCCESS` | float | `0.20` | Weight for historical success rate |
| `ROUTING_WEIGHT_LATENCY` | float | `0.15` | Weight for latency score |
| `ROUTING_WEIGHT_TOKEN_COST` | float | `0.07` | Weight for token efficiency |
| `ROUTING_WEIGHT_CONTEXT_OVERHEAD` | float | `0.03` | Weight for context overhead |
| `ROUTING_WEIGHT_PREFERENCE` | float | `0.10` | Weight for user's backend preference |
| `ROUTING_WEIGHT_PREDICTIVE` | float | `0.15` | Weight for predictive optimization |
| `ROUTING_BUDGET_TOKEN_LIMIT` | int | `8000` | Maximum token budget for routing |
| `ROUTING_PARALLEL_CONFIDENCE_THRESHOLD` | float | `0.05` | Minimum score gap to avoid parallel execution |
Tuning Routing Weights
The weights should sum to approximately 1.0 (minor deviations are acceptable). Adjust them based on your workload priorities:
| Priority | Recommended Tuning |
|---|---|
| Accuracy first | Increase `ROUTING_WEIGHT_SIMILARITY` to 0.7, decrease `ROUTING_WEIGHT_LATENCY` to 0.05 |
| Speed first | Increase `ROUTING_WEIGHT_LATENCY` to 0.3, decrease `ROUTING_WEIGHT_SIMILARITY` to 0.3 |
| Cost optimization | Increase `ROUTING_WEIGHT_TOKEN_COST` to 0.2, decrease `ROUTING_WEIGHT_PREDICTIVE` to 0.05 |
| Reliability first | Increase `ROUTING_WEIGHT_SUCCESS` to 0.35, decrease others proportionally |
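When experimenting with a profile, it can help to renormalize the candidate weights so they sum to exactly 1.0 before exporting them. This helper is an illustration, not part of Engram:

```python
# Illustrative helper (not Engram code): scale a ROUTING_WEIGHT_* profile
# so the weights sum to exactly 1.0 before exporting them as env vars.

def normalize_weights(weights: dict) -> dict:
    total = sum(weights.values())
    return {name: value / total for name, value in weights.items()}

# A "speed first" profile per the table above (predictive weight omitted).
speed_first = {
    "similarity": 0.30,
    "success": 0.20,
    "latency": 0.30,
    "token_cost": 0.07,
    "context_overhead": 0.03,
    "preference": 0.10,
}
normalized = normalize_weights(speed_first)
```

Export the normalized values as the corresponding `ROUTING_WEIGHT_*` environment variables to apply the profile.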
ML Configuration
| Variable | Type | Default | Description |
|---|---|---|---|
| `ML_ENABLED` | bool | `true` | Enable/disable ML-based mapping suggestions |
| `ML_MODEL_PATH` | str | `app/semantic/models/mapping_model.joblib` | Path to the trained mapping model |
| `ML_MIN_TRAIN_SAMPLES` | int | `20` | Minimum samples before training the ML model |
| `ML_AUTO_APPLY_THRESHOLD` | float | `0.85` | Confidence threshold for auto-applying ML suggestions |
| `ML_AUTO_RETRAIN_THRESHOLD` | int | `5` | Number of manual corrections before auto-retraining |
| `MAPPING_FAILURE_MAX_FIELDS` | int | `50` | Maximum fields to analyze on mapping failure |
| `MAPPING_FAILURE_PAYLOAD_MAX_KEYS` | int | `50` | Maximum payload keys to include in failure analysis |
Database Configuration
| Variable | Type | Default | Description |
|---|---|---|---|
| `DATABASE_URL` | str | Auto-generated | Full database connection URL (auto-built from Postgres settings) |
| `POSTGRES_SERVER` | str | `db` | PostgreSQL server hostname |
| `POSTGRES_USER` | str | `admin` | PostgreSQL username |
| `POSTGRES_PASSWORD` | str | `password` | PostgreSQL password |
| `POSTGRES_DB` | str | `translator_db` | Database name |
Important: The default `POSTGRES_SERVER` is `db`, which resolves inside Docker Compose. For local development, the smart fallback automatically switches to SQLite (`./engram.db`). For production, set `DATABASE_URL` explicitly to your managed PostgreSQL instance.
Database URL Processing
Engram performs automatic URL processing to ensure compatibility:
- `postgres://` → `postgresql+asyncpg://` (asyncpg compatibility)
- `sslmode=require` → `ssl=true` (asyncpg doesn't accept `sslmode`)
- Strips incompatible parameters: `channel_binding`, `sslrootcert`, `sslcert`, `sslkey`, `sslcrl`
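A minimal reimplementation of that processing, using only standard URL parsing — a sketch of the described behavior, not Engram's actual code path:

```python
# Sketch: rewrite a Postgres URL for asyncpg, per the rules above.
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

DROP_PARAMS = {"channel_binding", "sslrootcert", "sslcert", "sslkey", "sslcrl"}

def normalize_db_url(url: str) -> str:
    parts = urlsplit(url)
    scheme = parts.scheme
    if scheme in ("postgres", "postgresql"):
        scheme = "postgresql+asyncpg"
    query = []
    for key, value in parse_qsl(parts.query):
        if key in DROP_PARAMS:
            continue  # asyncpg rejects these outright
        if key == "sslmode":
            if value == "require":
                query.append(("ssl", "true"))  # asyncpg uses ssl=, not sslmode=
            continue
        query.append((key, value))
    return urlunsplit((scheme, parts.netloc, parts.path, urlencode(query), parts.fragment))
```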
Redis Configuration
| Variable | Type | Default | Description |
|---|---|---|---|
| `REDIS_ENABLED` | bool | `true` | Enable/disable Redis integration |
| `REDIS_HOST` | str | `redis` | Redis server hostname |
| `REDIS_PORT` | int | `6379` | Redis port |
| `REDIS_DB` | int | `0` | Redis database number |
| `REDIS_PASSWORD` | str | — | Redis password (optional) |
| `REDIS_URL` | str | Auto-generated | Full Redis connection URL |
| `REDIS_CONNECT_TIMEOUT_SECONDS` | float | `0.2` | Connection timeout |
| `REDIS_SOCKET_TIMEOUT_SECONDS` | float | `0.2` | Socket timeout |
| `SEMANTIC_CACHE_TTL_SECONDS` | int | `600` | Cache TTL for semantic operations (10 minutes) |
Note: Redis is optional for local development. When `REDIS_HOST` is `redis` (the Docker default) and no Docker environment is detected, Redis is auto-disabled. All Redis-dependent features gracefully degrade.
Event Stream Configuration
| Variable | Type | Default | Description |
|---|---|---|---|
| `EVENT_STREAM_KEY` | str | `engram:events` | Redis Stream key for events |
| `EVENT_STREAM_GROUP` | str | `engram-event-workers` | Consumer group name |
| `EVENT_STREAM_CONSUMER` | str | `worker-1` | Consumer name within the group |
| `EVENT_STREAM_BLOCK_MS` | int | `2000` | Read block timeout in milliseconds |
| `EVENT_STREAM_BATCH` | int | `25` | Number of events to read per batch |
| `EVENT_STREAM_MAXLEN` | int | `10000` | Maximum stream length (older events trimmed) |
| `EVENT_POLL_INTERVAL_SECONDS` | float | `10.0` | Fallback polling interval when Redis Streams unavailable |
Security Configuration
| Variable | Type | Default | Description |
|---|---|---|---|
| `AUTH_ISSUER` | str | `https://auth.example.com/` | JWT issuer claim |
| `AUTH_AUDIENCE` | str | `translator-middleware` | JWT audience claim |
| `AUTH_JWT_ALGORITHM` | str | `HS256` | JWT signing algorithm |
| `AUTH_JWT_SECRET` | str | — | JWT signing secret (required for production) |
| `AUTH_JWT_PUBLIC_KEY` | str | — | JWT public key (for RS256/ES256) |
| `ACCESS_TOKEN_EXPIRE_MINUTES` | int | `10080` | Session token lifetime (7 days) |
| `EAT_ACCESS_TOKEN_EXPIRE_MINUTES` | int | `15` | EAT access token lifetime (15 minutes) |
| `EAT_REFRESH_TOKEN_EXPIRE_MINUTES` | int | `10080` | EAT refresh token lifetime (7 days) |
| `PROVIDER_CREDENTIALS_ENCRYPTION_KEY` | str | — | Fernet key for encrypting stored provider credentials |
| `AUTH_FAIL_CLOSED` | bool | `true` | Deny access when Redis is down (fail-closed security) |
| `SEMANTIC_AUTH_FAIL_CLOSED` | bool | `true` | Deny access when semantic scope check fails |
| `HTTPS_ONLY` | bool | `false` | Force HTTPS redirect in production |
| `CORS_ORIGINS` | list | `["*"]` | Allowed CORS origins |
| `RATE_LIMIT_DEFAULT` | str | `100/minute` | Default API rate limit |
| `RATE_LIMIT_ENABLED` | bool | `true` | Enable/disable rate limiting |
| `SANDBOX_ENABLED` | bool | `true` | Enable sandbox mode for playground |
Ontology Paths
| Variable | Type | Default | Description |
|---|---|---|---|
| `DEFAULT_ONTOLOGY_PATH` | str | `app/semantic/protocols.owl` | Path to the protocol ontology |
| `SEMANTIC_SCOPE_ONTOLOGY_PATH` | str | `app/semantic/security.owl` | Path to the security scope ontology |
Task Queue Configuration
| Variable | Type | Default | Description |
|---|---|---|---|
| `TASK_POLL_INTERVAL_SECONDS` | float | `2.0` | How often the task worker checks for new tasks |
| `TASK_LEASE_SECONDS` | int | `60` | How long a task is leased to a worker before it's considered stale |
| `TASK_MAX_ATTEMPTS` | int | `5` | Maximum retry attempts for a failed task |
| `AGENT_MESSAGE_LEASE_SECONDS` | int | `60` | Lease duration for agent messages |
| `AGENT_MESSAGE_MAX_ATTEMPTS` | int | `5` | Maximum retry attempts for agent messages |
Workflow Scheduler Configuration
| Variable | Type | Default | Description |
|---|---|---|---|
| `WORKFLOW_SCHEDULER_POLL_SECONDS` | float | `5.0` | Polling interval for scheduled workflows |
| `WORKFLOW_SCHEDULER_BATCH_SIZE` | int | `20` | Number of workflows to process per batch |
Trading Templates
| Variable | Type | Default | Description |
|---|---|---|---|
| `TRADING_TEMPLATES_ENABLED` | bool | `true` | Enable trading template integrations |
| `BINANCE_API_KEY` | str | — | Binance exchange API key |
| `BINANCE_SECRET` | str | — | Binance exchange API secret |
| `COINBASE_API_KEY` | str | — | Coinbase API key |
| `COINBASE_SECRET` | str | — | Coinbase API secret |
| `KALSHI_API_KEY` | str | — | Kalshi prediction market API key |
| `KALSHI_SECRET` | str | — | Kalshi API secret |
| `ROBINHOOD_API_KEY` | str | — | Robinhood API key |
| `ROBINHOOD_SECRET` | str | — | Robinhood API secret |
| `STRIPE_SECRET_KEY` | str | — | Stripe secret key |
| `PAYPAL_CLIENT_ID` | str | — | PayPal client ID |
| `PAYPAL_SECRET` | str | — | PayPal client secret |
| `FRED_API_KEY` | str | — | Federal Reserve Economic Data API key |
| `REUTERS_APP_KEY` | str | — | Reuters data API key |
| `BLOOMBERG_SERVICE_ID` | str | — | Bloomberg terminal service ID |
Local LLM (Ollama)
| Variable | Type | Default | Description |
|---|---|---|---|
| `OLLAMA_BASE_URL` | str | `http://localhost:11434` | Ollama server base URL |
| `OLLAMA_MODEL` | str | `llama3.2` | Default Ollama model |
Miscellaneous
| Variable | Type | Default | Description |
|---|---|---|---|
| `SENTRY_DSN` | str | — | Sentry error tracking DSN |
| `PYTHON_INTERPRETER` | str | `python` | Python interpreter path for CLI tool execution |
Precedence & Smart Fallback
Configuration values are resolved in this order (highest priority first):
- Environment variables — `ANTHROPIC_API_KEY=sk-ant-...` in your shell or `.env` file
- `~/.engram/config.yaml` — loaded by the `@model_validator(mode="before")` hook in `Settings`
- Defaults — hardcoded defaults in the `Settings` class
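The three layers can be sketched as a simple lookup chain. This is a plain-dict illustration of the resolution order only; the real resolution happens inside the Pydantic `Settings` model:

```python
# Illustrative precedence chain: env var > ~/.engram/config.yaml > default.
import os

DEFAULTS = {"model_provider": "openai", "verbose": "false"}

def resolve(key: str, yaml_config: dict) -> str:
    env_value = os.environ.get(key.upper())   # 1. environment variable wins
    if env_value is not None:
        return env_value
    if key in yaml_config:                    # 2. ~/.engram/config.yaml
        return yaml_config[key]
    return DEFAULTS[key]                      # 3. hardcoded default
```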
Smart Fallback Logic
At startup, Engram detects whether it's running inside Docker/Kubernetes or locally:
# From app/core/config.py
if not os.path.exists("/.dockerenv") and not os.environ.get("KUBERNETES_PORT"):
    # Not in Docker or K8s — switch to local-friendly defaults
    if "db:5432" in DATABASE_URL or POSTGRES_SERVER == "db":
        DATABASE_URL = "sqlite+aiosqlite:///./engram.db"
    if REDIS_HOST == "redis":
        REDIS_ENABLED = False
This means you never need to install PostgreSQL or Redis for local development. The detection is automatic and zero-configuration.
| Detected Environment | Database | Redis |
|---|---|---|
| Docker (`/.dockerenv` exists) | PostgreSQL via `db:5432` | Redis via `redis:6379` |
| Kubernetes (`KUBERNETES_PORT` set) | PostgreSQL via configured URL | Redis via configured URL |
| Local development (neither) | SQLite (`./engram.db`) | Disabled (in-memory fallback) |
What's Next
- CLI Reference — Every command, flag, and output format
- Docker & Kubernetes Setup — Deploy with the right environment variables
- EAT Identity & Security — Configure authentication and authorization
- Observability & Tracing — Tune monitoring and alerting
Learning Path
This page helps you find the right documentation based on your experience level and goals. Engram has a lot of surface area — from basic tool registration to ML-driven self-evolving tools — so this guide ensures you don't waste time on sections you don't need yet.
Start Here
If you haven't installed Engram yet, begin with the Installation guide and then run through the Quickstart. Everything below assumes you have a working installation with at least one tool registered.
How to Use This Page
- Know your level? Jump to the experience-level table and follow the reading order for your tier.
- Have a specific goal? Skip to "By Use Case" and find the scenario that matches.
- Just browsing? Check the "Key Features" table for a quick overview of everything Engram can do.
By Experience Level
| Level | Goal | Recommended Reading | Time Estimate |
|---|---|---|---|
| Beginner | Install, register first tool, test routing | Installation → Quickstart → CLI Reference → Configuration | ~1 hour |
| Intermediate | Set up healing, connect multiple protocols, deploy with Docker | Universal Onboarding → Self-Healing Engine → Hybrid Routing → Docker Setup → Observability | ~2–3 hours |
| Advanced | Build custom adapters, extend the SDK, deploy at scale, contribute | Architecture → SDK & Python Library → Protocol Federation → Self-Evolving Tools → Contributing | ~4–6 hours |
By Use Case
"I want to connect my agents to APIs reliably"
Use Engram as the semantic middleware between your agents and any API, ensuring tools stay working as APIs evolve.
Reading order:
Tip: Register tools via OpenAPI specs for the fastest path. Engram auto-generates dual MCP and CLI representations and begins monitoring for schema drift immediately.
"I want intelligent routing between MCP and CLI"
Let Engram automatically choose the best execution backend for each task based on historical performance data.
Reading order:
- Quickstart
- Hybrid Routing
- Observability & Tracing
- Configuration (routing weights section)
"I want agents to communicate across protocols"
Bridge MCP, CLI, A2A, and ACP agents seamlessly with ontology-backed translation.
Reading order:
Tip: Protocol federation uses the OWL ontology as a canonical bridge between protocols. Payloads are normalized through semantic concepts, not brittle field mappings.
"I want to deploy Engram in production"
Run Engram at scale with Docker Compose or Kubernetes, full observability, and hardened security.
Reading order:
- Docker & Kubernetes Setup
- Configuration
- EAT Identity & Security
- Observability & Tracing
- Updating & Uninstalling
"I want to integrate Engram into my Python app"
Use the Engram SDK to programmatically register tools, translate payloads, manage tasks, and build agent workflows.
Reading order:
"I want to contribute to Engram"
Set up a development environment, understand the codebase structure, and submit your first PR.
Reading order:
- Installation (manual setup)
- Architecture
- Contributing
Key Features at a Glance
| Feature | What It Does | Docs Link |
|---|---|---|
| Universal Onboarding | Register any OpenAPI, GraphQL, CLI tool, or freeform docs as a dual MCP+CLI tool | Universal Onboarding |
| Self-Healing Engine | OWL ontologies + ML detect and fix schema drift, field mismatches, and output changes in real time | Self-Healing Engine |
| Hybrid MCP+CLI Routing | Performance-weighted routing chooses the best backend (MCP for structure, CLI for speed) per task | Hybrid Routing |
| Protocol Federation | Seamless translation and handoff between MCP, CLI, A2A, and ACP with multi-hop support | Protocol Federation |
| EAT Identity | Unified Engram Authorization Token with structured permissions and semantic scopes from the ontology | EAT Identity & Security |
| Bidirectional Sync | Event-driven synchronization across connected systems with semantic normalization and conflict resolution | Bidirectional Sync |
| Observability & Tracing | Rich semantic traces with routing reasoning, ontology alignment, healing steps, and LLM-generated summaries | Observability & Tracing |
| Self-Evolving Tools | ML continuously improves tool descriptions, parameter schemas, default values, and recovery strategies | Self-Evolving Tools |
| Swarm Memory | Persistent, ontology-aware fact store with Prolog reasoning and pyDatalog conflict resolution | Architecture |
| Delegation Engine | Native agent delegation and orchestration with natural-language intent detection and sub-task routing | Architecture |
| SDK & Python Library | Programmatic access to all Engram capabilities: auth, translation, task execution, tool registration | SDK & Python Library |
| Playground | Web-based sandbox UI for testing translations and exploring the tool catalog without authentication | Architecture |
What to Read Next
- Just finished installing? → Head to the Quickstart to register your first tool.
- Completed the Quickstart? → Read the CLI Reference and Configuration to customize your setup.
- Comfortable with the basics? → Explore Universal Onboarding, Self-Healing Engine, and Hybrid Routing to unlock the full power of the bridge.
- Setting up for production? → Read Docker & Kubernetes Setup and EAT Identity & Security.
- Ready to build? → Jump into the SDK & Python Library and Architecture to understand the internals.
- Want practical examples? → Check the `examples/` directory for tool registration scripts, SDK usage, and adapter patterns.
Tip: You don't need to read everything. Pick the path that matches your goal, follow the links in order, and you'll be productive quickly. You can always come back to this page to find your next step.
Universal Onboarding
Engram's universal onboarding system turns any API, CLI tool, or freeform documentation into a dual MCP+CLI tool definition — complete with semantic ontology alignment, field mapping, and immediate agent discoverability. This is the primary mechanism for expanding what your agents can do.
How Universal Onboarding Works
The onboarding pipeline follows a consistent flow regardless of source type:
Source Input → Schema Parsing → Ontology Alignment → Dual MCP+CLI Generation → Registry Storage → Agent Discoverability
- Source Input — The user provides a URL, file path, shell command, documentation text, or interactive wizard answers
- Schema Parsing — The system extracts endpoints, parameters, response types, and metadata from the source
- Ontology Alignment — Fields are mapped through `protocols.owl` to establish semantic equivalences across protocols
- Dual MCP+CLI Generation — Both a structured MCP tool definition and a CLI wrapper are generated from a single source
- Registry Storage — The tool is stored in the registry via `POST /api/v1/registry/ingest/*` or `POST /api/v1/registry/manual`
- Agent Discoverability — The tool is immediately available to all connected agents for routing and execution
OpenAPI Spec Ingestion
The most common onboarding path. Supports both Swagger 2.0 and OpenAPI 3.0+ specs.
# From URL
engram register openapi https://petstore.swagger.io/v2/swagger.json
# From local file
engram register openapi ./specs/my-api.yaml
What Happens
- Validation — The spec is fetched (URL) or read (file) and validated for structural correctness
- Endpoint Extraction — Each path/method combination becomes a tool action. Parameters are classified as path, query, header, or body.
- Response Schema Inference — Response schemas define the tool's output structure, used for downstream field mapping
- Semantic Tag Detection — The system infers semantic tags from endpoint names, descriptions, and parameter types (e.g., a `/messages` endpoint gets tagged as "Messaging")
- Dual Schema Generation — Both MCP and CLI representations are created simultaneously
Example: Registering a Weather API
engram register openapi https://api.weather.com/v1/openapi.json
⠋ Validating remote OpenAPI spec...
⠋ Generating dual MCP/CLI schemas...
⠋ Refining ontology mappings...
ℹ Info: 3 schema mismatches resolved via ontology alignment
╭──── [*] Registration Summary ──────────────────────╮
│ Successfully registered: Weather API │
│ ID: 8b4c3d2e-... │
│ Test Command: engram run --tool Weather API --inspect│
╰────────────────────────────────────────────────────╯
Partial Documentation Ingestion
When you don't have a formal spec but have documentation text, API descriptions, or even README fragments:
engram register openapi "The weather API has a GET /current endpoint that takes a city parameter as a query string and returns temperature in Celsius" --partial
The --partial flag activates LLM-powered schema extraction:
- The documentation text is sent to `POST /api/v1/registry/ingest/docs`
- The backend uses the configured LLM provider to parse the text into a structured tool definition
- Confidence scores are assigned to each extracted element (endpoint path, parameters, response type)
- Low-confidence extractions are flagged for manual review in the registration summary
This is particularly useful for:
- Internal APIs with informal documentation
- Third-party APIs that don't publish OpenAPI specs
- Rapid prototyping where you want to register a tool from a description
CLI Command Ingestion
Register any shell command as a semantically-wrapped tool:
engram register command docker
engram register command kubectl
engram register command git
engram register command ffmpeg
What Happens
- Shell Probing — The system checks that the command exists and is executable
- Help Text Parsing — Runs `<command> --help` and parses the output to discover subcommands, flags, and argument types
- Subcommand Discovery — For complex CLI tools (e.g., `docker`, `kubectl`), recursively discovers subcommands
- Argument Inference — Maps CLI flags to typed parameters (string, integer, boolean, array)
- Semantic Wrapper Synthesis — Generates both an MCP tool definition and a CLI execution wrapper
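The help-text parsing step can be approximated with a regex over the `--help` output. This is a simplified sketch of flag discovery only; Engram's real parser also infers argument types and recurses into subcommands:

```python
# Sketch: discover long options in a --help dump (illustrative only).
import re

HELP_TEXT = """
Usage: mytool [OPTIONS]
  --output FILE   Write results to FILE
  --verbose       Enable verbose output
  --retries N     Number of retry attempts
"""

def discover_flags(help_text: str) -> list:
    # Long options look like "--flag-name".
    return sorted(set(re.findall(r"--[a-z][a-z0-9-]*", help_text)))
```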
The resulting tool can be executed by agents through either backend:
- MCP backend — Structured JSON invocation with parameter validation
- CLI backend — Direct shell execution with argument assembly
Manual Interactive Registration
For full control over every field, use the interactive wizard:
engram register tool
The wizard walks you through:
Engram Manual Tool Registration
This interactive session will guide you through registering a tool without an OpenAPI spec.
Tool Name: Weather Checker
Description: Get current weather for a city
Base URL (e.g., https://api.weather.com): https://api.weather.com
Path (e.g., /v1/current): /v1/current
HTTP Method [GET/POST/PUT/DELETE] (GET): GET
Define Parameters (Press Enter on 'Parameter Name' to finish)
Parameter Name (leave blank to finish): city
Parameter Type [string/integer/boolean/number/array/object] (string): string
Parameter Description (Description for city): The city name to check weather for
Is required? [yes/no] (yes): yes
Parameter Name (leave blank to finish):
Prepared tool configuration for 'Weather Checker'
Endpoint: GET https://api.weather.com/v1/current
Parameters (1): city
After confirmation, the tool is registered via `POST /api/v1/registry/manual` with a synthetic OpenAPI schema generated from your inputs.
Dual MCP+CLI Schema Generation
Every registered tool gets both representations automatically:
| Aspect | MCP Schema | CLI Wrapper |
|---|---|---|
| Invocation | Structured JSON `{"name": "tool", "arguments": {...}}` | Shell command with flags |
| Validation | Pydantic model with type checking | Argument parsing with type coercion |
| Output | JSON response | Formatted text or JSON |
| Best for | Structured reliability, type safety | Speed, token efficiency, scripting |
This duality is what enables the routing engine to choose the optimal backend per task. The same tool can be executed via MCP for reliability-critical workflows or via CLI for speed-sensitive tasks.
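To make the duality concrete, here is a sketch of how one logical call might look in each representation. The flag-assembly convention (`--key value`, bare flag for booleans) and the `weather` tool are assumptions for illustration:

```python
# Sketch: the same logical invocation as MCP-style JSON and as CLI arguments.
import json

def to_cli_args(mcp_call: dict) -> list:
    args = [mcp_call["name"]]
    for key, value in mcp_call["arguments"].items():
        if value is True:
            args.append(f"--{key}")            # boolean flags carry no value
        else:
            args.extend([f"--{key}", str(value)])
    return args

mcp_call = json.loads('{"name": "weather", "arguments": {"city": "Paris", "metric": true}}')
```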
Ontology Alignment During Registration
When a tool is registered, its fields are mapped through the OWL ontology (protocols.owl):
- Field Flattening — Nested JSON structures are flattened to `dot.notation` paths
- Semantic Equivalence Detection — The `SemanticMapper` looks up each field in the ontology to find equivalent concepts across protocols
- Cross-Protocol Normalization — Fields like `city`, `location`, `place` are recognized as semantically equivalent through ontology concepts
- Initial Healing Baseline — The established mappings become the baseline for future drift detection
This is why Engram can automatically translate payloads between protocols — the ontology provides a shared vocabulary that bridges different naming conventions.
Pre-Optimized Popular Apps
Engram includes a catalog of pre-optimized tool definitions for popular services:
engram tools list --popular
The catalog includes warm-started definitions with:
- Validated schemas tested against live APIs
- Pre-computed semantic tags and ontology mappings
- High-confidence routing data (seeded at 99% success rate)
- Optimized CLI wrappers with efficient argument assembly
Note: Pre-optimized tools appear with a `>` marker in the tools list, while your custom-registered tools show `*`. Both are fully functional — pre-optimized tools just have a head start on the learning curve.
Adapter System
For deep integrations beyond simple API wrapping, Engram supports custom adapters:
# adapters/base.py pattern
from typing import Dict

class BaseAdapter:
    async def connect(self, credentials: Dict) -> bool: ...
    async def execute(self, action: str, params: Dict) -> Dict: ...
    async def health_check(self) -> bool: ...
Built-in adapters include:
- Mirofish — Multi-agent orchestration connector
- OpenClaw — AI tool marketplace integration
- Claude/Anthropic — Direct LLM access
- Perplexity — Search-augmented generation
- Slack — Messaging and channel management
To add a new adapter, implement the BaseAdapter interface and
register it in the connector registry. See the Contributing guide
for step-by-step instructions.
What's Next
- Self-Healing Engine — How registered tools stay working as APIs change
- Hybrid Routing — How the router chooses between MCP and CLI backends
- CLI Reference — Detailed command reference for all registration commands
Self-Healing Engine
Engram's self-healing engine is the core differentiator. It continuously monitors the semantic relationship between your registered tools and the actual APIs they connect to, automatically detecting and repairing schema drift, field mismatches, and output format changes — without human intervention for high-confidence fixes.
What Self-Healing Means
APIs change. Fields get renamed, response formats evolve, new parameters appear, old ones get deprecated. In traditional integration middleware, these changes cause silent failures or require manual updates. Engram's self-healing engine addresses this with three mechanisms:
- Schema Drift Detection — Continuous monitoring of tool execution results against the registered schema. When a field is missing, renamed, or returns an unexpected type, a "drift" is created.
- Automatic Field Remapping — The OWL ontology provides semantic equivalences that allow the engine to automatically map a renamed field (e.g., `city_name` → `location`) without manual configuration.
# Check current self-healing status
engram heal status
# Trigger manual repair loop
engram heal now
# Detailed view with payload excerpts
engram heal status --verbose
OWL Ontology Layer
The semantic foundation of self-healing is built on two OWL ontologies:
protocols.owl — Protocol Ontology
Located at app/semantic/protocols.owl, this ontology defines:
- Protocol concepts — MCP, CLI, A2A, ACP as formal ontology classes
- Field semantics — Concepts like `Location`, `Message`, `Timestamp` that exist across all protocols
- Equivalence relations — `city` ≡ `location` ≡ `place` — these are the same concept in different naming conventions
- Hierarchical relationships — `CityName` is a subclass of `Location`, which allows inheritance-based matching
security.owl — Security Ontology
Located at app/semantic/security.owl, this ontology defines:
- Permission concepts — What actions are allowed on what resources
- Semantic scopes — Ontology-derived capabilities (e.g., `execute:tool-invocation`)
- Access control relationships — How scopes map to tool capabilities
Both ontologies are loaded using rdflib and owlready2, providing SPARQL query support and OWL reasoning.
Semantic Mapper
The SemanticMapper class (app/semantic/semantic_mapper.py) is the engine that performs
field-level translation:
How It Works
- Field Flattening — Incoming payloads are flattened from nested JSON to `dot.notation` paths. For example, `{"user": {"name": "John"}}` becomes `user.name`.
- Ontology Lookup — Each field path is looked up in `protocols.owl` using `resolve_equivalent()`. This returns the semantically equivalent field name in the target protocol.
- Cross-Protocol Normalization — The `BidirectionalNormalizer` handles payload translation in both directions through the ontology bridge.
- Dynamic Rule Synthesis — For novel field mappings that don't exist in the ontology, the `DynamicRuleSynthesizer` uses the configured LLM to propose new mapping rules.
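The flattening step in miniature — an illustrative helper, not the `SemanticMapper` itself:

```python
# Sketch: flatten nested JSON into dot.notation paths.
def flatten(payload: dict, prefix: str = "") -> dict:
    flat = {}
    for key, value in payload.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))   # recurse into nested objects
        else:
            flat[path] = value
    return flat
```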
Example
When an MCP tool call returns `{"city_name": "San Francisco"}` but the registered schema expects `{"location": "San Francisco"}`:
- The execution result doesn't match the expected schema → drift detected
- The `SemanticMapper` looks up `city_name` in the ontology → finds it's equivalent to `location`
- A mapping `city_name → location` is proposed with 95% confidence
- Since confidence ≥ 70%, the mapping is auto-applied
- Future executions of this tool automatically translate the field
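The walkthrough above, condensed into a sketch: look the drifted field up in an equivalence table and remap it when confidence clears the 70% auto-apply threshold. The table and scores here are invented for illustration:

```python
# Sketch of confidence-gated field healing (not Engram's implementation).
EQUIVALENCES = {"city_name": ("location", 0.95)}
AUTO_APPLY_THRESHOLD = 0.70

def heal_payload(payload: dict, expected_fields: set) -> dict:
    healed = {}
    for field, value in payload.items():
        if field not in expected_fields and field in EQUIVALENCES:
            target, confidence = EQUIVALENCES[field]
            if confidence >= AUTO_APPLY_THRESHOLD:
                field = target            # auto-applied remap
        healed[field] = value
    return healed
```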
ML Mapping Model
In addition to ontology-based matching, Engram uses a scikit-learn classifier for ML-assisted mapping:
Training Pipeline
- Data Collection — Every successful field mapping is logged as a training sample
- Feature Extraction — Field names, types, nesting depth, and character n-grams are vectorized
- Model Training — A scikit-learn pipeline trains on the labeled mappings; the resulting model is stored at `app/semantic/models/mapping_model.joblib`
- Minimum Samples — Training requires at least `ML_MIN_TRAIN_SAMPLES` (default: 20) labeled examples
- Auto-Retraining — After `ML_AUTO_RETRAIN_THRESHOLD` (default: 5) manual corrections, the model automatically retrains
Confidence Scoring
Each ML-suggested mapping gets a confidence score:
| Score | Action |
|---|---|
| ≥ 85% (`ML_AUTO_APPLY_THRESHOLD`) | Auto-applied without human review |
| 70% – 84% | Auto-applied (ontology threshold) but flagged for review |
| < 70% | Queued as PENDING-REVIEW in the drift table |
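The same thresholds expressed as a function; the status labels here are descriptive placeholders, not necessarily Engram's exact strings:

```python
# Sketch: classify a suggested mapping by confidence, per the table above.
def classify_mapping(confidence: float,
                     ml_auto_apply: float = 0.85,
                     ontology_auto_apply: float = 0.70) -> str:
    if confidence >= ml_auto_apply:
        return "AUTO-APPLY"
    if confidence >= ontology_auto_apply:
        return "AUTO-APPLY-FLAGGED"   # applied, but surfaced for review
    return "PENDING-REVIEW"
```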
Reconciliation Engine
The reconciliation engine (`POST /api/v1/reconciliation/heal`) orchestrates the full healing cycle:
- Query drift database — Fetches all pending drifts (failed field mappings from recent executions)
- Score each drift — Combines ontology match, ML confidence, and historical correction data
- Apply auto-repairs — For drifts above the confidence threshold, updates the persistent mapping table
- Queue manual reviews — For low-confidence drifts, creates PENDING-REVIEW entries visible in `heal status`
- Update mapping versions — Each mapping has a version number that increments on update
CLI Commands
# View drift analysis and active mappings
engram heal status
# Same with full telemetry payload excerpts
engram heal status --verbose
# Check status and trigger repair in one command
engram heal status --fix
# Trigger manual repair immediately
engram heal now
Drift Table Output
Semantic Drift Analysis
╭──────────────────┬──────────────────┬──────────────────┬───────┬──────────────╮
│ Source Protocol │ Field Drift │ Ontology Match │ Conf. │ Status │
├──────────────────┼──────────────────┼──────────────────┼───────┼──────────────┤
│ MCP -> CLI │ city_name │ location │ 95.0% │ AUTO-REPAIR │
│ A2A -> MCP │ taskDescription │ (RESOLVE MANUAL) │ 45.0% │ PENDING-REV │
╰──────────────────┴──────────────────┴──────────────────┴───────┴──────────────╯
Dynamic Rule Synthesizer
For completely novel field mappings that exist neither in the ontology nor in the ML model's training
data, the DynamicRuleSynthesizer uses the configured LLM:
- Context assembly — The field name, parent object structure, surrounding fields, and recently successful mappings are bundled into a prompt
- LLM reasoning — The LLM proposes a mapping with justification
- Confidence calibration — The raw LLM confidence is adjusted based on structural similarity between source and target schemas
- Human review — LLM-generated rules always start in PENDING-REVIEW status, regardless of confidence
This ensures that the system can handle any mapping scenario while maintaining a human-in-the-loop for unprecedented cases.
Bidirectional Normalizer
The BidirectionalNormalizer handles payload translation in both
directions:
- Forward normalization — Source protocol → Canonical ontology form → Target protocol
- Reverse normalization — Target protocol → Canonical ontology form → Source protocol
This bidirectionality is critical for:
- Request translation — Converting an MCP tool call into a CLI invocation
- Response translation — Converting a CLI output back into structured MCP format
- Round-trip consistency — Ensuring that `translate(translate(x))` produces semantically equivalent output
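A toy illustration of the round-trip property: forward-normalize into canonical ontology concepts, then reverse into the target protocol's vocabulary. Note that the result is semantically equivalent rather than byte-identical to the input. Both mapping tables are invented:

```python
# Sketch: bidirectional normalization through a canonical vocabulary.
TO_CANONICAL = {"city_name": "Location", "msg": "Message"}
FROM_CANONICAL = {"Location": "city", "Message": "message"}

def forward(payload: dict) -> dict:
    # Source protocol → canonical ontology form
    return {TO_CANONICAL.get(k, k): v for k, v in payload.items()}

def reverse(payload: dict) -> dict:
    # Canonical ontology form → target protocol
    return {FROM_CANONICAL.get(k, k): v for k, v in payload.items()}
```

Here `city_name` round-trips to `city` — a different spelling of the same ontology concept, which is exactly what "semantically equivalent output" means.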
How Healing Decisions Are Traced
Every healing decision is recorded in the semantic trace system:
engram trace detail .
The Self-Healing Steps section of the trace tree shows:
- Which fields triggered drift detection
- What ontology concepts were consulted
- Whether the repair was ML-assisted, ontology-based, or LLM-synthesized
- The confidence score and whether it was auto-applied or manually reviewed
This integration means you can always audit why a particular field mapping was changed and when.
What's Next
- Hybrid Routing — How healed tools influence routing decisions
- Observability & Tracing — Monitor healing activity
- Configuration — Tune ML thresholds and ontology paths
MCP + CLI Hybrid Routing
Engram's routing engine automatically selects the best execution backend — MCP for structured reliability or CLI for speed and token efficiency — for every tool invocation. The decision is based on a multi-factor weighted composite score that combines semantic similarity, historical performance, latency, cost, and user preferences.
Why Hybrid Routing
The agent ecosystem has two dominant execution paradigms:
| Backend | Strengths | Weaknesses |
|---|---|---|
| MCP (Model Context Protocol) | Structured JSON schemas, type validation, rich error handling, standardized tool definitions | Higher token cost, more context overhead, slower for simple tasks |
| CLI (Command-Line Interface) | Fast execution, minimal token overhead, native shell integration, scriptable | Less structured output, fewer guarantees, harder to validate |
One size doesn't fit all. A "list files" command is faster via CLI. A "create a customer record" action is safer via MCP. Engram's router learns from execution history and makes the right choice automatically — no manual backend selection needed.
Routing Algorithm
The composite score for each tool/backend candidate is:
score = (similarity × w_similarity) +
(success_rate × w_success) +
(latency_score × w_latency) +
(token_efficiency × w_token_cost) +
(context_score × w_context) +
(preference × w_preference) +
(predictive × w_predictive)
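The weighted sum above can be sketched in a few lines of Python. This is an illustration, not the actual router code: factor names mirror the documented weight table, and all factor values are assumed to be pre-normalized to [0, 1].

```python
# Illustrative sketch of the composite scoring step (not Engram's actual
# implementation). Factor names mirror the documented routing weights.
DEFAULT_WEIGHTS = {
    "similarity": 0.55,
    "success_rate": 0.20,
    "latency_score": 0.15,
    "token_efficiency": 0.07,
    "context_score": 0.03,
    "preference": 0.10,
    "predictive": 0.15,
}

def composite_score(factors: dict[str, float],
                    weights: dict[str, float] = DEFAULT_WEIGHTS) -> float:
    """Weighted sum of routing factors for one tool/backend candidate."""
    return sum(factors.get(name, 0.0) * w for name, w in weights.items())

candidate = {
    "similarity": 0.9, "success_rate": 0.99, "latency_score": 0.8,
    "token_efficiency": 0.7, "context_score": 0.5, "preference": 1.0,
    "predictive": 0.6,
}
print(round(composite_score(candidate), 4))
```

Because the weights don't have to sum to 1.0, a raw composite score can exceed 1.0; the router normalizes final scores before comparing candidates.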
Default Weights
| Factor | Weight | Variable | Purpose |
|---|---|---|---|
| Semantic Similarity | 0.55 | `ROUTING_WEIGHT_SIMILARITY` | How well the task description matches the tool |
| Success Rate | 0.20 | `ROUTING_WEIGHT_SUCCESS` | Historical success percentage |
| Latency | 0.15 | `ROUTING_WEIGHT_LATENCY` | Average execution time (lower is better) |
| Token Cost | 0.07 | `ROUTING_WEIGHT_TOKEN_COST` | Token consumption efficiency |
| Context Overhead | 0.03 | `ROUTING_WEIGHT_CONTEXT_OVERHEAD` | Prompt engineering overhead |
| User Preference | 0.10 | `ROUTING_WEIGHT_PREFERENCE` | Backend preference from config |
| Predictive | 0.15 | `ROUTING_WEIGHT_PREDICTIVE` | Forward-looking optimization |
Note: Weights don't need to sum to exactly 1.0. The router normalizes the final scores. Adjust them in your `.env` or `~/.engram/config.yaml` based on your workload priorities.
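For example, the weight variables can be pinned explicitly in `.env` (variable names are taken from the weight table above; the values shown are the documented defaults):

```
ROUTING_WEIGHT_SIMILARITY=0.55
ROUTING_WEIGHT_SUCCESS=0.20
ROUTING_WEIGHT_LATENCY=0.15
ROUTING_WEIGHT_TOKEN_COST=0.07
ROUTING_WEIGHT_CONTEXT_OVERHEAD=0.03
ROUTING_WEIGHT_PREFERENCE=0.10
ROUTING_WEIGHT_PREDICTIVE=0.15
```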
Sentence Embedding Matching
The semantic similarity factor uses sentence-transformers with the
all-MiniLM-L6-v2 model:
- Task description embedding — The user's natural-language description (e.g., "send a message to the team") is embedded into a 384-dimensional vector
- Tool description embedding — Each registered tool's description is pre-embedded and cached
- Cosine similarity — The router computes cosine similarity between the task and each candidate tool
- Score normalization — Similarity scores are mapped to [0, 1] for composite scoring
The embedding model is loaded lazily on first use and cached in memory. Change it with ROUTING_EMBEDDING_MODEL if you need a different model (e.g., a
multilingual variant).
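The cosine-similarity and normalization steps can be sketched in plain Python. In the real router the 384-dimensional vectors come from `all-MiniLM-L6-v2`; the toy 3-dimensional vectors below are stand-ins, and the linear `(sim + 1) / 2` mapping to [0, 1] is an assumption — the docs don't specify which normalization is used.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def to_unit_interval(sim: float) -> float:
    """Map cosine similarity from [-1, 1] to [0, 1] for composite scoring
    (one plausible normalization; the actual mapping is unspecified)."""
    return (sim + 1.0) / 2.0

task_vec = [0.2, 0.7, 0.1]     # stand-in for the task-description embedding
tool_vec = [0.25, 0.65, 0.05]  # stand-in for a cached tool embedding
print(to_unit_interval(cosine_similarity(task_vec, tool_vec)))
```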
Historical Performance Data
Every tool execution is tracked and feeds back into routing decisions:
| Metric | Description | CLI Command |
|---|---|---|
| Success Rate | Percentage of executions that completed without error | engram route list |
| Average Latency | Mean execution time in milliseconds | engram route list |
| Average Token Cost | Mean token consumption per execution | engram route list |
| Sample Count | Number of executions in the stats window | engram route list |
The rolling window (ROUTING_STATS_WINDOW_HOURS, default: 168 hours /
7 days) determines how far back the stats look. Older executions are excluded, so the router adapts
to recent performance changes.
engram route list
Global Tool Performance Stats
╔═════════════════╦════════════╦════════════╦═══════════╦══════════╦═════════╗
║ Tool Name ║ Backend ║ Avg Latency║ Success ║ Avg Cost ║ Samples ║
╠═════════════════╬════════════╬════════════╬═══════════╬══════════╬═════════╣
║ Slack ║ MCP ║ 245ms ║ 99.0% ║ 12.5 tok ║ 147 ║
║ Slack ║ CLI ║ 120ms ║ 95.2% ║ 4.2 tok ║ 38 ║
║ docker ║ CLI ║ 85ms ║ 98.5% ║ 2.1 tok ║ 203 ║
║ Petstore API ║ MCP ║ 310ms ║ 100.0% ║ 18.3 tok ║ 12 ║
╚═════════════════╩════════════╩════════════╩═══════════╩══════════╩═════════╝
Semantic Caching
Routing decisions are cached in Redis to avoid recomputing embeddings and scores for identical queries:
| Setting | Default | Purpose |
|---|---|---|
| `ROUTING_CACHE_TTL_SECONDS` | 60 | How long a routing decision is cached |
Cache invalidation happens automatically when:
- A new tool is registered
- A tool's performance stats change significantly (> 10% success rate delta)
- The tool registry is modified (description update, schema change)
- Manual cache flush via the admin API
When Redis is unavailable (local dev), routing decisions are not cached and are computed fresh each time.
Parallel Confidence Threshold
When the gap between the top two candidates is below ROUTING_PARALLEL_CONFIDENCE_THRESHOLD (default: 0.05), the router
considers running both backends simultaneously:
if top_score - second_score < ROUTING_PARALLEL_CONFIDENCE_THRESHOLD:
# Scores are too close — consider parallel execution
This is useful for reliability-critical workflows where you want to compare results from both backends and select the one that completes first or produces the best output.
Predictive Optimization
The predictive factor looks forward based on:
- Tool evolution trends — If a tool's ML improvement proposals suggest it's about to get better, the predictive score increases
- Failure pattern predictions — If recent executions show an increasing error rate, the predictive score decreases
- Upcoming maintenance — If the tool's source API has scheduled maintenance windows, the score adjusts
Forcing a Backend
Override the router for debugging and testing:
# Force MCP
engram route test "send notification" --force-mcp
# Force CLI
engram route test "list docker containers" --force-cli
When a backend is forced, the router bypasses scoring and directly selects the specified backend. All other aspects (tool selection, semantic matching) remain unchanged.
Context-Aware Pruning
The ROUTING_BUDGET_TOKEN_LIMIT (default: 8000) controls the maximum
token budget for a single routing decision:
- All candidate tools are ranked by composite score
- Each tool's estimated token cost is subtracted from the budget
- Tools that would exceed the budget are pruned from the candidate set
- The remaining top-scoring tool is selected
This prevents expensive tool chains from consuming more tokens than the user's budget allows.
CLI Commands
Test Routing
engram route test "deploy to production"
Shows the optimal routing decision (chosen tool, backend, confidence, latency, cost, reasoning) and an alternatives table.
List Performance Stats
engram route list
Shows all tools with historical performance data aggregated across the rolling window.
What's Next
- Self-Healing Engine — How healed tools feed back into routing
- Observability & Tracing — Trace routing decisions
- Configuration — Tune routing weights for your workload
Protocol Federation
Engram bridges four agent communication protocols — MCP, CLI, A2A, and ACP — through a single semantic ontology layer. This enables seamless cross-protocol translation, multi-agent handoffs, and intent-based routing without brittle point-to-point integrations.
Supported Protocols
| Protocol | Full Name | Use Case |
|---|---|---|
| MCP | Model Context Protocol | Structured tool invocations with JSON schemas |
| CLI | Command-Line Interface | Shell command execution and output parsing |
| A2A | Agent-to-Agent | Inter-agent communication and task delegation |
| ACP | Agent Communication Protocol | Standardized agent messaging framework |
Each protocol has a dedicated connector in the orchestrator that translates to and from the canonical ontology form.
Translation Architecture
Every cross-protocol translation follows a three-stage pipeline:
Source Protocol → Canonical Bridge (OWL Ontology) → Target Protocol
- Source normalization — The connector for the source protocol extracts semantic meaning from the payload and maps it to ontology concepts
- Canonical representation — The payload exists as a protocol-neutral, ontology-backed intermediary form
- Target denormalization — The connector for the target protocol translates from ontology concepts to the target's naming conventions
Example: MCP → CLI Translation
// Source (MCP)
{"name": "get_weather", "arguments": {"city": "San Francisco", "units": "imperial"}}
// Canonical Bridge (Ontology)
{"concept": "WeatherQuery", "location": "San Francisco", "measurement_system": "imperial"}
// Target (CLI)
{"command": "weather", "flags": ["--city", "San Francisco", "--units", "imperial"]}
CLI Command
# Translate with demo payload
engram protocol translate --from mcp --to cli
# Translate with custom payload
engram protocol translate --from a2a --to mcp --payload '{"task": "search", "query": "AI news"}'
# Translate from file
engram protocol translate --from cli --to a2a --payload ./request.json
The Orchestrator
The Orchestrator class (app/services/orchestrator.py) is the central coordinator for protocol
operations:
handoff_async()
The primary method for cross-protocol execution:
- Protocol detection — Determines the source and target protocols from the request
- Connector dispatch — Routes to the appropriate protocol connector
- Canonical translation — Normalizes through the ontology bridge
- Execution — Invokes the target protocol's execution path
- Proof generation — Creates a verifiable execution proof with trace data
Connector Registry
Each protocol has a registered connector that implements:
| Method | Purpose |
|---|---|
| `to_canonical(payload)` | Convert protocol-specific payload to ontology form |
| `from_canonical(canonical)` | Convert ontology form to protocol-specific payload |
| `execute(payload)` | Execute the translated payload |
| `health_check()` | Verify connector availability |
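The connector contract can be expressed as an abstract base class. This is a sketch of the shape of the interface — the actual class names and signatures inside the orchestrator may differ:

```python
from abc import ABC, abstractmethod
from typing import Any

class ProtocolConnector(ABC):
    """Sketch of the contract each protocol connector implements."""

    @abstractmethod
    def to_canonical(self, payload: dict[str, Any]) -> dict[str, Any]:
        """Convert a protocol-specific payload to ontology form."""

    @abstractmethod
    def from_canonical(self, canonical: dict[str, Any]) -> dict[str, Any]:
        """Convert ontology form back to a protocol-specific payload."""

    @abstractmethod
    def execute(self, payload: dict[str, Any]) -> dict[str, Any]:
        """Execute the translated payload."""

    @abstractmethod
    def health_check(self) -> bool:
        """Verify connector availability."""

class EchoConnector(ProtocolConnector):
    """Trivial connector showing the round-trip property of the contract."""
    def to_canonical(self, payload):
        return {"concept": "Echo", **payload}
    def from_canonical(self, canonical):
        return {k: v for k, v in canonical.items() if k != "concept"}
    def execute(self, payload):
        return {"ok": True, "echo": payload}
    def health_check(self):
        return True
```

A well-behaved connector should satisfy `from_canonical(to_canonical(x)) == x`, which is what makes the round-trip consistency guarantee possible.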
Protocol Connectors
MCP Connector
Translates between MCP tool call format and the canonical ontology form. Handles:
- Tool name resolution
- Argument schema validation
- Response structure mapping
- Error code translation
CLI Connector
Translates between shell commands and the canonical ontology form. Handles:
- Command assembly from arguments
- Flag formatting (short `-f` vs long `--flag`)
- Output parsing (JSON, table, plain text)
- Exit code interpretation
A2A Connector
Translates between Agent-to-Agent messages and the canonical ontology form. Handles:
- Task delegation format
- Agent capability discovery
- State transfer between agents
- Acknowledgment protocol
MiroFish Connector
A specialized connector for the MiroFish multi-agent orchestration framework. Handles:
- Swarm-level task distribution
- Multi-agent consensus
- Hierarchical delegation
Multi-Hop Handoffs
When a request traverses more than two protocols (e.g., A2A → MCP → CLI), each hop goes through the ontology bridge:
A2A → Canonical → MCP → Canonical → CLI
Intermediate normalization ensures:
- No information loss between hops
- Semantic consistency across all three protocols
- Complete trace lineage for debugging
Session Handoff Simulation
Test multi-agent handoffs without committing real resources:
engram protocol handoff simulate --source-agent CLI-Local --target-agent Remote-MCP
The simulation:
- Creates a temporary session with a unique session ID
- Evaluates semantic readiness (can the target agent handle this protocol?)
- Lists bridged protocols (what translations are needed)
- Transfers state through Redis-backed persistence
- Reports success/failure with full state dump
Output
╭──── [*] Multi-Agent Federation Detail ────────────────────╮
│ 🤝 Handoff Simulation: CLI-Local -> Remote-MCP │
│ ├── Session ID: 9a7b3c1d-... │
│ ├── Semantic Readiness: READY │
│ ├── Bridged Protocols │
│ │ ├── CLI │
│ │ └── MCP │
│ └── Transferred State (Redis-backed) │
│ ├── Context │
│ │ └── {"task_history": [...], "active_tools": [...]} │
│ ├── Artifacts │
│ │ └── {"files": [], "data": {}} │
│ └── Semantic │
│ └── {"ontology_version": "1.0", "mappings": {...}} │
╰───────────────────────────────────────────────────────────╯
Intent Resolution
The IntentResolver class handles natural-language to structured
protocol mapping:
- Input — A free-form natural language request (e.g., "send a notification about the deployment")
- Intent classification — The LLM classifies the intent category (messaging, deployment, data query, etc.)
- Protocol selection — Based on the intent, the resolver determines the best target protocol
- Structured translation — The natural language is translated into a protocol-specific structured payload
This is what enables users to submit tasks in plain English through the TUI command input, and have them automatically routed to the right protocol and tool.
Execution Proofs
Every cross-protocol translation generates a verifiable execution proof containing:
| Field | Description |
|---|---|
| `trace_id` | Unique identifier for the translation |
| `source_protocol` | Origin protocol |
| `target_protocol` | Destination protocol |
| `canonical_form` | The intermediate ontology representation |
| `field_mappings` | All field translations that occurred |
| `ontology_version` | Version of the ontology used |
| `timestamp` | When the translation was performed |
| `success` | Whether the translation succeeded |
These proofs are stored in the trace system and can be queried via engram trace detail.
Delegation Engine
The DelegationEngine (delegation/engine.py) orchestrates agent-to-agent task delegation:
How It Works
- Natural-language intent parsing — The user's task is analyzed for subtask decomposition
- Agent capability matching — Available agents are evaluated for their ability to handle each subtask
- Subtask routing — Each subtask is assigned to the best-fit agent
- Execution coordination — Agents execute their subtasks with progress reporting to the TUI
- Result aggregation — Subtask results are combined into a unified response
Swarm Memory Integration
The delegation engine uses Swarm Memory (bridge/memory.py) for:
- Fact persistence — Subtask context is stored as semantic facts
- Conflict resolution — When multiple agents produce conflicting results, Prolog-based reasoning resolves the conflict
- Ontology-backed normalization — Cross-agent facts are normalized through the shared ontology
CLI Commands
# Translate between protocols
engram protocol translate --from mcp --to cli
engram protocol translate --from a2a --to mcp --payload '{"task": "search"}'
# Simulate multi-agent handoff
engram protocol handoff simulate
engram protocol handoff simulate --source-agent CLI-Local --target-agent Remote-MCP
What's Next
- EAT Identity & Security — How tokens are scoped for cross-protocol access
- Bidirectional Sync — Event-driven sync across protocols
- Architecture — System-level view of the orchestrator
EAT Identity & Security
Engram uses a unified token system called EAT (Engram Authorization Token) that carries both structured permissions per tool and semantic scopes derived from the OWL ontology. This page covers authentication, authorization, token lifecycle, credential storage, and the security middleware stack.
What is EAT?
An EAT is a JWT-based token that carries three types of authorization data:
- Identity (`sub` claim) — Who you are (email or user ID)
- Structured Permissions (`scopes` claim) — Per-tool permissions as a nested object
- Semantic Scopes (`semantic_scopes` claim) — Ontology-derived capabilities from `security.owl`
Token Structure
{
"sub": "user@company.com",
"jti": "unique-token-id",
"exp": 1712345678,
"scopes": {
"slack": ["send_message", "list_channels"],
"docker": ["run", "ps", "images"]
},
"semantic_scopes": [
"execute:tool-invocation",
"read:ontology-metadata"
]
}
| Claim | Type | Description |
|---|---|---|
| `sub` | str | User identity (email or UUID) |
| `jti` | str | Unique token ID for revocation tracking |
| `exp` | int | Expiration timestamp (Unix epoch) |
| `scopes` | dict | Per-tool permissions: `{"tool_name": ["action1", "action2"]}` |
| `semantic_scopes` | list | Ontology-based capabilities: `["execute:tool-invocation"]` |
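To make the claim layout concrete, here is a self-contained sketch that signs and inspects a token with the same claim shape using only the standard library. The real EAT is issued by `EATIdentityService`; the HS256 construction and the demo secret below are illustrative:

```python
import base64, hashlib, hmac, json, time

def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign_eat(claims: dict, secret: bytes) -> str:
    """Assemble a compact JWS (header.payload.signature) over the claims."""
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url(json.dumps(claims).encode())
    signing_input = f"{header}.{payload}".encode()
    sig = b64url(hmac.new(secret, signing_input, hashlib.sha256).digest())
    return f"{header}.{payload}.{sig}"

claims = {
    "sub": "user@company.com",
    "jti": "unique-token-id",
    "exp": int(time.time()) + 900,  # 15-minute EAT lifetime
    "scopes": {"slack": ["send_message"]},
    "semantic_scopes": ["execute:tool-invocation"],
}
token = sign_eat(claims, secret=b"demo-secret")

# Read the claims back from the payload segment (signature check omitted)
seg = token.split(".")[1]
decoded = json.loads(base64.urlsafe_b64decode(seg + "=" * (-len(seg) % 4)))
print(decoded["scopes"]["slack"])
```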
Authentication Flow
Step 1: Sign Up or Log In
# Via CLI
engram auth login
# Via TUI
engram run --debug # Uses inline login form
The flow:
- Signup — `POST /auth/signup` with email and password → Creates user record
- Login — `POST /auth/login` with email and password → Returns a session `access_token`
- EAT Generation — `POST /auth/tokens/generate-eat` with the session token → Returns the EAT
Step 2: Token Storage
EAT tokens are stored securely using a priority chain:
| Priority | Method | Where | Security |
|---|---|---|---|
| 1 | System keyring | OS credential store | Highest (OS-managed encryption) |
| 2 | Config fallback | `~/.engram/config.yaml` | Medium (file permissions) |
| 3 | TUI encrypted | `~/.engram/config.enc` | High (Fernet symmetric encryption) |
The CLI uses the keyring library to store tokens in:
- macOS — Keychain
- Windows — Credential Locker
- Linux — Secret Service (GNOME Keyring / KWallet)
If the system keyring is unavailable, tokens fall back to the config.yaml file.
Token Lifecycle
EATIdentityService
The EATIdentityService (app/services/eat_identity.py) manages the full token lifecycle:
Issue
result = EATIdentityService.issue_token(
db=session,
user_id="user-uuid",
permissions={"slack": ["send_message"]},
semantic_scopes=["execute:tool-invocation"],
)
# Returns: EATIssueResult(token=..., refresh_token=..., expires_at=..., jti=...)
Refresh
result = EATIdentityService.refresh_token(
db=session,
refresh_token="refresh-uuid",
permissions={"slack": ["send_message"]},
)
Refresh tokens are:
- Stored in Redis with the hash of the token as the key
- Single-use (consumed on refresh, a new refresh token is issued)
- Expired after `EAT_REFRESH_TOKEN_EXPIRE_MINUTES` (default: 7 days)
Revoke
EATIdentityService.revoke_eat(
db=session,
user_id="user-uuid",
token="eat-jwt",
jti="token-jti",
expires_in=900,
refresh_token="refresh-uuid",
)
Revocation:
- Adds the JTI to the Redis deny list (checked on every request)
- Deletes the refresh token from Redis
- Creates an audit log entry with event type `REVOKED`
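The deny-list check can be illustrated with a minimal in-memory stand-in for the Redis keys. In production the deny list lives in Redis with TTL-based expiry; this toy class only mimics the semantics, and its method names are hypothetical:

```python
import time

class DenyList:
    """In-memory stand-in for the Redis JTI deny list (TTL semantics)."""
    def __init__(self):
        self._entries: dict[str, float] = {}  # jti -> expiry epoch

    def revoke(self, jti: str, expires_in: int) -> None:
        # Keep the entry only as long as the token could still be valid;
        # after that, the token's own exp claim rejects it anyway.
        self._entries[jti] = time.time() + expires_in

    def is_revoked(self, jti: str) -> bool:
        expiry = self._entries.get(jti)
        return expiry is not None and expiry > time.time()

deny = DenyList()
deny.revoke("token-jti", expires_in=900)
print(deny.is_revoked("token-jti"))   # True
print(deny.is_revoked("other-jti"))   # False
```

The TTL equal to the token's remaining lifetime is what keeps the deny list small: expired tokens don't need deny-list entries at all.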
Token Expiration
| Token Type | Default Lifetime | Configuration |
|---|---|---|
| Session token | 7 days | ACCESS_TOKEN_EXPIRE_MINUTES |
| EAT access token | 15 minutes | EAT_ACCESS_TOKEN_EXPIRE_MINUTES |
| EAT refresh token | 7 days | EAT_REFRESH_TOKEN_EXPIRE_MINUTES |
The EAT access token is intentionally short-lived (15 minutes by default) because it carries permissions. The refresh token allows silent renewal without re-authentication.
Semantic Scopes
Semantic scopes are derived from security.owl and provide
ontology-backed access control:
| Scope | Ontology Context | Capability |
|---|---|---|
| `execute:tool-invocation` | Global | Can invoke cross-protocol tool translations |
| `read:ontology-metadata` | Global | Can query ontology metadata and tool catalogs |
| `write:tool-registry` | Global | Can register and modify tools |
| `admin:system` | Global | Full administrative access |
Fail-Closed Semantics
| Setting | Default | Behavior |
|---|---|---|
| `AUTH_FAIL_CLOSED` | `true` | When Redis is down and the JTI deny list can't be checked, deny access |
| `SEMANTIC_AUTH_FAIL_CLOSED` | `true` | When semantic scope verification fails, deny access |
This fail-closed design ensures that security checks never silently pass due to infrastructure failures.
Viewing Scopes
# Full identity tree with scopes
engram auth whoami
# Tabular scope view with ontology context
engram auth scope
Token Audit Trail
Every token event is logged in the TokenAuditLog database table:
| Event Type | When |
|---|---|
| `ISSUED` | New EAT is generated |
| `REFRESHED` | EAT is renewed via refresh token |
| `REVOKED` | EAT is explicitly revoked |
Each audit record captures:
- User ID, token type, JTI
- Token hash (SHA-256, not the actual token)
- Issued and expiration timestamps
- Current scopes and semantic scopes
- Additional metadata
Credential Storage
Provider credentials (API keys for Claude, Slack, Perplexity, etc.) are stored separately from EAT tokens:
CredentialService
The CredentialService (app/services/credentials.py) encrypts and manages provider
credentials:
await CredentialService.save_credential(
db=session,
user_id=user_uuid,
provider_name="claude",
token="sk-ant-...",
credential_type=CredentialType.API_KEY,
)
| Feature | Detail |
|---|---|
| Encryption | Fernet symmetric encryption via CryptoService |
| Encryption key | `PROVIDER_CREDENTIALS_ENCRYPTION_KEY` environment variable |
| Auto-refresh | OAuth tokens are automatically refreshed on expiration |
| Per-user isolation | Each user has their own credential set |
Supported Auth Types
| Type | Examples | Storage |
|---|---|---|
| `api_key` | Anthropic, OpenAI, Perplexity | Encrypted API key |
| `oauth` | Slack, Google | Encrypted access + refresh tokens |
TUI Vault Service
The TUI uses a separate vault (tui/vault_service.py) for credential
storage, backed by Fernet-encrypted files at ~/.engram/config.enc.
This is designed for environments where the system keyring isn't available.
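The encrypted-file pattern can be sketched with the `cryptography` package's Fernet recipe. Key generation and handling are simplified here — the real vault derives and stores its key differently:

```python
from cryptography.fernet import Fernet

# Generate a key once and keep it somewhere safe; Fernet is symmetric,
# so the same key both encrypts and decrypts.
key = Fernet.generate_key()
vault = Fernet(key)

secret = b'{"eat_token": "..."}'
ciphertext = vault.encrypt(secret)     # what would be written to config.enc
plaintext = vault.decrypt(ciphertext)  # what the TUI reads back
assert plaintext == secret
```

Fernet authenticates as well as encrypts, so a tampered `config.enc` fails to decrypt rather than yielding garbage.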
Security Headers
The following security headers are injected on every response:
| Header | Value | Purpose |
|---|---|---|
| `X-Content-Type-Options` | `nosniff` | Prevent MIME type sniffing |
| `X-Frame-Options` | `DENY` | Prevent clickjacking |
| `X-XSS-Protection` | `1; mode=block` | Enable browser XSS filter |
| `Strict-Transport-Security` | `max-age=31536000` | Force HTTPS (when `HTTPS_ONLY=true`) |
| `Content-Security-Policy` | `default-src 'self'` | Restrict resource loading |
Rate Limiting
API rate limiting is powered by slowapi:
| Setting | Default | Description |
|---|---|---|
| `RATE_LIMIT_DEFAULT` | `100/minute` | Default limit per IP |
| `RATE_LIMIT_ENABLED` | `true` | Toggle rate limiting |
Rate limit headers are returned in every response:
- `X-RateLimit-Limit` — Maximum requests per window
- `X-RateLimit-Remaining` — Remaining requests in current window
- `X-RateLimit-Reset` — Window reset timestamp
Security Middleware Stack
The FastAPI middleware pipeline processes requests in this order:
- CORS — Cross-origin request handling (`CORS_ORIGINS`)
- HTTPS Redirect — Forces HTTPS when `HTTPS_ONLY=true`
- Security Headers — Injects all security headers
- Rate Limiting — Enforces per-IP rate limits
- JWT Validation — Verifies EAT token signature and claims
- Semantic Scope Check — Validates semantic scopes against the requested operation
- Prometheus Instrumentation — Records request metrics
CLI Commands
# Authenticate
engram auth login
engram auth login --token <eat-token>
# View identity and permissions
engram auth whoami
engram auth scope
engram auth status
# Manually set token
engram auth token-set <token>
What's Next
- Configuration — Configure security settings
- Architecture — System-level security architecture
- SDK & Python Library — Programmatic authentication
SDK & Python Library
The Engram SDK provides programmatic access to all Engram capabilities — authentication, tool registration, translation, task execution, and agent management. Use it to integrate Engram into your Python applications or build custom agent workflows.
Installation
pip install engram-sdk
Or import directly from the engram_sdk/ package if you're developing
within the monorepo:
from engram_sdk.client import EngramSDK
Quick Start
from engram_sdk.client import EngramSDK
# Initialize and connect
sdk = EngramSDK(
base_url="http://127.0.0.1:8000",
email="user@company.com",
password="your-password"
)
# Authenticate
sdk.connect()
sdk.login()
eat = sdk.generate_eat()
print(f"Authenticated with EAT: {eat[:20]}...")
# Register a tool
from engram_sdk.client import ToolDefinition, ToolAction
tool = ToolDefinition(
name="Weather Checker",
description="Get current weather for any city",
actions=[
ToolAction(
name="get_current",
description="Get current weather",
parameters={"city": {"type": "string", "required": True}},
endpoint="/v1/current",
method="GET"
)
]
)
result = sdk.register_tool(tool)
print(f"Registered: {result}")
# Translate between protocols
translation = sdk.translate(
payload={"name": "get_weather", "arguments": {"city": "London"}},
source_protocol="mcp",
target_protocol="cli"
)
print(f"Translated: {translation}")
Authentication
The AuthClient (via engram_sdk/auth.py) handles the full authentication lifecycle:
Login
sdk.login() # Uses email/password from initialization
Signup
sdk.signup() # Creates a new account, then logs in
EAT Generation
eat = sdk.generate_eat()
Token Refresh
sdk.refresh_eat() # Automatically refreshes if expired
Auto-Retry on Expiration
The EngramTransport layer automatically detects 401 responses,
refreshes the EAT token, and retries the request. No manual token management needed in most cases.
Tool Registration
Single Tool
from engram_sdk.client import ToolDefinition, ToolAction
tool = ToolDefinition(
name="My API",
description="My custom API integration",
actions=[
ToolAction(
name="create_item",
description="Create a new item",
parameters={
"name": {"type": "string", "required": True},
"category": {"type": "string", "required": False}
},
endpoint="/items",
method="POST"
)
]
)
result = sdk.register_tool(tool)
Batch Registration
tools = [tool1, tool2, tool3]
results = sdk.register_tools(tools)
ToolDefinition Dataclass
| Field | Type | Description |
|---|---|---|
| `name` | str | Human-readable tool name |
| `description` | str | What the tool does |
| `actions` | List[ToolAction] | Available actions/endpoints |
| `tags` | List[str] | Semantic tags for discovery |
| `base_url` | str | Optional base URL override |
ToolAction Dataclass
| Field | Type | Description |
|---|---|---|
| `name` | str | Action name |
| `description` | str | What this action does |
| `parameters` | Dict | Parameter definitions with types |
| `endpoint` | str | API endpoint path |
| `method` | str | HTTP method (GET, POST, etc.) |
Agent Registration
sdk.register_agent(
agent_id="my-agent-001",
endpoint_url="http://my-agent:5000/webhook",
supported_protocols=["mcp", "a2a"],
capabilities=["messaging", "data_processing"],
tags=["production", "v2"]
)
Translation
# Protocol-to-protocol translation
result = sdk.translate(
payload={"name": "send_message", "arguments": {"text": "Hello"}},
source_protocol="mcp",
target_protocol="a2a"
)
print(result.translated_payload)
print(result.canonical_bridge)
print(result.field_mappings)
TranslationResponse
| Field | Type | Description |
|---|---|---|
| `translated_payload` | Dict | The payload in the target protocol format |
| `canonical_bridge` | Dict | The intermediate ontology representation |
| `field_mappings` | Dict | Source → target field translations |
| `ontology_version` | str | Ontology version used |
Task Execution
The SDK supports both submitting tasks and receiving them as an agent:
Submit a Task
result = sdk.submit_task("Deploy the application to staging")
print(f"Task ID: {result.task_id}")
print(f"Status: {result.status}")
Receive and Execute Tasks (Agent Loop)
from engram_sdk.client import TaskExecutor
import time
executor = TaskExecutor(sdk)
# Poll for tasks
while True:
    task = sdk.receive_task()
    if task:
        # Execute the task
        result = my_tool.execute(task.command)
        # Send response back
        sdk.send_response(
            task_id=task.task_id,
            result=result,
            status="completed"
        )
    else:
        time.sleep(1)  # pause briefly between polls to avoid a busy-wait
TaskExecution Dataclass
| Field | Type | Description |
|---|---|---|
| `task_id` | str | Unique task identifier |
| `command` | str | The task to execute |
| `lease_expires_at` | datetime | When the lease expires |
| `attempt` | int | Current attempt number |
TaskResponse Dataclass
| Field | Type | Description |
|---|---|---|
| `task_id` | str | Task identifier |
| `result` | Any | Execution result |
| `status` | str | `completed`, `failed`, or `dead_letter` |
| `error` | str | Error message if failed |
Transport Layer
The EngramTransport class (engram_sdk/transport.py) manages HTTP communication:
| Feature | Detail |
|---|---|
| Auto-retry | Retries on transient errors (500, 502, 503, 504) |
| Token refresh | Automatically refreshes EAT on 401 |
| Health check | sdk.ping() to verify connectivity |
| Timeout | Configurable request timeout (default: 30s) |
# Health check
if sdk.ping():
print("Connected to Engram gateway")
Type Reference
All SDK types are defined as Python dataclasses:
| Type | Purpose |
|---|---|
| `ToolDefinition` | Complete tool definition for registration |
| `ToolAction` | Individual action within a tool |
| `TaskLease` | Leased task for execution |
| `TaskExecution` | Task execution context |
| `TaskResponse` | Response to submit after execution |
| `TranslationResponse` | Result of a protocol translation |
| `MappingSuggestion` | ML-suggested field mapping |
| `TaskSubmissionResult` | Result of submitting a new task |
Error Handling
The SDK raises typed exceptions:
| Exception | When |
|---|---|
| `EngramSDKError` | Base class for all SDK errors |
| `EngramAuthError` | Authentication failure (invalid credentials, expired token) |
| `EngramRequestError` | Network or HTTP error (connection refused, timeout) |
| `EngramResponseError` | Unexpected response from the gateway (400, 500) |
from engram_sdk.client import EngramSDK, EngramAuthError, EngramRequestError
try:
sdk.login()
except EngramAuthError as e:
print(f"Auth failed: {e}")
except EngramRequestError as e:
print(f"Network error: {e}")
Example: Full Agent Loop
from engram_sdk.client import EngramSDK, ToolDefinition, ToolAction
# 1. Initialize and authenticate
sdk = EngramSDK(
base_url="http://127.0.0.1:8000",
email="agent@company.com",
password="agent-password"
)
sdk.connect()
sdk.login()
sdk.generate_eat()
# 2. Register this agent
sdk.register_agent(
agent_id="weather-agent",
endpoint_url="http://localhost:5001",
supported_protocols=["mcp"],
capabilities=["weather_queries"]
)
# 3. Register tools
sdk.register_tool(ToolDefinition(
name="Weather Service",
description="Real-time weather data",
actions=[
ToolAction(
name="current",
description="Get current weather",
parameters={"city": {"type": "string", "required": True}},
endpoint="/weather/current",
method="GET"
)
]
))
# 4. Task execution loop
import time

print("Agent ready. Polling for tasks...")
while True:
    task = sdk.receive_task()
    if task:
        print(f"Received task: {task.command}")
        try:
            # Execute the task (your custom logic here)
            result = {"temperature": 72, "city": "San Francisco", "unit": "F"}
            sdk.send_response(task_id=task.task_id, result=result, status="completed")
            print(f"Task {task.task_id} completed")
        except Exception as e:
            sdk.send_response(task_id=task.task_id, result=None, status="failed", error=str(e))
    else:
        time.sleep(1)  # pause briefly between polls to avoid a busy-wait
What's Next
- Architecture — Understand the system internals
- EAT Identity & Security — Configure authentication
- CLI Reference — CLI counterparts of SDK operations
Bidirectional Sync & Events
Engram's event system enables real-time bidirectional synchronization between connected tools and the bridge. Events flow through Redis Streams with semantic normalization, ontology-backed conflict resolution, and live monitoring via the CLI and TUI.
Event Architecture
Events in Engram flow through a Redis Streams pipeline:
Tool/Agent → Event Emission → Redis Stream (engram:events) → Consumer Group → Event Handlers → Trace/TUI
| Component | Implementation | Purpose |
|---|---|---|
| Stream Key | `engram:events` | Central event stream |
| Consumer Group | `engram-event-workers` | Ensures exactly-once processing |
| Consumer | `worker-1` | Individual consumer within the group |
| Block Timeout | 2000ms | How long to wait for new events |
| Batch Size | 25 | Events processed per read |
| Max Length | 10,000 | Stream trimming limit |
When Redis is unavailable (local dev), the system falls back to a polling listener with configurable
interval (EVENT_POLL_INTERVAL_SECONDS, default: 10s).
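The consumer-group pattern — read a batch, process, acknowledge — can be illustrated without a live Redis server. This toy class only mimics the delivered/acked bookkeeping that Redis Streams provide via `XREADGROUP` and `XACK`; it is not the event worker's actual code:

```python
class ToyStream:
    """In-memory stand-in for a Redis Stream with one consumer group."""
    def __init__(self):
        self._entries: list[tuple[int, dict]] = []
        self._next_id = 0
        self._delivered: set[int] = set()
        self._acked: set[int] = set()

    def xadd(self, fields: dict) -> int:
        self._next_id += 1
        self._entries.append((self._next_id, fields))
        return self._next_id

    def xreadgroup(self, count: int) -> list[tuple[int, dict]]:
        """Deliver up to `count` new entries; they stay pending until acked."""
        batch = [(i, f) for i, f in self._entries if i not in self._delivered][:count]
        self._delivered.update(i for i, _ in batch)
        return batch

    def xack(self, entry_id: int) -> None:
        self._acked.add(entry_id)

stream = ToyStream()
for n in range(3):
    stream.xadd({"event": f"tool.updated.{n}"})

for entry_id, fields in stream.xreadgroup(count=25):  # documented batch size
    # ... handle the event, then acknowledge so it is not redelivered
    stream.xack(entry_id)

print(len(stream._acked))  # 3
```

Unacked entries remain pending, which is how the consumer group guarantees an event is processed even if a worker crashes mid-batch.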
Polling Listeners
HTTP endpoint polling for tools that don't support webhooks:
engram sync add <tool-id> --type polling --url https://api.example.com/changes --interval 30
| Setting | Type | Description |
|---|---|---|
| `--url` | str | URL to poll for changes |
| `--interval` | int | Polling interval in seconds (default: 60) |
| `--direction` | str | Sync direction: `both`, `to_mcp`, `from_mcp` |
Polled data is semantically normalized through the ontology before being stored or forwarded.
CLI Watch
Monitor file system changes or command output in real time:
engram sync add <tool-id> --type cli_watch --command "docker ps --format json"
The CLI watch service:
- Executes the specified command at regular intervals
- Compares output against the previous execution
- Detects structural changes (new fields, removed fields, value changes)
- Emits events for detected changes
- Applies semantic normalization before storage
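The diff step above can be sketched as follows. This is a simplified illustration; the event shapes and field names are assumptions, not Engram's actual implementation:

```python
# Hypothetical sketch of the CLI-watch diff step: compare the parsed JSON
# output of two consecutive command runs and emit change events.
import json
import subprocess

def snapshot(command: str) -> dict:
    """Run the watched command and parse its JSON output."""
    out = subprocess.run(command, shell=True, capture_output=True, text=True)
    return json.loads(out.stdout or "{}")

def diff_snapshots(prev: dict, curr: dict) -> list[dict]:
    """Detect added fields, removed fields, and changed values."""
    events = []
    for key in curr.keys() - prev.keys():
        events.append({"type": "field_added", "field": key, "value": curr[key]})
    for key in prev.keys() - curr.keys():
        events.append({"type": "field_removed", "field": key})
    for key in prev.keys() & curr.keys():
        if prev[key] != curr[key]:
            events.append({"type": "value_changed", "field": key,
                           "old": prev[key], "new": curr[key]})
    return events
```

A real watcher would run `snapshot` on the configured interval and pass each pair of results through `diff_snapshots` before normalization.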
Bidirectional Sync
Tools can be synchronized in both directions:
| Direction | Data Flow | Use Case |
|---|---|---|
| `both` | Tool ↔ Bridge | Full two-way sync (default) |
| `to_mcp` | Tool → Bridge | Read-only import from external source |
| `from_mcp` | Bridge → Tool | Push changes from bridge to external tool |
Bidirectional sync ensures that changes made in either the tool or the bridge are reflected in both systems. The ontology handles field name translation between the tool's native format and the bridge's canonical format.
Semantic Conflict Resolution
When events from multiple sources conflict, Engram uses a multi-layer resolution strategy:
Prolog-Based Reasoning
The bridge/memory.py module uses pyswip (SWI-Prolog bindings) for semantic fact reasoning:
- Facts are asserted as Prolog terms: `fact(concept, subject, predicate, value, timestamp)`
- Conflict detection queries: "Are there two facts about the same subject with different values?"
- Resolution rules: Ontology-backed reasoning determines which value takes precedence
pyDatalog Rules
For simpler conflict scenarios, pyDatalog provides declarative
last-write-wins rules:
- Most recent timestamp wins by default
- Configurable to prefer specific sources over others
- Cross-agent facts are reconciled through the shared ontology
Resolution Priority
- Ontology authority — If the ontology defines a canonical value, it wins
- Recency — More recent writes take precedence
- Source trust — Configurable per-source trust levels
- Manual override — User can explicitly resolve conflicts via the API
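A toy resolver following this priority order can be sketched as below. The fact and ontology shapes are illustrative assumptions, not Engram's actual data model:

```python
# Hypothetical conflict resolver: ontology authority wins outright, then
# recency, with per-source trust breaking timestamp ties.
def resolve_conflict(facts: list[dict], canonical: dict, trust: dict) -> dict:
    """Pick the winning fact among conflicting values for one subject."""
    subject = facts[0]["subject"]
    # 1. Ontology authority: a canonical value wins outright
    if subject in canonical:
        return {**facts[0], "value": canonical[subject]}
    # 2. Recency wins; 3. per-source trust level breaks timestamp ties
    return max(facts, key=lambda f: (f["timestamp"], trust.get(f["source"], 0)))
```

Manual override (the fourth layer) would simply bypass this function and write the user-chosen value directly.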
Event Normalization
Events from different sources are normalized through the ontology before storage:
Raw Event → Field Flattening → Ontology Lookup → Canonical Form → Storage
This ensures that events from different tools about the same concepts are stored in a consistent format, enabling cross-tool queries and aggregation.
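The flattening and canonical-lookup stages can be sketched like this. The mapping table is a made-up example, not taken from Engram's ontology:

```python
# Illustrative normalization pipeline: flatten nested fields to dot-notation
# paths, then rewrite them to canonical names. The mapping is hypothetical.
CANONICAL_FIELDS = {
    "msg.text": "message_body",
    "msg.chan": "channel_id",
}

def flatten(event: dict, prefix: str = "") -> dict:
    """Flatten nested JSON into dot-notation paths."""
    flat = {}
    for key, value in event.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat

def normalize(event: dict) -> dict:
    """Rewrite flattened fields to their canonical form."""
    flat = flatten(event)
    return {CANONICAL_FIELDS.get(path, path): value for path, value in flat.items()}
```

In Engram, the lookup step is performed against the OWL ontology rather than a static dictionary.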
Swarm Memory
Swarm Memory (bridge/memory.py) is a persistent, ontology-aware fact
store:
| Layer | Technology | Purpose |
|---|---|---|
| Persistence | SQLite | Durable fact storage |
| Reasoning | SWI-Prolog (pyswip) | Semantic inference and conflict detection |
| Rules | pyDatalog | Declarative conflict resolution |
| Normalization | SemanticMapper | Ontology-backed concept normalization |
Key Operations
| Method | Purpose |
|---|---|
| `store_fact(concept, data)` | Store a semantic fact with ontology alignment |
| `query_facts(concept, filters)` | Query facts with Prolog-backed reasoning |
| `resolve_conflict(facts)` | Apply conflict resolution rules |
| `get_context(agent_id)` | Retrieve all facts relevant to an agent's current task |
CLI Commands
# List active listeners
engram sync list
# Add polling sync
engram sync add <tool-uuid> --type polling --url https://api.example.com/changes --interval 30
# Add CLI watch
engram sync add <tool-uuid> --type cli_watch --command "docker ps --format json"
# Live event monitoring (auto-refresh)
engram sync status
Live Monitoring
engram sync status uses Rich's Live
display to show a continuously updating event table:
Live Event Stream
┌──────────┬──────────┬───────────┬─────────────┬──────────────┐
│ Time │ Tool │ Type │ Entity Key │ Conflict Res │
├──────────┼──────────┼───────────┼─────────────┼──────────────┤
│ 14:23:01 │ 8b4c3d2e │ update │ user-123 │ semantic-mtch│
│ 14:22:58 │ 7a3f2b1c │ create │ order-456 │ semantic-mtch│
│ 14:22:55 │ 8b4c3d2e │ delete │ item-789 │ semantic-mtch│
└──────────┴──────────┴───────────┴─────────────┴──────────────┘
Monitoring live events... Press Ctrl+C to stop.
API Endpoints
| Endpoint | Method | Purpose |
|---|---|---|
| `/events/listeners` | GET | List active listeners and watchers |
| `/events/sync` | POST | Add a new sync configuration |
| `/events/recent` | GET | Get recent events for live monitoring |
What's Next
- Observability & Tracing — Monitor event processing
- Architecture — System-level event architecture
- Configuration — Configure event stream settings
Observability & Tracing
Every tool execution in Engram is semantically traced — not just "what happened" but "why it happened." Traces capture routing decisions, ontology alignment, healing steps, field mappings, and performance metrics. Combined with Prometheus metrics, Grafana dashboards, Sentry integration, and structured logging, you get complete observability over the entire system.
What Gets Traced
Every tool execution captures:
| Data Point | Description |
|---|---|
| Tool Selection | Which tool was chosen and why |
| Routing Choice | Which backend (MCP/CLI) was selected |
| Backend Used | Actual backend that executed the task |
| Latency | End-to-end execution time in milliseconds |
| Success / Failure | Whether the execution completed without error |
| Token Cost | Estimated token consumption |
| Similarity Score | Semantic similarity between task and tool |
| Composite Score | Final routing score with all weights applied |
| Reconciliation Steps | Any self-healing steps taken during execution |
| Field Mappings | Source-to-target field translations |
| Ontological Interpretation | Ontology concept used for alignment |
| Error Stack | Full error details if the execution failed |
Trace Storage
Traces are persisted via POST /api/v1/traces and stored in the
database. Each trace record includes the full execution context above plus a unique trace_id for retrieval.
Retention
By default, traces are kept indefinitely. For high-volume deployments, configure retention policies via database cleanup jobs or set up Prometheus recording rules for long-term metrics.
Querying
# List recent traces
engram trace list --limit 50
# Filter by tool
engram trace list --tool slack
# Export as JSON for external analysis
engram trace list --export > traces.json
Natural-Language Summaries
The trace detail view includes an AI-generated summary that explains routing and healing decisions in plain English:
engram trace detail .
The summary is generated by POST /api/v1/traces/query using the
configured LLM. It reads the trace data and produces a narrative like:
"The Slack tool was selected via MCP backend with 87.3% confidence. Semantic similarity was strong (0.82) due to the 'messaging' concept match in the ontology. No schema drift was detected. The execution completed in 245ms with 12.5 tokens consumed."
CLI Trace Commands
List Traces
engram trace list [--limit 20] [--tool <name>] [--export]
Output:
Recent Semantic Traces
╭─────────────────────┬──────────┬──────────┬─────────┬─────────┬────────╮
│ Timestamp │ Trace ID │ Tool │ Backend │ Success │ Tokens │
├─────────────────────┼──────────┼──────────┼─────────┼─────────┼────────┤
│ 2026-04-08 14:23:01 │ 7a3f2b1c │ Slack │ MCP │ PASS │ 13 │
│ 2026-04-08 14:22:45 │ 8b4c3d2e │ docker │ CLI │ PASS │ 2 │
│ 2026-04-08 14:21:30 │ 9c5d4e3f │ Weather │ MCP │ FAIL │ 18 │
╰─────────────────────┴──────────┴──────────┴─────────┴─────────┴────────╯
Detail View
engram trace detail <trace_id> # Specific trace
engram trace detail . # Latest trace
engram trace detail . --export # Export as JSON
Output structure:
╭──── 🤖 Routing & Healing Summary ────────────────────────────╮
│ The Slack tool was selected via MCP backend with 87.3% │
│ confidence. No schema drift detected. Execution completed │
│ in 245ms. │
╰──────────────────────────────────────────────────────────────╯
╭──── [*] Full Semantic Inspection ────────────────────────────╮
│ 🔍 Semantic Trace: 7a3f2b1c │
│ ├── 📍 Execution Path [PASS] │
│ │ ├── Tool Selection: Slack │
│ │ ├── Routing Choice: MCP │
│ │ ├── Actual Backend: MCP │
│ │ └── Latency: 245.0ms │
│ ├── 📊 Performance Weights │
│ │ ├── Semantic Similarity: 0.823 │
│ │ ├── Composite Score: 0.871 │
│ │ └── Token Efficiency: 12.5 tokens │
│ ├── 🔧 Self-Healing Steps │
│ │ └── No drift detected; no healing required. │
│ └── 🔬 Ontological Alignment │
│ ├── Context: messaging.send_message │
│ └── Synthesized Field Mappings │
│ ├── channel → channel_id │
│ └── text → message_body │
╰──────────────────────────────────────────────────────────────╯
Prometheus Metrics
Engram exposes Prometheus metrics at /metrics via prometheus-fastapi-instrumentator:
Available Metrics
| Metric | Type | Description |
|---|---|---|
| `http_request_duration_seconds` | Histogram | Request latency per endpoint |
| `http_requests_total` | Counter | Total requests per endpoint and status |
| `http_request_size_bytes` | Histogram | Request body size |
| `http_response_size_bytes` | Histogram | Response body size |
Custom Application Metrics
Additional metrics can be added via the instrumentator's callback hooks. Common custom metrics include:
- Tool execution count by backend
- Routing decision distribution (MCP vs CLI)
- Self-healing repair count
- Circuit breaker trip count
- Task queue depth
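For example, a per-backend execution counter could be registered with `prometheus_client` alongside the instrumentator's defaults. The metric and label names below are illustrative, not Engram's actual metrics:

```python
# Hypothetical custom metric: count tool executions by backend. Registered on
# the default registry, so it is exported at /metrics automatically.
from prometheus_client import Counter

TOOL_EXECUTIONS = Counter(
    "engram_tool_executions_total",
    "Tool executions by backend",
    ["backend"],
)

def record_execution(backend: str) -> None:
    """Increment the per-backend execution counter."""
    TOOL_EXECUTIONS.labels(backend=backend).inc()
```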
Configuration
# In app/main.py
from prometheus_fastapi_instrumentator import Instrumentator
Instrumentator().instrument(app).expose(app)
Scrape with Prometheus:
# prometheus.yml
scrape_configs:
  - job_name: 'engram'
    scrape_interval: 15s
    static_configs:
      - targets: ['app:8000']
Grafana Dashboards
Pre-built dashboards are auto-provisioned from monitoring/grafana/dashboards/:
| Dashboard | Metrics |
|---|---|
| Request Overview | Rate, latency (p50/p95/p99), error rate per endpoint |
| Tool Routing | Backend selection distribution, confidence scores, cache hit rate |
| Self-Healing | Drift detection frequency, auto-repair success rate, review queue depth |
| Circuit Breaker | Trip count, cooldown events, per-destination failure rate |
| Task Queue | Queue depth, processing latency, lease expiration rate |
Access at http://localhost:3001 (default credentials: admin/admin).
Alert Configuration
Set up Grafana alerts for critical conditions. Alerts can be sent via email (SMTP), Slack, PagerDuty, or webhook.
Sentry Integration
# .env
SENTRY_DSN=https://...@sentry.io/...
When configured, Sentry captures:
- Unhandled exceptions with full stack traces
- Performance traces with `traces_sample_rate`
- Profile data with `profiles_sample_rate`
Structured Logging
Engram uses structlog for structured, machine-parseable logging:
import structlog
logger = structlog.get_logger(__name__)
logger.info("Translating message", source_protocol="MCP", target_protocol="CLI")
Log Levels
| Level | When to use |
|---|---|
| `DEBUG` | Detailed diagnostic information, only visible when `LOG_LEVEL=DEBUG` |
| `INFO` | Standard operational events (tool execution, routing decisions) |
| `WARNING` | Degraded conditions (Redis unavailable, token nearing expiration) |
| `ERROR` | Failed operations (translation failure, authentication error) |
| `CRITICAL` | System-level failures (database connection lost, startup failure) |
JSON Format for Production
In production (ENVIRONMENT=production), logs are output as JSON for
log aggregation tools (ELK, Datadog, CloudWatch):
{"event": "Translating message", "source_protocol": "MCP", "target_protocol": "CLI", "timestamp": "2026-04-08T14:23:01Z", "level": "info"}
TUI Dashboard
The TUI (engram run --debug) provides real-time observability
through dedicated trace panels:
| Panel | Content |
|---|---|
| Connections | Live connection events with timestamps |
| Agent Execution | Agent step events during orchestration |
| Tool Usage | Tool invocations with payload summaries |
| Responses | Final responses from tools |
| Translation | Three-panel view: Engram Task → Tool Request → Tool Response |
| Log View | Full timestamped log stream of all events |
The TUI Bridge (app/core/tui_bridge.py) translates technical
structlog events into plain-English messages with emojis for human readability.
What's Next
- Self-Healing Engine — How traces feed into drift detection
- Configuration — Configure Sentry, log level, and Prometheus
- Docker & Kubernetes Setup — Deploy the monitoring stack
Self-Evolving Tools
Engram's evolution pipeline uses ML to continuously improve tool definitions based on execution history. It proposes refinements to descriptions, parameter schemas, default values, and recovery strategies — then applies them through a human-reviewed (or auto-approved) workflow.
How Tools Evolve
The evolution pipeline follows a continuous improvement cycle:
Execution History → ML Analysis → Improvement Proposals → Human Review / Auto-Apply → Tool Registry Update
- Execution History — Every tool execution is traced, including successes, failures, parameter values, and error types
- ML Analysis — The evolution engine analyzes patterns across executions to identify improvement opportunities
- Improvement Proposals — Concrete changes are generated with confidence scores
- Review — High-confidence proposals above `ML_AUTO_APPLY_THRESHOLD` (default: 0.85) can be auto-applied. Others require manual review.
- Registry Update — Approved proposals update the tool definition in the registry with a new semantic version
Improvement Types
| Type | What Changes | Example |
|---|---|---|
| Description Refinement | Improved tool description based on actual usage | "Send a message" → "Send a formatted Slack message to a channel or user" |
| Parameter Schema Optimization | Tightened action schemas based on failure analysis | Adding enum constraints to a format parameter based on observed values |
| Default Value Tuning | Adjusted defaults based on most common parameter values | `units` default changed from `metric` to `imperial` if 90% of calls use imperial |
| Recovery Strategy Generation | Pattern-based automated fallback mapping | If tool X fails with error Y, retry with tool Z instead |
Evolution Pipeline
The evolution pipeline runs as a background task:
| Component | Technology | Purpose |
|---|---|---|
| Task Queue | Celery | Schedules periodic analysis jobs |
| Semantic Analysis | `transformers` + `torch` | Analyzes execution patterns for improvement signals |
| Versioning | Semantic Versioning (semver) | Each evolution increments the tool's version number |
| Storage | Database | Proposals are stored with full diff payloads |
Trigger Conditions
Evolution analysis is triggered when:
- A tool accumulates `ML_AUTO_RETRAIN_THRESHOLD` (default: 5) corrections from manual healing
- A tool's success rate drops below a configurable threshold
- Periodic scheduled analysis (configurable via workflow scheduler)
Confidence Scoring
Each improvement proposal is assigned a confidence score:
| Score | Action |
|---|---|
| ≥ 85% (`ML_AUTO_APPLY_THRESHOLD`) | Can be auto-applied (if enabled) |
| 70% – 84% | Requires manual review but flagged as "recommended" |
| < 70% | Requires manual review, flagged as "uncertain" |
The score is computed from:
- Evidence strength — How many executions support this change
- Consistency — How consistent the pattern is across different users/contexts
- Impact — How much the change is expected to improve success rate
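One plausible way to blend these three signals is a simple weighted sum. The weights below are illustrative assumptions, not Engram's actual scoring model:

```python
# Hypothetical confidence scorer combining evidence strength, pattern
# consistency, and expected impact (each in [0, 1]). Weights are made up.
def confidence_score(evidence: float, consistency: float, impact: float) -> float:
    """Weighted blend of the three proposal signals."""
    weights = {"evidence": 0.5, "consistency": 0.3, "impact": 0.2}
    score = (weights["evidence"] * evidence
             + weights["consistency"] * consistency
             + weights["impact"] * impact)
    return round(score, 3)

def review_action(score: float, auto_apply_threshold: float = 0.85) -> str:
    """Map a score onto the review bands described above."""
    if score >= auto_apply_threshold:
        return "auto-apply"
    if score >= 0.70:
        return "recommended"
    return "uncertain"
```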
Review and Apply
Status Dashboard
engram evolve status
╭──── 🔬 Self-Evolving Tools Dashboard ───────────────────────╮
│ Improvement Pipeline Status: Active │
│ Pending Proposals: 3 │
│ Total Historical Evolutions: 47 │
│ Last ML Update: 2026-04-08 13:45:00 │
╰──────────────────────────────────────────────────────────────╯
[*] Pending Tool Refinements
╭──────────────────┬──────────────────┬──────────────────────────┬───────┬──────────╮
│ Tool ID / Version│ Refinement Type │ Proposed Changes │ Conf. │ Prop. ID │
├──────────────────┼──────────────────┼──────────────────────────┼───────┼──────────┤
│ Slack │ Description Path │ Description Path │ 92.0% │ 7a3f2b1c │
│ v1.0 -> v1.1 │ Refinement │ Refinement: Improved... │ │ │
├──────────────────┼──────────────────┼──────────────────────────┼───────┼──────────┤
│ docker │ Parameter Schema │ Parameter Schema │ 78.5% │ 8b4c3d2e │
│ v2.3 -> v2.4 │ Optimization │ Optimization: Action... │ │ │
├──────────────────┼──────────────────┼──────────────────────────┼───────┼──────────┤
│ Weather API │ New Recovery │ New Recovery Strategy: │ 65.0% │ 9c5d4e3f │
│ v1.0 -> v1.1 │ Strategy │ Pattern-based fallback.. │ │ │
╰──────────────────┴──────────────────┴──────────────────────────┴───────┴──────────╯
Use engram evolve apply <id> to authorize a specific improvement.
Apply a Proposal
# Interactive apply with diff preview
engram evolve apply 7a3f2b1c
The command:
- Fetches the proposal details from `GET /api/v1/evolution/status`
- Shows a before/after diff for each changed field
- Asks for confirmation (unless `--force` is used)
- Applies the change via `POST /api/v1/evolution/apply/<id>`
- Hot-redeploys the tool registry with the new version
# Skip confirmation
engram evolve apply 7a3f2b1c --force
Recovery Strategies
When ML analysis detects a recurring failure pattern, it generates recovery strategies:
| Pattern | Strategy |
|---|---|
| Tool X fails with timeout | Retry with increased timeout, then fallback to CLI backend |
| Tool X fails with auth error | Refresh credentials and retry |
| Tool X returns malformed data | Apply field mapping correction from known-good execution |
| Tool X consistently fails after API update | Queue for re-registration with updated schema |
Recovery strategies are stored as part of the tool definition and are automatically applied by the reliability middleware during execution.
Version History
Each tool maintains a version history:
- Versions follow semver: `major.minor.patch`
- Patch — Default value or description refinement
- Minor — Parameter schema change or new recovery strategy
- Major — Breaking schema change (rare, usually from re-registration)
Rollback to a previous version is supported through the evolution API.
CLI Commands
# View dashboard
engram evolve status
# Apply a proposal (interactive)
engram evolve apply <id>
# Apply without confirmation
engram evolve apply <id> --force
Architecture
This page provides a system-level walkthrough of all Engram components, data flows, and design decisions. It's intended for developers who want to understand the internals, contribute to the codebase, or build deep integrations.
System Overview
┌─────────────────┐ ┌──────────────────┐ ┌─────────────────────┐
│ Agents/Users │ │ CLI / SDK / TUI │ │ Playground UI │
│ │ │ │ │ (Vite + React) │
└────────┬────────┘ └────────┬──────────┘ └────────┬────────────┘
│ │ │
▼ ▼ ▼
┌────────────────────────────────────────────────────────────────────────┐
│ Gateway API (FastAPI) │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌─────────────┐│
│ │ Auth │ │ Registry│ │ Routing │ │ Tasks │ │ Federation ││
│ │ Router │ │ Router │ │ Router │ │ Router │ │ Router ││
│ └──────────┘ └──────────┘ └──────────┘ └──────────┘ └─────────────┘│
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌─────────────┐│
│ │Discovery │ │ Events │ │ Traces │ │ Evolution│ │ Memory ││
│ │ Router │ │ Router │ │ Router │ │ Router │ │ Router ││
│ └──────────┘ └──────────┘ └──────────┘ └──────────┘ └─────────────┘│
└────────────────────────────┬───────────────────────────────────────────┘
│
▼
┌────────────────────────────────────────────────────────────────────────┐
│ Orchestrator │
│ ┌───────────────┐ ┌───────────────┐ ┌───────────────┐ │
│ │ MCP Connector │ │ CLI Connector │ │ A2A Connector │ ... │
│ └───────────────┘ └───────────────┘ └───────────────┘ │
└──────────────────────────┬─────────────────────────────────────────────┘
│
┌────────────┼────────────┐
▼ ▼ ▼
┌──────────────────┐ ┌──────────┐ ┌──────────────┐
│ Semantic Layer │ │ ML │ │ Reliability │
│ (OWL + Mapper) │ │ Layer │ │ Middleware │
└──────────────────┘ └──────────┘ └──────────────┘
│ │ │
▼ ▼ ▼
┌──────────────────┐ ┌──────────┐ ┌──────────────┐
│ PostgreSQL / │ │ Redis │ │ Swarm Memory │
│ SQLite │ │ │ │ (SQLite+Prolog)│
└──────────────────┘ └──────────┘ └──────────────┘
Gateway API
The FastAPI application (app/main.py) is the central entrypoint:
Lifespan Management
The FastAPI lifespan handler initializes and shuts down all services:
Startup:
- Database engine and session factory creation
- Table creation (`SQLModel.metadata.create_all`)
- Alembic migration check
- Orchestration service initialization
- Background service startup (discovery, task worker, workflow scheduler, event listener)
- Prometheus instrumentator setup
- Sentry SDK initialization
Shutdown:
- Background service cancellation
- Database engine disposal
- Redis connection cleanup
Middleware Stack
Requests pass through this middleware pipeline (in order):
- CORS — `CORSMiddleware` with configurable origins
- Security Headers — Custom middleware injecting X-Content-Type-Options, X-Frame-Options, etc.
- HTTPS Redirect — Redirects HTTP to HTTPS when `HTTPS_ONLY=true`
- Rate Limiting — `slowapi` with `RATE_LIMIT_DEFAULT`
- Prometheus — `prometheus-fastapi-instrumentator` for metrics at `/metrics`
API Routers
The gateway registers 16+ router modules:
| Router | Prefix | Purpose |
|---|---|---|
| `auth` | `/auth` | Login, signup, EAT generation, token management |
| `endpoints` | `/api/v1/endpoints` | General API endpoints |
| `discovery` | `/api/v1/discovery` | Agent and tool discovery |
| `permissions` | `/api/v1/permissions` | Permission management |
| `credentials` | `/credentials` | Provider credential storage |
| `orchestration` | `/api/v1/orchestration` | Task orchestration and handoffs |
| `tasks` | `/tasks` | Task submission and status |
| `workflows` | `/api/v1/workflows` | Workflow scheduling and management |
| `registry` | `/api/v1/registry` | Tool registration (OpenAPI, CLI, manual) |
| `events` | `/events` | Event listeners and sync |
| `tracing` | `/api/v1/traces` | Semantic execution traces |
| `catalog` | `/api/v1/catalog` | Pre-optimized tool catalog |
| `reconciliation` | `/api/v1/reconciliation` | Self-healing status and triggers |
| `routing` | `/api/v1/routing` | Routing tests and stats |
| `evolution` | `/api/v1/evolution` | Self-evolving tool proposals |
| `federation` | `/api/v1/federation` | Protocol translation and handoffs |
| `memory` | `/api/v1/memory` | Swarm memory queries |
Orchestrator & Connectors
The Orchestrator class coordinates all protocol operations:
- Protocol detection — Determines source and target protocols from request context
- Connector dispatch — Routes to the appropriate protocol connector
- Execution tracking — Creates trace records for every execution
- Error handling — Classifies errors and triggers appropriate recovery strategies
IntentResolver
The IntentResolver translates natural-language requests into structured protocol
payloads:
"send a notification about the deploy" → {"tool": "slack", "action": "send_message", "params": {"text": "..."}}
Semantic Layer
SemanticMapper
The core translation engine (app/semantic/semantic_mapper.py):
- Field flattening — Nested JSON to dot-notation paths
- Ontology resolution — `resolve_equivalent()` maps fields through OWL concepts
- Bidirectional normalization — Translates payloads in both directions
OWL Ontology Management
Two ontologies power the semantic layer:
| Ontology | File | Content |
|---|---|---|
| `protocols.owl` | `app/semantic/protocols.owl` | Protocol concepts, field semantics, equivalence relations |
| `security.owl` | `app/semantic/security.owl` | Permission concepts, semantic scopes, access control |
Loaded via rdflib and owlready2, providing SPARQL queries and OWL
reasoning.
BidirectionalNormalizer
Handles forward and reverse translation through the ontology bridge.
DynamicRuleSynthesizer
Uses the configured LLM to propose new mapping rules for novel field relationships not covered by the ontology.
ProfileSemanticMapper
Extends the base mapper with user-profile-aware semantic resolution.
ML Layer
ml_mapper.py
The ML-based field mapping model:
- Algorithm — scikit-learn pipeline (TF-IDF vectorizer + classifier)
- Training data — Labeled field mappings from successful executions
- Model storage — Serialized to `ML_MODEL_PATH` via `joblib`
- Auto-retraining — Triggered after `ML_AUTO_RETRAIN_THRESHOLD` corrections
train_mapping_model.py
Standalone training script for the mapping model. Can be run periodically or triggered by the evolution pipeline.
Routing Engine
The tool_routing.py module implements weighted composite routing:
- Embedding generation — `sentence-transformers` converts task descriptions to vectors
- Candidate scoring — Each tool/backend pair gets a composite score
- Caching — Redis-backed cache with `ROUTING_CACHE_TTL_SECONDS`
- Context pruning — Budget-based token limit pruning
- Selection — Highest-scoring candidate is chosen (or parallel if below confidence gap)
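The candidate-scoring and selection steps can be sketched as a weighted blend. The weights and field names below are illustrative assumptions, not Engram's actual formula:

```python
# Hypothetical composite score: reward semantic similarity, penalize latency
# and token cost. All weights and scaling factors are made up for illustration.
def composite_score(candidate: dict,
                    w_sim: float = 0.6, w_lat: float = 0.2,
                    w_cost: float = 0.2) -> float:
    """Higher is better: similarity minus scaled latency and token cost."""
    return (w_sim * candidate["similarity"]
            - w_lat * candidate["latency_ms"] / 1000.0
            - w_cost * candidate["token_cost"] / 100.0)

def select_backend(candidates: list[dict]) -> dict:
    """Choose the highest-scoring tool/backend pair."""
    return max(candidates, key=composite_score)
```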
Reliability Middleware
The reliability/middleware.py wraps all routing calls with:
- Circuit breaker — Per-destination failure tracking with automatic cooldown
- Retry with exponential backoff — Via the `tenacity` library
- Idempotency — SHA-256 payload hash + correlation ID for exactly-once semantics
- Schema inference — Dynamic Pydantic model creation and validation
- TUI trace logging — Real-time events for circuit breaker trips, retries, and failures
See Reliability Middleware for full details.
Reliability Middleware
The reliability middleware wraps every routing call with circuit breakers, retry logic, idempotency enforcement, and dynamic schema validation. This ensures that tool executions are robust, recoverable, and exactly-once — even when downstream APIs are flaky.
Overview
The ReliabilityMiddleware class (reliability/middleware.py) sits between
the routing engine and actual tool execution:
Routing Decision → ReliabilityMiddleware → Tool Execution → Response Validation → Trace Recording
Every call through the middleware is:
- Retried on transient failures with exponential backoff
- Circuit-broken per destination — if a tool consistently fails, the circuit opens and prevents further calls until cooldown completes
- Deduplicated via idempotency keys — the same request won't execute twice
- Validated against a dynamically inferred schema — responses are checked for structural correctness
Circuit Breaker
How It Works
Each tool/backend destination has its own circuit breaker state:
| State | Behavior |
|---|---|
| CLOSED | Normal operation — all requests pass through |
| OPEN | Requests are immediately rejected. No calls to the backend. |
| HALF-OPEN | A single probe request is allowed. If it succeeds, circuit closes. If it fails, circuit reopens. |
Configuration
The circuit breaker is configured per-instance:
| Parameter | Default | Description |
|---|---|---|
| Failure threshold | 5 | Number of consecutive failures before opening |
| Cooldown period | 30 seconds | How long the circuit stays open before transitioning to half-open |
| Success threshold | 1 | Number of successes in half-open before closing |
State Tracking
# From reliability/middleware.py
class CircuitBreaker:
    async def check_circuit(self, destination: str) -> bool:
        """Return True if requests to this destination are allowed."""
        state = self._get_state(destination)
        if state.status == "OPEN":
            if time.time() - state.opened_at > self.cooldown:
                return True  # Cooldown expired; transition to HALF-OPEN
            return False  # Still in cooldown
        return True  # CLOSED or HALF-OPEN

    async def record_success(self, destination: str): ...
    async def record_failure(self, destination: str): ...
TUI Integration
When a circuit breaker trips, the event is emitted to the TUI trace panel:
⚡ Circuit breaker OPENED for destination: Slack-MCP (5 consecutive failures)
🔄 Circuit breaker HALF-OPEN for destination: Slack-MCP (cooldown expired)
✅ Circuit breaker CLOSED for destination: Slack-MCP (probe succeeded)
Retry Logic
Retries use tenacity with exponential backoff:
@retry(
    wait=wait_exponential(multiplier=1, min=1, max=60),
    stop=stop_after_attempt(3),
    retry=retry_if_exception_type((httpx.ConnectError, httpx.TimeoutException)),
)
async def execute_with_retry(self, ...):
    ...
| Parameter | Value | Description |
|---|---|---|
| Max attempts | 3 | Total attempts including the initial try |
| Min wait | 1 second | Minimum backoff duration |
| Max wait | 60 seconds | Maximum backoff duration |
| Multiplier | 1 | Exponential multiplier |
| Retryable errors | Connect, Timeout | Only transient errors are retried |
Non-retryable errors (400 Bad Request, 403 Forbidden, 404 Not Found) fail immediately without retry.
Idempotency
Every request through the middleware is assigned an idempotency key:
# Generate idempotency key from payload
idempotency_key = hashlib.sha256(
    json.dumps(payload, sort_keys=True).encode()
).hexdigest()
How It Works
- Before execution, the middleware checks if this `idempotency_key` has been seen before
- If found in Redis (or in-memory cache), the stored result is returned immediately
- If not found, the request proceeds and the result is stored with the key
- Idempotency keys expire after a configurable TTL
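These steps reduce to a short check-then-store wrapper. The sketch below uses an in-memory dict in place of Redis and omits TTL handling for brevity:

```python
# Minimal idempotency wrapper: hash the payload, return a cached result for
# a previously seen key, otherwise execute and cache. Illustrative only.
import hashlib
import json

_idempotency_cache: dict[str, dict] = {}

def execute_once(payload: dict, execute) -> dict:
    """Execute a payload at most once; duplicates return the cached result."""
    key = hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()
    if key in _idempotency_cache:
        return _idempotency_cache[key]  # Duplicate request: skip execution
    result = execute(payload)
    _idempotency_cache[key] = result
    return result
```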
Correlation IDs
Each request also gets a unique correlation_id for trace linkage:
correlation_id = str(uuid.uuid4())
This ensures that retries of the same logical request can be grouped in traces.
Dynamic Schema Validation
The middleware performs response validation using dynamically inferred schemas:
Schema Inference
from pydantic import create_model

async def _infer_schema(self, response: Dict) -> Any:
    """Dynamically create a Pydantic model from a response payload."""
    fields = {}
    for key, value in response.items():
        python_type = type(value)
        fields[key] = (python_type, ...)
    return create_model("DynamicResponse", **fields)
Validation
After receiving a response from the tool, the middleware:
- Infers a schema from the response structure
- Validates the response against the schema
- If validation fails, logs a warning and triggers drift detection
- Records the validation result in the trace
Middleware Pipeline
The full middleware pipeline for a single routing call:
1. Check circuit breaker → if OPEN, reject immediately
2. Generate idempotency key → if duplicate, return cached result
3. Generate correlation ID
4. Execute with retry wrapper
a. Send request to tool
b. On transient failure → retry with backoff
c. On permanent failure → record failure, break
5. Validate response schema
6. Record circuit breaker result (success/failure)
7. Cache result with idempotency key
8. Record trace
9. Return result
TUI Trace Events
The middleware emits the following events to the TUI:
| Event | When | Message |
|---|---|---|
| Circuit trip | Circuit breaker opens | ⚡ Circuit breaker OPENED for <dest> |
| Retry attempt | Transient failure detected | 🔄 Retrying <dest> (attempt 2/3) |
| Idempotency hit | Duplicate request detected | 📋 Idempotent result returned for <key> |
| Schema validation | Response doesn't match schema | ⚠️ Schema mismatch in response from <tool> |
| Recovery success | Probe succeeds in half-open | ✅ Circuit breaker CLOSED for <dest> |
Bridge Router
The bridge/router.py module provides the unified routing entrypoint that the reliability
middleware wraps:
async def routeTo(target_protocol: str, payload: dict, config: dict) -> dict:
    """
    Unified routing entrypoint.
    The reliability middleware wraps this function.
    """
    ...
This function:
- Determines the target connector based on `target_protocol`
- Normalizes the payload through the semantic mapper
- Dispatches to the appropriate connector's `execute()` method
- Returns the result
Configuration
The reliability middleware uses sensible defaults but can be tuned:
| Setting | Where | Default | Description |
|---|---|---|---|
| Circuit breaker threshold | `reliability/middleware.py` | 5 failures | Consecutive failures before opening |
| Cooldown period | `reliability/middleware.py` | 30 seconds | Time before half-open transition |
| Retry attempts | `reliability/middleware.py` | 3 | Maximum retry attempts |
| Backoff multiplier | `reliability/middleware.py` | 1 | Exponential backoff multiplier |
| Idempotency TTL | `reliability/middleware.py` | 300 seconds | How long idempotency keys are cached |
What's Next
- Architecture — System-level view of where reliability fits
- Observability & Tracing — Monitor reliability events
- Self-Healing Engine — How schema validation feeds into healing
Swarm Memory
The bridge/memory.py module provides persistent semantic fact storage:
| Layer | Technology | Purpose |
|---|---|---|
| Storage | SQLite | Durable fact persistence |
| Reasoning | SWI-Prolog (pyswip) | Semantic inference, conflict detection |
| Rules | pyDatalog | Declarative conflict resolution |
| Normalization | SemanticMapper | Ontology-backed concept alignment |
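The storage layer alone can be sketched with stdlib SQLite. This covers only durable fact persistence — the Prolog/pyDatalog reasoning layers are out of scope here — and the table name and columns are assumptions, not the actual schema in `bridge/memory.py`:

```python
import sqlite3

def open_fact_store(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS facts (
               subject   TEXT NOT NULL,
               predicate TEXT NOT NULL,
               object    TEXT NOT NULL,
               UNIQUE (subject, predicate, object)
           )"""
    )
    return conn

def assert_fact(conn, subject, predicate, obj):
    # INSERT OR IGNORE keeps re-asserted facts idempotent.
    conn.execute(
        "INSERT OR IGNORE INTO facts (subject, predicate, object) VALUES (?, ?, ?)",
        (subject, predicate, obj),
    )
    conn.commit()

def query_facts(conn, subject):
    rows = conn.execute(
        "SELECT predicate, object FROM facts WHERE subject = ?", (subject,)
    ).fetchall()
    return dict(rows)
```

Storing facts as subject–predicate–object triples is what lets a Prolog layer consume them directly for inference.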
Task Queue
SQL-backed task queue with lease-based processing:
| Component | Configuration |
|---|---|
| Poll interval | TASK_POLL_INTERVAL_SECONDS (2.0s) |
| Lease duration | TASK_LEASE_SECONDS (60s) |
| Max attempts | TASK_MAX_ATTEMPTS (5) |
| Evolution tasks | Celery for async ML jobs |
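Lease-based processing means a claimed task becomes visible to other workers again if its lease expires before the worker acknowledges completion. An in-memory sketch of that semantics (the real queue is SQL-backed; class and field names here are illustrative):

```python
import time

class LeaseQueue:
    """Toy lease-based queue: an unacknowledged task is re-delivered
    once its lease expires, up to max_attempts times."""
    def __init__(self, lease_seconds=60.0, max_attempts=5, clock=time.monotonic):
        self.lease = lease_seconds
        self.max_attempts = max_attempts
        self.clock = clock
        self.tasks = {}  # task_id -> {"leased_until": float|None, "attempts": int}

    def enqueue(self, task_id):
        self.tasks[task_id] = {"leased_until": None, "attempts": 0}

    def claim(self):
        now = self.clock()
        for task_id, t in self.tasks.items():
            lease_free = t["leased_until"] is None or t["leased_until"] <= now
            if lease_free and t["attempts"] < self.max_attempts:
                t["attempts"] += 1
                t["leased_until"] = now + self.lease  # hold the lease
                return task_id
        return None

    def ack(self, task_id):
        self.tasks.pop(task_id, None)  # completed: remove permanently
```

The defaults mirror the documented `TASK_LEASE_SECONDS` (60s) and `TASK_MAX_ATTEMPTS` (5); a worker would call `claim()` every `TASK_POLL_INTERVAL_SECONDS`.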
Workflow Scheduler
Periodic workflow execution:
- Poll interval — `WORKFLOW_SCHEDULER_POLL_SECONDS` (5.0s)
- Batch size — `WORKFLOW_SCHEDULER_BATCH_SIZE` (20)
Event System
Redis Streams-based event pipeline:
- Stream key — `engram:events`
- Consumer group — `engram-event-workers`
- Fallback — Polling listener when Redis is unavailable
- TUI integration — Events routed to trace panels via `tui_bridge.py`
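The polling fallback can be sketched as draining an in-process buffer in batches when the `engram:events` stream isn't reachable. This is a hypothetical shape, not the actual fallback listener's API:

```python
from collections import deque

class PollingListener:
    """Fallback event listener sketch: drains an in-process buffer
    instead of reading the engram:events Redis stream."""
    def __init__(self):
        self.buffer = deque()

    def publish(self, event: dict) -> None:
        self.buffer.append(event)

    def poll(self, max_events: int = 20) -> list[dict]:
        # Drain up to max_events in FIFO order, preserving event ordering.
        batch = []
        while self.buffer and len(batch) < max_events:
            batch.append(self.buffer.popleft())
        return batch
```

Keeping the same batch-oriented `poll()` interface as the Redis consumer lets downstream consumers (e.g. the TUI bridge) stay oblivious to which transport is active.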
Database Layer
| Technology | Use Case |
|---|---|
| SQLAlchemy + SQLModel | ORM and schema definitions |
| asyncpg | Async PostgreSQL driver (production) |
| aiosqlite | Async SQLite driver (local dev) |
| Alembic | Schema migrations |
Smart Fallback
The _finalize_database_url validator in Settings automatically detects the
runtime environment and switches between PostgreSQL and SQLite.
Background Services
All auto-started via FastAPI lifespan:
| Service | Purpose |
|---|---|
| Discovery Service | Periodic agent and tool health checks |
| Task Worker | Polls and processes queued tasks |
| Workflow Scheduler | Triggers scheduled workflows |
| Event Listener | Processes Redis Stream events |
Security Architecture
| Layer | Mechanism |
|---|---|
| Authentication | JWT validation (HS256/RS256) |
| Authorization | EAT semantic scopes from security.owl |
| Fail-closed | Security checks deny when infrastructure is down |
| Credential encryption | Fernet symmetric encryption |
| Transport security | HTTPS redirect, security headers, CORS |
| Rate limiting | Per-IP with slowapi |
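Fail-closed means any error while evaluating a security check results in denial, never a pass-through. A minimal sketch of that rule (the wrapper name and shape are hypothetical, not the implementation in `app/core/security.py`):

```python
def is_authorized(check, *args) -> bool:
    """Fail-closed wrapper: if the check itself errors (e.g. the ontology
    store or token service is unreachable), deny rather than allow."""
    try:
        return bool(check(*args))
    except Exception:
        return False  # infrastructure failure -> deny

# Example: a scope lookup whose backend is down is denied, not skipped.
def broken_scope_lookup(token: str) -> bool:
    raise ConnectionError("security.owl store unreachable")
```

The key design choice is that the `except` branch returns `False` instead of re-raising or defaulting to allow, so an outage can never widen access.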
What's Next
- Reliability Middleware — Deep dive into the reliability layer
- Self-Healing Engine — How healing feeds into evolution
- Observability & Tracing — Monitor evolution outcomes
- Configuration — Configure ML thresholds
- Contributing — Development setup and contribution guidelines
- SDK & Python Library — Programmatic integration
Contributing
This guide helps you get set up for contributing to Engram — from development environment to code standards to submitting your first pull request.
Development Environment Setup
1. Fork and Clone
git clone https://github.com/<your-username>/engram_translator.git
cd engram_translator
2. Create Virtual Environment
python3 -m venv venv
source venv/bin/activate # Linux/macOS
# or: .\venv\Scripts\activate # Windows
3. Install Dependencies
pip install --upgrade pip
pip install -r requirements.txt
pip install -r requirements-dev.txt # Includes test + linting tools
4. Set Up Pre-Commit Hooks
pre-commit install
5. Initialize Configuration
mkdir -p ~/.engram
python -m app.cli init
6. Start the Backend
uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
7. Verify
curl http://localhost:8000/health
python -m app.cli info
Project Structure
engram_translator/
├── app/ # Core application
│ ├── api/v1/ # API routers
│ │ ├── auth.py # Authentication endpoints
│ │ ├── discovery.py # Agent/tool discovery
│ │ ├── endpoints.py # General API endpoints
│ │ ├── events.py # Event listeners and sync
│ │ ├── evolution.py # Self-evolving tools
│ │ ├── federation.py # Protocol translation
│ │ ├── orchestration.py # Task orchestration
│ │ ├── reconciliation.py # Self-healing
│ │ ├── registry.py # Tool registration
│ │ ├── routing.py # Routing tests/stats
│ │ └── tracing.py # Execution traces
│ ├── cli.py # CLI entrypoint (Typer + Rich)
│ ├── core/
│ │ ├── config.py # Settings model (Pydantic)
│ │ ├── security.py # JWT validation, EAT verification
│ │ └── tui_bridge.py # TUI event bridge
│ ├── db/
│ │ └── session.py # Database engine and session
│ ├── main.py # FastAPI application
│ ├── models/ # SQLModel/Pydantic models
│ ├── semantic/
│ │ ├── protocols.owl # Protocol ontology
│ │ ├── security.owl # Security ontology
│ │ ├── semantic_mapper.py # Semantic field mapper
│ │ └── models/ # ML models (joblib)
│ └── services/
│ ├── credentials.py # Credential encryption
│ ├── eat_identity.py # EAT token lifecycle
│ ├── orchestrator.py # Protocol orchestration
│ └── tool_routing.py # Routing engine
├── bridge/
│ ├── memory.py # Swarm Memory (SQLite + Prolog)
│ └── router.py # Unified routing entrypoint
├── delegation/
│ └── engine.py # Agent delegation engine
├── engram_sdk/
│ ├── auth.py # SDK authentication
│ ├── client.py # SDK client
│ └── transport.py # HTTP transport layer
├── reliability/
│ └── middleware.py # Circuit breaker, retry, idempotency
├── tui/
│ ├── app.py # Textual TUI application
│ └── vault_service.py # TUI credential vault
├── trading-templates/ # Trading integration templates
├── monitoring/
│ ├── grafana/ # Grafana dashboards
│ ├── k8s/ # Kubernetes manifests
│ └── prometheus.yml # Prometheus config
├── docs/ # Documentation (you are here)
├── tests/ # Test suite
├── alembic/ # Database migrations
├── docker-compose.yml # Dev Docker Compose
├── docker-compose.staging.yml # Staging Docker Compose
├── requirements.txt # Python dependencies
├── setup.sh # Unix installer
├── engram # Unix self-healing entrypoint
└── engram.bat # Windows self-healing entrypoint
Code Style
Python
- Formatter — Black (line length: 120)
- Linter — Ruff (replaces flake8 + isort)
- Type checking — Pyright (strict mode)
- Import order — stdlib → third-party → local
Key Conventions
| Convention | Standard |
|---|---|
| Naming | snake_case for functions/variables, PascalCase for classes |
| Type hints | Required on all function signatures |
| Docstrings | Google style for all public functions/classes |
| Error handling | Typed exceptions, never bare `except:` |
| Async | Use `async def` for all I/O operations |
| Database | SQLModel for models, async sessions for queries |
| CLI | Typer for commands, Rich for output formatting |
Testing
Run Tests
# All tests
pytest
# With coverage
pytest --cov=app --cov-report=html
# Specific module
pytest tests/test_routing.py
# Verbose with output
pytest -v -s
Test Structure
tests/
├── test_auth.py # Authentication flows
├── test_registry.py # Tool registration
├── test_routing.py # Routing engine
├── test_healing.py # Self-healing
├── test_federation.py # Protocol translation
├── test_reliability.py # Circuit breaker, retry
├── test_sdk.py # SDK client
└── conftest.py # Shared fixtures
Writing Tests
import pytest
from httpx import AsyncClient
from app.main import app

@pytest.mark.asyncio
async def test_tool_registration():
    async with AsyncClient(app=app, base_url="http://test") as client:
        response = await client.post(
            "/api/v1/registry/manual",
            json={
                "name": "Test Tool",
                "description": "A test tool",
                "base_url": "https://api.test.com",
                "path": "/v1/test",
                "method": "GET",
                "parameters": [],
            },
            headers={"Authorization": "Bearer test-token"},
        )
        assert response.status_code == 200
        assert response.json()["name"] == "Test Tool"
Branching Strategy
| Branch | Purpose |
|---|---|
| `main` | Stable release branch |
| `develop` | Integration branch for features |
| `feature/<name>` | New features |
| `fix/<name>` | Bug fixes |
| `docs/<name>` | Documentation updates |
Workflow
- Create a feature branch from `develop`
- Make your changes
- Write/update tests
- Run the full test suite
- Open a PR against `develop`
- Address review feedback
- Squash and merge
Pull Request Guidelines
PR Title
Use conventional commit format:
feat(routing): add predictive optimization
fix(auth): handle expired refresh tokens
docs(quickstart): update installation steps
refactor(semantic): extract BidirectionalNormalizer
PR Description
Include:
- What — Brief description of the change
- Why — Motivation and context
- How — Technical approach
- Testing — How you verified the change
- Breaking changes — If any
Review Checklist
- [ ] Tests pass (`pytest`)
- [ ] Linting passes (`ruff check .`)
- [ ] Type checking passes (`pyright`)
- [ ] Documentation updated (if user-facing)
- [ ] Database migrations included (if schema changed)
- [ ] No secrets or credentials in code
Adding New Features
Adding a New API Router
- Create `app/api/v1/my_feature.py`
- Define endpoints with `APIRouter(prefix="/api/v1/my-feature")`
- Register in `app/main.py`
- Add tests in `tests/test_my_feature.py`
- Update documentation
Adding a New CLI Command
- Add command group in `app/cli.py` using `typer.Typer()`
- Implement the command function
- Add Rich formatting for output
- Support `--json` output mode
- Add to the REPL help table
- Update CLI Reference documentation
Adding a New Protocol Connector
- Implement the connector interface (see existing connectors)
- Register in the orchestrator's connector registry
- Add ontology concepts for the new protocol's fields
- Write federation tests
- Update Protocol Federation documentation
Adding a New Provider
- Add API key setting in `app/core/config.py`
- Create a TUI connection screen in `tui/app.py`
- Add credential type in `app/services/credentials.py`
- Register the provider in the backend provider list
- Write integration tests
Ontology Contributions
To extend the semantic layer:
Adding Concepts to protocols.owl
<owl:Class rdf:about="#MyNewConcept">
<rdfs:subClassOf rdf:resource="#ParentConcept"/>
<rdfs:label>My New Concept</rdfs:label>
</owl:Class>
Adding Equivalences
<owl:AnnotationProperty rdf:about="#semanticEquivalent"/>
<owl:NamedIndividual rdf:about="#field_name_a">
<semanticEquivalent rdf:resource="#field_name_b"/>
</owl:NamedIndividual>
Testing Ontology Changes
- Load the modified ontology with `rdflib`
- Query for the new concepts
- Verify equivalence resolution
- Run the full healing test suite
Release Process
- Update version in `pyproject.toml`
- Update `CHANGELOG.md`
- Create a release branch from `develop`
- Open PR to `main`
- After merge, tag the release: `git tag v1.x.x`
- Build and push Docker image
- Deploy to staging, then production
Getting Help
- Issues — Open a GitHub issue for bugs or feature requests
- Discussions — Use GitHub Discussions for questions
- Documentation — Check the docs/ directory
- Code — Read the source — it's well-documented with docstrings and type hints
What's Next
- Architecture — Understand the codebase structure
- Installation — Set up your development environment
- CLI Reference — Learn the command structure