Architecture
This page explains Hoard’s technical architecture and design decisions.
High-Level Overview
┌───────────────────────────────────────────────────────────────────┐
│                            HOARD CORE                             │
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌──────────┐ ┌──────────────┐ │
│ │  Index  │ │ Search  │ │ Memory  │ │   Sync   │ │ Orchestrator │ │
│ └─────────┘ └─────────┘ └─────────┘ └──────────┘ └──────────────┘ │
└───────────────────────────────────────────────────────────────────┘
                                  │
                                  ▼
┌───────────────────────────────────────────────────────────────────┐
│                     HTTP MCP Server (default)                     │
│                    http://127.0.0.1:19850/mcp                     │
│                     Auth: Bearer $HOARD_TOKEN                     │
└───────────────────────────────────────────────────────────────────┘
          │                       │                       │
          ▼                       ▼                       ▼
   ┌─────────────┐         ┌─────────────┐         ┌─────────────┐
   │ Claude Code │         │    Codex    │         │  OpenClaw   │
   └─────────────┘         └─────────────┘         └─────────────┘

Technology Stack
| Component | Technology |
|---|---|
| Language | Python 3.11+ |
| Database | SQLite + FTS5 |
| Vectors | SQLite brute-force (built-in) |
| Embeddings | sentence-transformers |
| CLI | Click + Rich |
| HTTP Server | Python http.server |
Project Structure
hoard/
├── core/
│   ├── db/                 # Database operations
│   ├── ingest/             # Content processing
│   ├── search/             # Search engine
│   ├── orchestrator/       # Agents, tasks, workflows
│   ├── mcp/                # MCP protocol handling
│   └── security/           # Auth & rate limiting
│
├── sdk/
│   ├── base.py             # ConnectorV1 interface
│   ├── types.py            # EntityInput, ChunkInput
│   ├── chunking.py         # Chunking utilities
│   └── hash.py             # Content hashing
│
├── connectors/             # Core + community connectors
│   ├── inbox/
│   ├── local_files/
│   ├── obsidian/
│   ├── bookmarks_chrome/
│   ├── bookmarks_firefox/
│   └── notion_export/
│
├── cli/                    # CLI commands
├── tests/
└── docs/

Core Components
Database Layer
SQLite with FTS5 (Full-Text Search) provides:
- ACID transactions for data integrity
- Full-text search via an FTS5 index kept in sync by triggers
- JSON support for metadata storage
- Zero configuration — just a file
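The trigger-based indexing above can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration, not Hoard's actual schema: the `chunks` and `chunks_fts` tables, the trigger names, and the `search` helper are all assumptions made for the example, and it requires a SQLite build with FTS5 enabled.

```python
import sqlite3

# Hypothetical schema for illustration only; Hoard's real tables may differ.
SCHEMA = """
CREATE TABLE chunks (
    id INTEGER PRIMARY KEY,
    content TEXT NOT NULL,
    metadata TEXT NOT NULL DEFAULT '{}'  -- JSON stored as text
);
-- External-content FTS5 table: the index points back at chunks.content.
CREATE VIRTUAL TABLE chunks_fts USING fts5(
    content, content='chunks', content_rowid='id'
);
-- Triggers keep the full-text index in step with the base table.
CREATE TRIGGER chunks_ai AFTER INSERT ON chunks BEGIN
    INSERT INTO chunks_fts(rowid, content) VALUES (new.id, new.content);
END;
CREATE TRIGGER chunks_ad AFTER DELETE ON chunks BEGIN
    INSERT INTO chunks_fts(chunks_fts, rowid, content)
    VALUES ('delete', old.id, old.content);
END;
"""


def open_db(path=":memory:"):
    """Open a database and create the sketch schema."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def search(conn, query, limit=5):
    """Return (id, content) rows matching the query, best BM25 score first."""
    return conn.execute(
        "SELECT chunks.id, chunks.content FROM chunks_fts "
        "JOIN chunks ON chunks.id = chunks_fts.rowid "
        "WHERE chunks_fts MATCH ? ORDER BY bm25(chunks_fts) LIMIT ?",
        (query, limit),
    ).fetchall()


conn = open_db()
with conn:  # one ACID transaction for both inserts
    conn.execute(
        "INSERT INTO chunks (content, metadata) VALUES (?, ?)",
        ("SQLite stores everything in one file", '{"source": "notes"}'),
    )
    conn.execute(
        "INSERT INTO chunks (content) VALUES (?)",
        ("Vector search is optional",),
    )
```

Because the triggers run inside the same transaction as the write, the base table and the index cannot drift apart after a crash or rollback.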
Search Engine
Hybrid search combining:
- BM25 — Traditional keyword matching via FTS5
- Vector Search — Semantic similarity (optional)
- Reciprocal Rank Fusion — Merges both rankings
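As a rough sketch of the fusion step, Reciprocal Rank Fusion scores each result as the sum of `1 / (k + rank)` over every ranking it appears in. The function name, the sample chunk IDs, and the constant `k = 60` (a commonly used default) are assumptions for illustration; the source does not state which constant Hoard uses.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of IDs; score(d) = sum of 1 / (k + rank) per list."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for deterministic output.
    return sorted(scores, key=lambda d: (-scores[d], d))


bm25_hits = ["c3", "c1", "c7"]    # hypothetical keyword ranking
vector_hits = ["c1", "c9", "c3"]  # hypothetical semantic ranking
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)  # ['c1', 'c3', 'c9', 'c7']
```

Since only ranks are used, BM25 scores and vector similarities never need to be normalized onto a common scale. When vector search is disabled, passing a single ranking returns it unchanged.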
Orchestrator (Beta)
Multi-agent coordination layer:
- Agents register and advertise capabilities
- Tasks are pulled and claimed by agents
- Workflows chain tasks into multi-step automation
- Artifacts capture outputs and events track state changes
- Cost ledger tracks usage and budgets
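One way the pull-and-claim step could work is a guarded `UPDATE`, so that two agents can never claim the same task. The sketch below is an assumption about the mechanism, not the orchestrator's real code: the `tasks` table, its columns, and `claim_next_task` are hypothetical names.

```python
import sqlite3


def claim_next_task(conn, agent_id, capability):
    """Claim the oldest pending task for a capability; return its id or None."""
    with conn:  # commit on success, roll back on error
        row = conn.execute(
            "SELECT id FROM tasks WHERE status = 'pending' AND capability = ? "
            "ORDER BY id LIMIT 1",
            (capability,),
        ).fetchone()
        if row is None:
            return None
        # The status guard makes the claim safe even if another agent
        # grabbed the same row between the SELECT and the UPDATE.
        cur = conn.execute(
            "UPDATE tasks SET status = 'claimed', claimed_by = ? "
            "WHERE id = ? AND status = 'pending'",
            (agent_id, row[0]),
        )
        return row[0] if cur.rowcount == 1 else None


# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tasks (id INTEGER PRIMARY KEY, capability TEXT, "
    "status TEXT DEFAULT 'pending', claimed_by TEXT)"
)
conn.execute("INSERT INTO tasks (capability) VALUES ('summarize')")
conn.commit()

first = claim_next_task(conn, "agent-a", "summarize")
second = claim_next_task(conn, "agent-b", "summarize")
```

Here `first` receives the task id and `second` is `None`, because the only pending task has already been claimed.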
Sync Engine
On-demand sync via hoard sync, plus an optional background schedule and file watcher when the server is running:
- Full scan — Re-scans all configured connectors
- Change detection — Content hash comparison
- Tombstoning — Marks deleted content as tombstoned
Background sync runs every sync.interval_minutes, and the watcher triggers incremental syncs on change. A lock file prevents concurrent syncs.
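Change detection and tombstoning can be illustrated by comparing stored content hashes against a fresh scan. This is a simplified sketch: SHA-256, the `plan_sync` function, and the dictionary shapes are assumptions, since the source says only that content hashes are compared.

```python
import hashlib


def content_hash(text):
    """Hash content so changes can be detected without storing old text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_sync(indexed, scanned):
    """Compare the index against a fresh scan.

    indexed: {entity_id: stored_hash}; scanned: {entity_id: current_text}.
    Returns (added, changed, tombstoned) as sorted lists of entity IDs.
    """
    added, changed = [], []
    for entity_id, text in scanned.items():
        if entity_id not in indexed:
            added.append(entity_id)
        elif indexed[entity_id] != content_hash(text):
            changed.append(entity_id)
    # Anything indexed but no longer found by the scan is marked deleted.
    tombstoned = [e for e in indexed if e not in scanned]
    return sorted(added), sorted(changed), sorted(tombstoned)


indexed = {"a.md": content_hash("alpha"), "b.md": content_hash("beta")}
scanned = {"a.md": "alpha", "b.md": "beta v2", "c.md": "gamma"}
print(plan_sync(indexed, scanned))  # (['c.md'], ['b.md'], [])
```

Unchanged entities (`a.md` above) produce no work, which is what keeps repeated full scans cheap.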
MCP Server
HTTP-based Model Context Protocol server:
- JSON-RPC 2.0 over HTTP
- Bearer token auth
- Rate limiting per tool and per token
- Audit logging for all requests
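From the client side, a call to the server is a JSON-RPC 2.0 envelope sent by HTTP POST with a Bearer token. The sketch below only builds the request and does not send it. The URL and the `HOARD_TOKEN` variable come from the overview diagram, and `tools/list` is a standard MCP method. The `build_request` helper is a hypothetical name, and the exact headers or initialization handshake Hoard expects are not stated in this page.

```python
import json
import os
import urllib.request

MCP_URL = "http://127.0.0.1:19850/mcp"  # default endpoint from the overview


def build_request(method, params=None, request_id=1, token=None):
    """Build (but do not send) a JSON-RPC 2.0 POST with Bearer auth."""
    if token is None:
        token = os.environ.get("HOARD_TOKEN", "")
    payload = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        payload["params"] = params
    return urllib.request.Request(
        MCP_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


req = build_request("tools/list", token="example-token")
# To send it against a running server:
#   with urllib.request.urlopen(req) as resp:
#       print(json.load(resp))
```

Keeping construction separate from sending makes the envelope easy to inspect or log before anything touches the network.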
Design Principles
Local-First
All data stays on your machine:
- No cloud sync
- No external dependencies for basic operation
- Works offline
Pre-Indexing
Data is indexed before agents need it:
- Fast search responses
- Consistent results
- No real-time API calls during queries
Chunk-Level Retrieval
Documents are split into semantic chunks:
- Precise citations
- Stable chunk IDs
- Optimal context for LLMs
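To show what "stable chunk IDs" can mean in practice, the sketch below derives each ID from the entity ID and the chunk's own content, so editing one part of a document leaves the IDs of untouched chunks unchanged. Splitting on blank lines stands in for real semantic chunking, and `chunk_document` and the ID format are assumptions, not Hoard's actual scheme.

```python
import hashlib
from collections import Counter


def chunk_document(entity_id, text):
    """Split text on blank lines and give every chunk a content-derived ID."""
    seen = Counter()
    chunks = []
    for paragraph in (p.strip() for p in text.split("\n\n")):
        if not paragraph:
            continue
        # Count repeats so identical paragraphs in one document get distinct IDs.
        seen[paragraph] += 1
        key = f"{entity_id}\n{seen[paragraph]}\n{paragraph}"
        chunk_id = hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]
        chunks.append({"id": chunk_id, "entity_id": entity_id, "text": paragraph})
    return chunks


before = chunk_document("notes/a.md", "First idea.\n\nSecond idea.")
after = chunk_document("notes/a.md", "New intro.\n\nFirst idea.\n\nSecond idea.")
```

In this example the two original paragraphs keep their IDs in `after` even though a new paragraph was inserted ahead of them, so existing citations would remain valid.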
Multi-Agent Support
One data layer for all agents:
- Same interface for Claude Code, Codex, OpenClaw
- Consistent memory across agents
- No re-indexing when switching tools
Transport Options
HTTP (Default)
hoard serve
Benefits:
- Survives client restarts
- Can be daemonized
- Works with all MCP clients
Stdio (Alternative)
hoard mcp stdio
For MCP clients that require stdio transport.
Security Architecture
┌─────────────────────────────────────────┐
│               MCP Request               │
└─────────────────┬───────────────────────┘
                  │
                  ▼
┌─────────────────────────────────────────┐
│            Token Validation             │
│  • Verify Bearer token                  │
│  • Check token scopes                   │
└─────────────────┬───────────────────────┘
                  │
                  ▼
┌─────────────────────────────────────────┐
│              Rate Limiting              │
│  • Per-token limits                     │
│  • Per-tool limits                      │
│  • Byte/chunk caps                      │
└─────────────────┬───────────────────────┘
                  │
                  ▼
┌─────────────────────────────────────────┐
│              Execute Tool               │
│  • Sensitivity filtering                │
│  • Audit logging                        │
└─────────────────────────────────────────┘

Connector Architecture
Connectors are plugins that implement ConnectorV1:
    Hoard Core
         │
         │ load connector
         ▼
┌─────────────────┐
│ ConnectorV1     │
│ ├── discover()  │ → Validate config
│ ├── scan()      │ → Yield entities
│ └── cleanup()   │ → Release resources
└─────────────────┘
         │
         │ EntityInput, ChunkInput
         ▼
Core Ingest Pipeline

Next Steps
- Data Model — Entity and chunk schemas
- Search — How hybrid search works
- Security — Security model details