
Architecture

This page explains Hoard’s technical architecture and design decisions.

High-Level Overview

┌──────────────────────────────────────────────────────────────────┐
│                            HOARD CORE                            │
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌──────────────┐ │
│ │  Index  │ │ Search  │ │ Memory  │ │  Sync   │ │ Orchestrator │ │
│ └─────────┘ └─────────┘ └─────────┘ └─────────┘ └──────────────┘ │
└──────────────────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────────┐
│                    HTTP MCP Server (default)                     │
│                    http://127.0.0.1:19850/mcp                    │
│                    Auth: Bearer $HOARD_TOKEN                     │
└──────────────────────────────────────────────────────────────────┘
           │                     │                     │
           ▼                     ▼                     ▼
     ┌────────────┐        ┌────────────┐        ┌────────────┐
     │ Claude Code│        │   Codex    │        │  OpenClaw  │
     └────────────┘        └────────────┘        └────────────┘

Technology Stack

Component     Technology
Language      Python 3.11+
Database      SQLite + FTS5
Vectors       SQLite brute-force (built-in)
Embeddings    sentence-transformers
CLI           Click + Rich
HTTP Server   Python http.server

Project Structure

hoard/
├── core/
│ ├── db/ # Database operations
│ ├── ingest/ # Content processing
│ ├── search/ # Search engine
│ ├── orchestrator/ # Agents, tasks, workflows
│ ├── mcp/ # MCP protocol handling
│ └── security/ # Auth & rate limiting
├── sdk/
│ ├── base.py # ConnectorV1 interface
│ ├── types.py # EntityInput, ChunkInput
│ ├── chunking.py # Chunking utilities
│ └── hash.py # Content hashing
├── connectors/ # Core + community connectors
│ ├── inbox/
│ ├── local_files/
│ ├── obsidian/
│ ├── bookmarks_chrome/
│ ├── bookmarks_firefox/
│ └── notion_export/
├── cli/ # CLI commands
├── tests/
└── docs/

Core Components

Database Layer

SQLite with FTS5 (Full-Text Search) provides:

  • ACID transactions for data integrity
  • Full-text search via FTS5 triggers
  • JSON support for metadata storage
  • Zero configuration — just a file
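
As a rough illustration of how these pieces fit together, the sketch below keeps an FTS5 index in step with a base table through triggers and stores metadata as JSON text. The table, column, and trigger names are assumptions made for this example, not Hoard's actual schema, and it assumes a SQLite build with FTS5 and the JSON functions enabled.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")  # a file path works the same way
conn.executescript("""
CREATE TABLE chunks (
    id INTEGER PRIMARY KEY,
    content TEXT NOT NULL,
    metadata TEXT NOT NULL DEFAULT '{}'  -- JSON stored as text
);
-- External-content FTS5 index over chunks.content (hypothetical name)
CREATE VIRTUAL TABLE chunks_fts USING fts5(
    content, content='chunks', content_rowid='id'
);
-- Triggers mirror inserts and deletes into the index
CREATE TRIGGER chunks_ai AFTER INSERT ON chunks BEGIN
    INSERT INTO chunks_fts(rowid, content) VALUES (new.id, new.content);
END;
CREATE TRIGGER chunks_ad AFTER DELETE ON chunks BEGIN
    INSERT INTO chunks_fts(chunks_fts, rowid, content)
    VALUES ('delete', old.id, old.content);
END;
""")

with conn:  # both inserts commit together or not at all
    conn.execute(
        "INSERT INTO chunks (content, metadata) VALUES (?, ?)",
        ("SQLite stores everything in a single file",
         json.dumps({"source": "notes"})),
    )
    conn.execute(
        "INSERT INTO chunks (content, metadata) VALUES (?, ?)",
        ("Vector search is optional", json.dumps({"source": "docs"})),
    )

# Keyword query ranked by BM25, reading one field out of the JSON metadata
rows = conn.execute(
    "SELECT c.id, json_extract(c.metadata, '$.source') "
    "FROM chunks_fts JOIN chunks c ON c.id = chunks_fts.rowid "
    "WHERE chunks_fts MATCH ? ORDER BY bm25(chunks_fts)",
    ("sqlite",),
).fetchall()
```

An external-content index avoids storing the text twice, at the cost of needing triggers like these to keep it consistent.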

Search Engine

Hybrid search combining:

  1. BM25 — Traditional keyword matching via FTS5
  2. Vector Search — Semantic similarity (optional)
  3. Reciprocal Rank Fusion — Merges both rankings
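
The fusion step can be sketched in a few lines. This is a generic Reciprocal Rank Fusion, not Hoard's own code: the function name, the chunk IDs, and the constant k=60 (a commonly used default) are assumptions for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of IDs into one list.

    Each ID scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Ties are broken by ID for determinism.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical result lists from the keyword and vector searches
bm25_hits = ["c3", "c1", "c7"]
vector_hits = ["c1", "c9", "c3"]
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
```

Because only ranks are used, the BM25 and similarity scores never need to be normalized against each other, and the same function works when vector search is disabled and only one list is passed.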

Orchestrator (Beta)

Multi-agent coordination layer:

  • Agents register and advertise capabilities
  • Tasks are pulled and claimed by agents
  • Workflows chain tasks into multi-step automation
  • Artifacts capture outputs and events track state changes
  • Cost ledger tracks usage and budgets
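
One way a pull-and-claim model can work on top of SQLite is sketched below. The tasks table, its columns, and claim_next are hypothetical names for this example and are not taken from Hoard's orchestrator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tasks (id INTEGER PRIMARY KEY, capability TEXT, "
    "status TEXT DEFAULT 'pending', claimed_by TEXT)"
)
conn.executemany(
    "INSERT INTO tasks (capability) VALUES (?)",
    [("summarize",), ("review",)],
)


def claim_next(conn, agent_id, capabilities):
    """Claim the oldest pending task the agent can handle, or return None."""
    marks = ",".join("?" * len(capabilities))
    with conn:  # select and update commit as one transaction
        row = conn.execute(
            "SELECT id FROM tasks WHERE status = 'pending' "
            f"AND capability IN ({marks}) ORDER BY id LIMIT 1",
            list(capabilities),
        ).fetchone()
        if row is None:
            return None
        # The status guard turns the update into a no-op if the task
        # was claimed by someone else in the meantime.
        cur = conn.execute(
            "UPDATE tasks SET status = 'claimed', claimed_by = ? "
            "WHERE id = ? AND status = 'pending'",
            (agent_id, row[0]),
        )
        return row[0] if cur.rowcount == 1 else None
```

An agent that advertises only the review capability would receive task 2 here and nothing on its next call, while task 1 stays pending for an agent that can summarize.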

Sync Engine

Sync runs on demand via hoard sync. While the server is running, an optional background schedule and file watcher can also trigger it. Each sync relies on:

  • Full scan — Re-scans all configured connectors
  • Change detection — Content hash comparison
  • Tombstoning — Marks deleted content as tombstoned

Background sync runs every sync.interval_minutes minutes, and the watcher triggers an incremental sync when files change. A lock file prevents concurrent syncs.
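
Hash-based change detection and tombstoning can be sketched as a comparison between what the index already holds and what a scan returns. The function names and the choice of SHA-256 are assumptions for this example. The source only says that content hashes are compared.

```python
import hashlib


def content_hash(text: str) -> str:
    """Hash content so changes can be detected without storing old text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_sync(indexed: dict[str, str], scanned: dict[str, str]) -> dict:
    """Compare stored hashes with freshly scanned content.

    indexed maps entity ID -> stored hash.
    scanned maps entity ID -> current content from the connector.
    """
    plan = {"add": [], "update": [], "unchanged": [], "tombstone": []}
    for entity_id, text in scanned.items():
        digest = content_hash(text)
        if entity_id not in indexed:
            plan["add"].append(entity_id)
        elif indexed[entity_id] != digest:
            plan["update"].append(entity_id)
        else:
            plan["unchanged"].append(entity_id)
    # Anything indexed but no longer seen is marked, not deleted outright
    plan["tombstone"] = sorted(set(indexed) - set(scanned))
    return plan


indexed = {"a": content_hash("one"), "b": content_hash("two"), "c": content_hash("x")}
scanned = {"a": "one", "b": "two!", "d": "new"}
plan = plan_sync(indexed, scanned)
```

Unchanged entities can skip chunking and embedding entirely, which is what keeps a full scan affordable.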

MCP Server

HTTP-based Model Context Protocol server:

  • JSON-RPC 2.0 over HTTP
  • Bearer token auth
  • Rate limiting per tool and per token
  • Audit logging for all requests
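
A minimal sketch of what a request to the server might look like, using only the standard library. The build_mcp_request helper is hypothetical, tools/list is a method defined by the MCP specification, and a real client would normally perform the MCP initialize handshake before calling it.

```python
import json
import os
import urllib.request


def build_mcp_request(method, params=None, request_id=1,
                      url="http://127.0.0.1:19850/mcp", token=None):
    """Build (but do not send) a JSON-RPC 2.0 POST with Bearer auth."""
    if token is None:
        token = os.environ.get("HOARD_TOKEN", "")
    payload = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        payload["params"] = params
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


req = build_mcp_request("tools/list", token="example-token")
# Sending requires a running server:
#   with urllib.request.urlopen(req) as resp:
#       reply = json.load(resp)
```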

Design Principles

Local-First

All data stays on your machine:

  • No cloud sync
  • No external dependencies for basic operation
  • Works offline

Pre-Indexing

Data is indexed before agents need it:

  • Fast search responses
  • Consistent results
  • No real-time API calls during queries

Chunk-Level Retrieval

Documents are split into semantic chunks, which provides:

  • Precise citations
  • Stable chunk IDs
  • Optimal context for LLMs
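
To make the idea of stable chunk IDs concrete, the sketch below packs paragraphs into chunks and derives each ID from the entity ID and the chunk text, so an unchanged chunk keeps its ID across re-syncs. Paragraph packing stands in for semantic chunking here, and both function names, the size limit, and the ID scheme are assumptions for this example.

```python
import hashlib


def chunk_paragraphs(text, max_chars=200):
    """Greedily pack paragraphs into chunks of at most max_chars.

    A single paragraph longer than max_chars is kept whole.
    """
    chunks, current = [], ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def chunk_id(entity_id, chunk_text):
    """Derive a deterministic ID from the entity ID and chunk content."""
    raw = f"{entity_id}\x00{chunk_text}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()[:16]


doc = "Alpha paragraph.\n\nBeta paragraph.\n\nGamma paragraph."
chunks = chunk_paragraphs(doc, max_chars=35)
ids = [chunk_id("note-1", c) for c in chunks]
```

Content-derived IDs mean that a citation keeps pointing at the same text, while an edited chunk naturally receives a new ID.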

Multi-Agent Support

One data layer for all agents:

  • Same interface for Claude Code, Codex, OpenClaw
  • Consistent memory across agents
  • No re-indexing when switching tools

Transport Options

HTTP (Default)

# Serves http://127.0.0.1:19850/mcp
hoard serve

Benefits:

  • Survives client restarts
  • Can be daemonized
  • Works with all MCP clients

Stdio (Alternative)

hoard mcp stdio

For MCP clients that require stdio transport.

Security Architecture

┌─────────────────────────────────────────┐
│               MCP Request               │
└─────────────────┬───────────────────────┘
                  │
                  ▼
┌─────────────────────────────────────────┐
│            Token Validation             │
│  • Verify Bearer token                  │
│  • Check token scopes                   │
└─────────────────┬───────────────────────┘
                  │
                  ▼
┌─────────────────────────────────────────┐
│              Rate Limiting              │
│  • Per-token limits                     │
│  • Per-tool limits                      │
│  • Byte/chunk caps                      │
└─────────────────┬───────────────────────┘
                  │
                  ▼
┌─────────────────────────────────────────┐
│              Execute Tool               │
│  • Sensitivity filtering                │
│  • Audit logging                        │
└─────────────────────────────────────────┘
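
The first two stages of this pipeline can be sketched as follows. The class and function names, the sliding-window strategy, and the limits are all assumptions for illustration. The source only states that tokens are verified and that limits apply per token and per tool.

```python
import hmac
import time
from collections import defaultdict, deque


def bearer_matches(header_value, expected_token):
    """Check an Authorization header value using a constant-time comparison."""
    scheme, _, presented = header_value.partition(" ")
    return scheme == "Bearer" and hmac.compare_digest(
        presented.encode("utf-8"), expected_token.encode("utf-8")
    )


class SlidingWindowLimiter:
    """Allow at most `limit` calls per key within `window_seconds`."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self.calls = defaultdict(deque)

    def allow(self, key):
        now = self.clock()
        recent = self.calls[key]
        # Drop timestamps that have aged out of the window
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.limit:
            return False
        recent.append(now)
        return True


def check_request(token, tool, per_token, per_tool):
    """Apply the per-token limit, then the per-(token, tool) limit.

    In this sketch a call rejected by the per-tool limit still counts
    against the per-token budget.
    """
    return per_token.allow(token) and per_tool.allow((token, tool))
```

Passing the clock in as a parameter keeps the limiter easy to test without sleeping.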

Connector Architecture

Connectors are plugins that implement ConnectorV1:

Hoard Core
    │ load connector
    ▼
┌─────────────────┐
│ ConnectorV1     │
│ ├── discover()  │ → Validate config
│ ├── scan()      │ → Yield entities
│ └── cleanup()   │ → Release resources
└─────────────────┘
    │ EntityInput, ChunkInput
    ▼
Core Ingest Pipeline
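
The lifecycle in the diagram can be sketched with stand-in types. The method names discover, scan, and cleanup and the type names come from the diagram, but every signature and field below is an assumption. The real definitions live in sdk/base.py and sdk/types.py and may differ.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Iterator


@dataclass
class ChunkInput:  # stand-in; fields are assumed
    content: str


@dataclass
class EntityInput:  # stand-in; fields are assumed
    source_id: str
    title: str
    chunks: list[ChunkInput] = field(default_factory=list)


class ConnectorV1(ABC):  # stand-in for the interface in sdk/base.py
    @abstractmethod
    def discover(self, config: dict) -> bool:
        """Return True if the config is usable."""

    @abstractmethod
    def scan(self, config: dict) -> Iterator[EntityInput]:
        """Yield one EntityInput per item found."""

    def cleanup(self) -> None:
        """Release any resources held by the connector."""


class InMemoryNotes(ConnectorV1):
    """Toy connector that reads notes from the config itself."""

    def discover(self, config):
        return isinstance(config.get("notes"), dict)

    def scan(self, config):
        for note_id, body in config["notes"].items():
            title = body.split("\n", 1)[0]
            yield EntityInput(note_id, title, [ChunkInput(body)])


def run_connector(connector, config):
    """Validate, scan, and always clean up, mirroring the diagram."""
    if not connector.discover(config):
        raise ValueError("invalid connector config")
    try:
        return list(connector.scan(config))
    finally:
        connector.cleanup()


entities = run_connector(InMemoryNotes(), {"notes": {"n1": "Title\nbody"}})
```

Yielding entities from scan() lets the ingest pipeline process large sources one item at a time instead of loading them all up front.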

Next Steps