The manual.
recon is a Rust MCP server that gives AI coding agents symbol-aware tools
in place of Read, Grep, and Glob. This
document explains what it is, how to run it, and the ideas that shape its API.
01 Introduction
Coding agents spend a surprising fraction of their context window just looking at code — reading whole files to find a single function, grepping a repo for a name that a language server already knows. recon treats that as a workflow bug, not a fact of life.
recon is a hosted MCP server. Your agent connects over MCP; indexing, watching, ranking, and search all run server-side. There is no local binary to install, no build step, no watcher process on your machine.
Under the hood it is tree-sitter for structure, SQLite + FTS5 for exact and fuzzy symbol lookup, Tantivy with a camelCase/snake_case tokenizer for structured BM25, and fff-grep for SIMD-accelerated raw text. On top of that sits an MCP server exposing twelve tools, each returning one of five canonical output shapes.
If a response can't be mapped to one of the five shapes, that's a signal the tool is doing too much. Split it, don't widen the schema.
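To make the camelCase/snake_case tokenizer mentioned above concrete, here is a minimal Python sketch of identifier splitting. It is not recon's tokenizer, which runs inside Tantivy; the name split_identifier and the regular expression are assumptions made for illustration.

```python
import re

# Hypothetical pattern: acronym runs, capitalized or lowercase words, digits.
_WORD = re.compile(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|\d+")


def split_identifier(name: str) -> list[str]:
    """Split camelCase, PascalCase, and snake_case into lowercase tokens."""
    return [m.group(0).lower() for m in _WORD.finditer(name)]


# Both spellings reduce to the same token list, so a query for one
# can match a symbol indexed under the other.
print(split_identifier("renderMap"))
print(split_identifier("render_map"))
```

Indexing the token list rather than the raw identifier is what lets a term-based ranker such as BM25 treat the two naming styles as the same words.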
02 Connecting
Because recon is hosted, there is nothing to install. You create a workspace key, register the repo you want indexed, and connect your agent.
    # 1. create a workspace key at recon.dev
    RECON_KEY=rk_live_9f2c…
    # 2. register the repo you want indexed
    POST https://api.recon.dev/v1/workspaces/acme/repos
    { "remote": "git@…:acme/web.git", "ref": "main" }
Quickstart
Point your agent at recon over MCP. Any MCP-compatible client works — Claude Code, Cursor, custom agents. Drop this into your agent's MCP config:
    {
      "mcpServers": {
        "recon": {
          "url": "https://mcp.recon.dev/v1",
          "headers": {
            "Authorization": "Bearer $RECON_KEY"
          }
        }
      }
    }
Then nudge the agent to prefer symbol-aware tools. Add this to its system prompt or agent rules:
    # agent rules
    Use code_* tools (code_outline, code_skeleton, code_find_symbol, code_search, code_repo_map) before Read/Grep/Glob when exploring code. They return structured, token-efficient results.
03 Workspace configuration
Per-workspace settings can be changed at recon.dev/settings or through the management API. They apply server-side; your agent never sees them.
    # Additional ignore patterns (on top of .gitignore)
    ignore_patterns = ["*.generated.*", "dist/"]
    # Max file size to index (default 2 MB)
    max_file_size = 2097152
    # Max search results per tool call (default 30)
    max_search_results = 30
    # Token budget for code_repo_map (default 2000)
    default_map_budget = 2000
    # Secret redaction: keep on unless you really mean it
    redact_secrets = true
    allow_sensitive = false
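As a rough picture of how ignore_patterns and max_file_size could gate indexing, here is a small Python sketch. The matching semantics are assumptions: this document does not say whether patterns match file names or full paths, so the sketch treats a trailing slash as a directory name and everything else as a glob on the file name. The helper should_index is hypothetical.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

IGNORE_PATTERNS = ["*.generated.*", "dist/"]  # values from the config above
MAX_FILE_SIZE = 2_097_152                     # bytes, from the config above


def should_index(path: str, size_bytes: int) -> bool:
    """Return True if a repo-relative path passes the size and ignore checks."""
    if size_bytes > MAX_FILE_SIZE:
        return False
    parts = PurePosixPath(path).parts
    for pattern in IGNORE_PATTERNS:
        if pattern.endswith("/"):
            # Assumed: a trailing slash ignores any directory with that name.
            if pattern.rstrip("/") in parts[:-1]:
                return False
        elif fnmatch(parts[-1], pattern):
            # Assumed: other patterns glob against the file name only.
            return False
    return True
```

For example, should_index("src/main.rs", 1200) passes, while a path under dist/ or a file over the size limit is rejected.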
04 Output shapes
Every tool response is exactly one of five shapes. This keeps the agent's parser simple and the token budget honest.
| Shape | Returned by | Contents |
|---|---|---|
| Outline | code_outline, code_list | Flat rows: kind, name, line. |
| Skeleton | code_skeleton | Signatures and doc comments; bodies collapse to …. |
| Symbol | code_read_symbol | One symbol's full source plus its callers. |
| Matches | code_search, code_find_*, code_multi_find | Ranked hits with path, line, and a trimmed snippet. |
| Map | code_repo_map | PageRank-ordered symbol overview under a token budget. |
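One way to picture the Map shape's token budget is greedy selection over ranked lines. The sketch below is an assumption, not recon's selection logic: estimate_tokens is a crude stand-in for a real tokenizer, and fit_to_budget is a hypothetical helper that takes (score, line) pairs.

```python
def estimate_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: roughly 4 characters per token."""
    return max(1, len(text) // 4)


def fit_to_budget(ranked_lines: list[tuple[float, str]], budget: int) -> list[str]:
    """Keep the highest-scored lines whose combined estimate fits the budget."""
    kept, used = [], 0
    ordered = sorted(ranked_lines, key=lambda item: item[0], reverse=True)
    for _score, line in ordered:
        cost = estimate_tokens(line)
        if used + cost > budget:
            break  # stop at the first line that would overflow the budget
        kept.append(line)
        used += cost
    return kept


lines = [
    (0.1, "fn helper()"),
    (0.9, "pub fn render_map(budget: usize) -> Map"),
    (0.5, "struct Index"),
]
print(fit_to_budget(lines, 1000))
```

Stopping at the first overflow keeps the output strictly in rank order; skipping the oversized line and continuing would pack the budget more tightly at the cost of that ordering.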
05 Search tiers
code_find_symbol and code_search fan out across five backends, picking the cheapest tier that can answer the question.
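The "cheapest tier that can answer" idea can be sketched as an ordered list of lookups that stops at the first one returning hits. This is a sequential toy, not recon's implementation: difflib stands in for the fuzzy tier, a substring scan stands in for raw text search, and the README.md entry and all function names are hypothetical.

```python
from difflib import get_close_matches

SYMBOLS = {"render_map": "crates/recon-search/src/map.rs"}          # toy symbol table
RAW_TEXT = {"README.md": "Call render_map to build the overview."}  # toy corpus


def exact(query: str) -> list[str]:
    return [SYMBOLS[query]] if query in SYMBOLS else []


def fuzzy(query: str) -> list[str]:
    # difflib stands in for a trigram prefilter plus fuzzy rescoring.
    names = get_close_matches(query, list(SYMBOLS), n=3, cutoff=0.8)
    return [SYMBOLS[name] for name in names]


def raw_text(query: str) -> list[str]:
    # A substring scan stands in for the raw-text fallback.
    return [path for path, text in RAW_TEXT.items() if query in text]


TIERS = [("exact", exact), ("fuzzy", fuzzy), ("text", raw_text)]  # cheapest first


def find(query: str) -> tuple[str, list[str]]:
    """Return (tier_name, hits) from the first tier that yields any hit."""
    for name, tier in TIERS:
        hits = tier(query)
        if hits:
            return name, hits
    return "none", []
```

A known identifier never leaves the first tier, a typo such as rendr_map falls through to the fuzzy tier, and free text only reaches the most expensive scan.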
| Tier | How it works | Latency |
|---|---|---|
| SQLite btree | Direct index hit on the symbol name. Used for known identifiers. | < 1 ms |
| FTS5 + nucleo | Trigram prefilter, nucleo rescore. Survives typos and partial names. | 2 – 8 ms |
| Tantivy · CodeTokenizer | Splits camelCase and snake_case so renderMap matches render_map. | |
| fff-grep | SIMD + memmap2 fallback when the query escapes the symbol table. | 3 – 20 ms |
| fastembed + LanceDB | Local ONNX embeddings. Feature-gated behind --features embed. | |
| Reciprocal rank fusion | Tantivy and text backends interleaved via RRF for mixed-intent queries. | varies |
06 Incremental indexing
recon is aggressive about never reparsing a file it doesn't need to. Four layers:
- Cold start. gix reads the HEAD SHA. If it matches the last indexed commit, the whole pass is skipped.
- Merkle diff. A blake3 hash tree of file contents is snapshotted per index. On HEAD change, only the diff is reparsed.
- Full index. First run or missing snapshot falls back to a rayon-parallel parse with chunked bulk SQLite inserts (500 files/tx). Zed (80K symbols): 19 s. Rust compiler (320K symbols): 53 s.
- Live watcher. notify-debouncer-full flushes per-file reindex after a 250 ms quiet window. On overflow, falls back to gix status.
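The Merkle diff layer can be sketched in a few lines of Python. Two simplifications are assumed here: blake2b from the standard library stands in for blake3, and the tree is flattened to two levels (one leaf hash per file plus a root) where a fuller tree would add per-directory nodes. All function names are hypothetical.

```python
import hashlib


def leaf_hash(content: bytes) -> str:
    # blake2b stands in for blake3, which is not in the standard library.
    return hashlib.blake2b(content, digest_size=16).hexdigest()


def snapshot(files: dict[str, bytes]) -> dict[str, str]:
    """Map each path to the hash of its contents."""
    return {path: leaf_hash(data) for path, data in files.items()}


def root_hash(snap: dict[str, str]) -> str:
    """Combine leaf hashes in sorted path order into a single root."""
    h = hashlib.blake2b(digest_size=16)
    for path in sorted(snap):
        h.update(path.encode() + b"\0" + snap[path].encode() + b"\n")
    return h.hexdigest()


def diff(old: dict[str, str], new: dict[str, str]) -> tuple[set[str], set[str]]:
    """Return (paths to reparse, paths to drop from the index)."""
    if root_hash(old) == root_hash(new):
        return set(), set()  # roots match, so no per-file comparison is needed
    changed = {path for path, digest in new.items() if old.get(path) != digest}
    removed = set(old) - set(new)
    return changed, removed


old = snapshot({"a.rs": b"fn a() {}", "b.rs": b"fn b() {}"})
new = snapshot({"a.rs": b"fn a() {}", "b.rs": b"fn b() { 1 }", "c.rs": b""})
print(diff(old, new))
```

Only b.rs and c.rs come back for reparsing; a.rs is untouched because its leaf hash did not change.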
07 Filter DSL
All search tools accept an optional filter argument. The grammar is the same one used across the fff- family:
| Example | Effect |
|---|---|
| *.rs | Extension filter. |
| type:rust | Language match (independent of extension). |
| status:modified | Only git-modified files. |
| !test | Exclude paths containing test. |
| /src/ | Path-segment match. |
| *.rs status:modified !test | Composes with implicit AND. |
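The table above is enough to sketch a toy evaluator for the filter grammar. This is an illustration under assumptions, not the fff- family's parser: the FileInfo record and its field names are hypothetical, and only the forms shown in the table are handled.

```python
from dataclasses import dataclass
from fnmatch import fnmatch


@dataclass
class FileInfo:  # hypothetical record; field names are assumptions
    path: str
    lang: str
    status: str  # e.g. "modified" or "clean"


def parse_filter(expr: str):
    """Turn a filter string into a list of predicates over FileInfo."""
    preds = []
    for tok in expr.split():
        if tok.startswith("!"):
            preds.append(lambda f, s=tok[1:]: s not in f.path)
        elif tok.startswith("type:"):
            preds.append(lambda f, v=tok[5:]: f.lang == v)
        elif tok.startswith("status:"):
            preds.append(lambda f, v=tok[7:]: f.status == v)
        elif tok.startswith("/") and tok.endswith("/"):
            # Prefix a slash so a leading segment such as src/ also matches.
            preds.append(lambda f, seg=tok: seg in "/" + f.path)
        else:
            # Anything else is treated as a glob on the file name.
            preds.append(lambda f, pat=tok: fnmatch(f.path.rsplit("/", 1)[-1], pat))
    return preds


def matches(expr: str, f: FileInfo) -> bool:
    """Whitespace-separated terms compose with implicit AND."""
    return all(pred(f) for pred in parse_filter(expr))


src = FileInfo("crates/recon-search/src/map.rs", "rust", "modified")
tst = FileInfo("crates/recon-search/tests/map_test.rs", "rust", "modified")
print(matches("*.rs status:modified !test", src))
print(matches("*.rs status:modified !test", tst))
```

Each term becomes one predicate, so adding a new term kind means adding one branch rather than changing how terms combine.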
08 The 12 tools
Every tool description is under 2 KB (Claude Code truncates above this) and maps to exactly one output shape.
| Tool | Replaces | Shape | What it does |
|---|---|---|---|
| code_outline | Read | Outline | One line per symbol in a file. |
| code_skeleton | Read | Skeleton | Signatures + docs; bodies as …. |
| code_read_symbol | Read | Symbol | Single symbol + its callers. |
| code_find_symbol | Grep | Matches | Three-tier exact → BM25 → fuzzy. |
| code_find_refs | Grep | Matches | Ref count + top-k call sites. |
| code_search | Grep | Matches | Exact, regex, hybrid, semantic. |
| code_list | Glob | Outline | Files with their top symbols. |
| code_repo_map | none | Map | PageRank overview under a budget. |
| code_find_strings | none | Matches | Literals and comments only. |
| code_multi_find | none | Matches | Fan-out multi-pattern search. |
| code_reindex | none | none | Agent-triggered reindex. |
| code_stats | none | none | Index health report. |
Example: code_read_symbol
    // request
    {
      "tool": "code_read_symbol",
      "path": "crates/recon-search/src/map.rs",
      "symbol": "render_map"
    }
    // response · Symbol shape
    {
      "kind": "fn",
      "name": "render_map",
      "signature": "pub fn render_map(budget: usize) -> Map",
      "doc": "Render a tiktoken-budgeted repo overview.",
      "body": "…",
      "callers": [
        { "path": "crates/recon-server/src/tools/repo_map.rs", "line": 84 },
        { "path": "crates/recon-cli/src/commands/map.rs", "line": 37 }
      ]
    }
09 Security
- No cloud embeddings by default. The embed feature is opt-in and runs ONNX locally.
- Secret redaction runs on every response: 13 regex patterns, PEM blocks, and blocked paths.
- Path traversal guard. Every tool call canonicalizes and prefix-checks against the repo root.
- Sensitive file gate. .env, .pem, .key are skipped unless allow_sensitive = true.
- Stdout hygiene. All logging goes to stderr; a CI test spawns the server and rejects any stray println!.
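The path traversal guard in the list above follows a common pattern: canonicalize first, then check containment. Here is a minimal Python sketch of that pattern; resolve_in_repo is a hypothetical name and recon's own check is not shown in this document.

```python
import os


def resolve_in_repo(repo_root: str, requested: str) -> str:
    """Canonicalize `requested` under `repo_root`; raise if it escapes."""
    root = os.path.realpath(repo_root)
    # realpath collapses ".." segments and resolves symlinks before the check.
    target = os.path.realpath(os.path.join(root, requested))
    # commonpath compares whole path components, so /repo-evil does not
    # pass as a child of /repo the way a plain startswith check would.
    if os.path.commonpath([root, target]) != root:
        raise PermissionError(f"path escapes repo root: {requested}")
    return target
```

Checking after canonicalization matters: a request such as ../outside.txt only reveals that it leaves the root once the ".." has been resolved.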
10 Performance
Benchmarked on real codebases. Release build, warm cache, mimalloc, fat LTO.
Query latency
| Tool | Zed (80K sym) | Rust compiler (320K sym) |
|---|---|---|
| stats | 10 ms | 29 ms |
| find | 8 ms | 8 ms |
| search | 11 ms | 33 ms |
| outline | 14 ms | 13 ms |
| skeleton | 11 ms | 11 ms |
| refs | 8 ms | 12 ms |
| map (cached) | 8 ms | 19 ms |
Indexing
| Repo | Files | Symbols | Cold index | Index size |
|---|---|---|---|---|
| Zed (1.3M LOC) | 1,780 | 80,427 | 19 s | 96 MB |
| Rust compiler (3.8M LOC) | 36,030 | 320,161 | 53 s | 291 MB |
Token reduction
| Scenario | Read/Grep/Glob | recon | Reduction |
|---|---|---|---|
| Read one function | ~23,838 tok | ~111 tok | 215× |
| Find a symbol | ~17,500 tok | ~226 tok | 77× |
| Repo orientation | ~52,500 tok | ~2,170 tok | 24× |
| Find references | ~15,000 tok | ~638 tok | 24× |
| Typical bug-fix task | ~109,000 tok | ~3,100 tok | 35× |
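The Reduction column is the ratio of the two token counts, rounded to the nearest whole multiple. The short check below recomputes it from the figures in the table; the reduction helper is only illustrative.

```python
# (scenario, tokens with Read/Grep/Glob, tokens with recon), from the table above
ROWS = [
    ("Read one function", 23_838, 111),
    ("Find a symbol", 17_500, 226),
    ("Repo orientation", 52_500, 2_170),
    ("Find references", 15_000, 638),
    ("Typical bug-fix task", 109_000, 3_100),
]


def reduction(before: int, after: int) -> int:
    """Reduction factor rounded to the nearest whole multiple."""
    return round(before / after)


for name, before, after in ROWS:
    print(f"{name}: {reduction(before, after)}x")
```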
System
| Metric | Value |
|---|---|
| Binary size (release, stripped) | 24 MB |
| Languages | 9 (Rust, Python, TS, TSX, JS, Go, Java, C, C++) |
| Tests | 107 across 19 suites |
| Allocator | mimalloc |
| Mutex | parking_lot (not tokio::sync) |
| Hash maps | ahash::AHashMap in hot paths |
11 ADRs
- 000 — Symbol-first architecture.
- 001 — Text-search backend (fff-grep vs. ripgrep, why fff).
- 002 — Output-shape discipline (the five canonical shapes).
- 003 — Stdio transport hygiene (stderr-only logging, CI guard).
12 Changelog
v0.1.0 — Initial release.
- Hosted MCP server with API key auth and multi-tenant DashMap routing
- Streamable HTTP transport (MCP spec 2025-11-25) + stdio
- 12 tools, 9 languages, 5 output shapes
- 3-tier symbol search: exact SQLite → Tantivy BM25 → FTS5 trigram + nucleo fuzzy
- Tantivy-first text search with grep fallback
- PageRank repo map with early convergence, cached in SQLite
- Merkle incremental indexing with chunked bulk inserts (500 files/tx)
- Secret redaction on all code-returning tool responses
- mimalloc, parking_lot::Mutex, ahash::AHashMap, fat LTO
- 107 tests, zero unwrap in production code, #[deny(missing_docs)] on all crates
- rr: public CLI client for remote servers
- Tested on Zed (1.3M LOC) and Rust compiler (3.8M LOC)
13 License
recon is proprietary software. © 2026. All rights reserved. Use is governed by the recon Terms of Service.