Recall v2 Scope Investigation¶
URL: https://mkdocs.justinsforge.com/memory/investigations/recall-v2-scope-2026-05-04/
Date: 2026-05-04 Author: [Claude Code] Source handoff: recall-v2-three-layer-retrieval-2026-05-03
Current State¶
Implementation is the v1 single-shot design described in the handoff. Nothing from the v2 three-layer plan has landed.
| Component | Path | LOC | Status |
|---|---|---|---|
| Indexer | forge/scripts/forge_search_index.py |
263 | v1 |
| Query CLI | forge/scripts/forge_search_query.py |
102 | v1 |
| Skill | forge/.claude/skills/recall/SKILL.md |
n/a | v1, supports <query>, -k, --json only |
| Index DB | forge/data/search/index.db |
n/a | sqlite + sqlite-vec, BGE-small 384-dim |
Reference doc memory/general/reference_semantic_search.md cites stale paths (scripts/search/index.py, scripts/search/query.py); actual files are forge_search_*.py flat in scripts/. Reference doc is 10 days old and needs an update either way.
Index DB schema (verified live)¶
files(path PK, hash, mtime, indexed_at)
chunks(id PK AUTOINCREMENT, path FK, heading, chunk_idx, body)
chunk_vecs vec0(embedding float[384]) -- sqlite-vec virtual
No chunk_id (stable opaque), no summary column. Chunks are addressed by autoincrement id + (path, chunk_idx) only.
Gap Analysis vs v2 Design¶
| v2 Requirement | Present? | Notes |
|---|---|---|
| Stable opaque chunk IDs (4-hex) | NO | Need new chunk_id column + sha1(path+chunk_idx)[:4] with collision check |
Per-chunk summary column |
NO | Add column; cheap version = first 80 chars to word boundary |
--mode index compact output |
NO | Default today is full chunk body |
--mode get --ids ... |
NO | Net new |
--mode timeline --id --window N |
NO | Net new; needs ordering by chunk_idx within file (already there) and date for cross-file (mtime / parsed date in path) |
--full escape hatch |
NO | Equivalent to today's default; trivial flag |
FORGE_RECALL_V2=1 gate |
NO | Net new |
| Schema version bump + auto-rebuild on upgrade | NO | Indexer has no schema_version table today |
Token-count bench harness data/recall-bench/ |
NO | Net new |
Bottom line: zero v2 work has been started. v1 is healthy and correct, just expensive in tokens.
Riskiest Unknown¶
Timeline semantics across files. Layer-2 timeline is well-defined inside one file (chunk_idx ordering). Across files "5 entries before/after" requires a global temporal axis. Options:
- Use file mtime, fast but bad for handoffs/daily logs whose semantic date is in the filename, not mtime.
- Parse YYYY-MM-DD from filename (handoffs, daily, investigations all carry it), fall back to mtime.
- Per-file timeline only (collapse cross-file ambiguity by scoping window to same file).
Handoff doc reads as if (3) is intended ("from the same source file"), which sidesteps the risk. Confirm with Justin before coding; if cross-file timeline is wanted, option 2 is the right call but doubles indexer work. Default assumption: same-file only, matches handoff.
Secondary risk: full re-index time. Reference doc claims 2-3 min for ~100 files / 1500 chunks; handoff claims ~10 min. Real corpus is bigger now. Worth measuring before committing to a "rebuild on first run" deploy story; if it's >5 min, gate behind explicit --rebuild instead of auto.
Build Sequence¶
- Schema migration in indexer (~45 min)
- Add
chunk_id TEXT UNIQUE,summary TEXTtochunks. - Add
meta(key PK, value)table; writeschema_version=2. - On startup, if missing or v1 → drop
chunks+chunk_vecs, full re-embed. - Indexer chunk-ID + summary generation (~30 min)
chunk_id = sha1(f"{path}:{chunk_idx}").hexdigest()[:4]; on collision extend to 5, 6, ... chars.summary = body[:80]truncated at last whitespace, ellipsized.- Query CLI three-mode refactor (~60 min)
- Default mode: compact lines
[id] path date | summary score=0.NN. --get a3f2,b8c1: lookup by chunk_id, print full body (current default behavior).--timeline <id> --window N: select same-path chunks ordered by chunk_idx, return id ± N.--full: alias for current default behavior on a query.- Keep
--json,-k. - Skill rewrite (~20 min)
- Update
forge/.claude/skills/recall/SKILL.mdwith flag set + decision flowchart (when to use each layer). - Bench harness (~30 min)
scripts/forge_recall_bench.pyruns N canned queries v1 vs v2, logs token counts todata/recall-bench/<date>.jsonl.- Docs + reference update (~15 min)
- Refresh
memory/general/reference_semantic_search.md(also fix the stalescripts/search/...paths while there). - Dogfood week behind
FORGE_RECALL_V2=1, then flip default.
Effort Estimate¶
| Phase | Hours |
|---|---|
| Schema + indexer | 1.25 |
| CLI refactor | 1.0 |
| Skill rewrite | 0.33 |
| Bench harness | 0.5 |
| Reference doc fix | 0.25 |
| Total active | ~3.3 |
Aligns with handoff's ~3h estimate. Dogfood week is passive.
Recommendation¶
Run /feature-plan recall v2 three-layer retrieval per handoff pickup checklist before writing code, primarily to lock the timeline-scope decision (same-file only vs cross-file) and confirm the cheap summary is the v2.0 target (Sonnet summaries deferred). Everything else is mechanical.