Research: Superpowers vs Context Mode vs ClaudeMem vs forge stack
URL: https://mkdocs.justinsforge.com/memory/research/superpowers-context-mode-claudemem-vs-forge-stack-2026-05-03/
Date: 2026-05-03 Depth: deep Model: sonnet
TL;DR
- Superpowers is a methodology plugin, not an infra tool. Its brainstorm-plan-subagent-TDD pipeline duplicates patterns Justin already enforces via doctrine (robust-over-quick, /spawn, eval harness) but adds enforcement gates and git-worktree discipline that forge currently lacks for greenfield feature work. Steal-design-only.
- Context Mode is genuinely new capability. Forge has no tool-output compression layer. Hook-intercepted sandbox execution that prevents raw git log / WebFetch / large file reads from inflating context is a net-new win for long coordinator-bot dev sessions. Install.
- ClaudeMem (thedotmack/claude-mem, 71k stars, npm:claude-mem v12.5) is the canonical tool. It runs a persistent background Bun worker, Chroma vector DB, and 5 lifecycle hooks to capture, compress, and re-inject every session observation. Forge already has a better-designed equivalent (auto-memory + auto-dream + /recall). The 3-layer retrieval API and "search index first, details by ID" pattern are design pieces worth stealing.
- No tool conflicts with forge security rules or causes data-exfiltration risk; all three are local-first.
Findings
1. Superpowers
Repo: obra/superpowers [1], on Anthropic official plugin marketplace and obra/superpowers-marketplace [2]. Stars: 177k [1]. Version: 5.0.7. Install:
or, via obra/superpowers-marketplace:
/plugin marketplace add obra/superpowers-marketplace
/plugin install superpowers@superpowers-marketplace
Mechanism. Superpowers is a pure-skills plugin with a single SessionStart hook. The hook injects the using-superpowers skill content (~5k bytes) as <EXTREMELY_IMPORTANT> context and registers a Skill tool that lazy-loads any of 14 SKILL.md files on demand [3]. Skills are not loaded at startup; only the using-superpowers bootstrap text is [3]. The model then decides which skill to invoke based on the bootstrap's decision flowchart.
The core workflow is:
1. brainstorming skill: socratic design loop, blocks all code until user approves a spec saved to docs/superpowers/specs/ [4].
2. writing-plans skill: breaks spec into 2-5 min bite-sized tasks with exact file paths, TDD steps, commit points [5].
3. subagent-driven-development or executing-plans skill: dispatches fresh subagents per task with two-stage review (spec compliance, then code quality) [6].
4. test-driven-development skill: RED-GREEN-REFACTOR cycle, hard gate against writing production code before a failing test exists [7].
5. verification-before-completion, requesting-code-review, finishing-a-development-branch wrap the cycle.
Token cost reality. Only the using-superpowers bootstrap (~5.4k bytes, ~1.4k tokens) is injected at SessionStart [3]. All 14 skills total ~108k bytes (~27-36k tokens), but they are lazy-loaded by the Skill tool call only when triggered. In practice: brainstorm-only sessions incur ~2 additional skill reads; full feature cycles (brainstorm + plan + subagent + TDD) accumulate 5-8 lazy reads totaling 15-25k tokens of additional context, spread across the life of the session.
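The byte-to-token figures above can be reproduced with a rough conversion. This is a minimal sketch that assumes 3 to 4 bytes per token for English markdown; that ratio is a heuristic, not a tokenizer measurement, and `token_range` is a hypothetical helper name.

```python
# Rough byte-to-token conversion. The 3-4 bytes/token ratio is an assumed
# heuristic for English markdown, not a measured tokenizer result.
def token_range(num_bytes: int, bytes_per_token=(4.0, 3.0)) -> tuple[int, int]:
    """Return (low, high) token estimates for a payload of num_bytes."""
    low = round(num_bytes / bytes_per_token[0])
    high = round(num_bytes / bytes_per_token[1])
    return low, high

bootstrap = token_range(5_400)     # using-superpowers bootstrap text
all_skills = token_range(108_000)  # all 14 SKILL.md files combined

print(bootstrap)   # (1350, 1800)
print(all_skills)  # (27000, 36000)
```

The ~1.4k-token bootstrap figure corresponds to the 4 bytes/token end of the range; the 27-36k figure for all skills spans both ends.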
What it enforces that forge doctrine does not have wired-in. Superpowers has hard-gate blocks ("Do NOT invoke any implementation skill until you have presented a design and the user has approved it" [4]). Forge's feedback_robust_over_quick.md is a text instruction, not a pre-code gate. The using-git-worktrees skill automatically creates an isolated branch and worktree before any implementation [8]. Forge has forge_worktree.sh but no automatic trigger.
Forge overlap. The /spawn pattern [9] replaces subagent-driven-development well for non-code tasks. The eval harness [10] catches post-hoc doctrine violations. The dispatcher worker pattern covers parallel agent fanning. For pure greenfield coding feature work, Superpowers adds enforced process that forge doctrine recommends but does not gate.
2. Context Mode
Two distinct repos exist:
- kianwoon/context-mode [11]: 4 MCP tools + 2 hooks, simpler implementation, 1 star.
- scottconverse/context-mode [12]: full port of mksglu/context-mode, 9 MCP tools + 6 lifecycle hooks, 3-stage compression pipeline, self-learning, session continuity, 1 star but updated 2026-05-03 and notably more mature.
The question's framing (sandbox interception, Bash/WebFetch/MCP, local SQLite, /contextmode:ctx-stats) maps to the scottconverse variant. Both share the same upstream architecture (mksglu/context-mode, Elastic License 2.0 [12]).
Install (scottconverse, recommended):
or in Claude Code:
/plugin marketplace add scottconverse/context-mode
/plugin install context-mode@scottconverse-context-mode
/context-mode:ctx-doctor
Install (kianwoon, simpler):
Requires Node.js 22+.
Mechanism. Context Mode registers 6 lifecycle hooks [12]:
- PreToolUse: intercepts 18 patterns (Bash git-log/diff/test, WebFetch, Read on large files, curl/wget, build tools) and redirects to sandboxed ctx_execute / ctx_batch_execute / ctx_fetch_and_index. Safe invocations (piped, bounded) pass through unchanged [12].
- PostToolUse: captures all tool events into per-session SQLite.
- PreCompact: saves session snapshot before context compaction.
- SessionStart + UserPromptSubmit: inject routing block and session guide each turn.
- SubagentStop: cleanup.
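The PreToolUse routing decision described in the list above can be sketched as a pure function. This is an illustration only: the three patterns, the `LARGE_FILE_BYTES` threshold, the `size_bytes` field, and the choice of `ctx_execute` as the target for large reads are all assumptions, and the plugin's actual 18-rule table is not reproduced here.

```python
import re

# Hypothetical patterns for illustration; not the plugin's real routing table.
HEAVY_BASH = [
    re.compile(r"^\s*git\s+(log|diff)\b"),
    re.compile(r"^\s*(pytest|npm\s+test|cargo\s+test)\b"),
    re.compile(r"^\s*(curl|wget)\b"),
]
LARGE_FILE_BYTES = 50_000  # assumed threshold for a "large" Read

def is_bounded(command: str) -> bool:
    """Treat piped or explicitly limited commands as safe to pass through."""
    return re.search(r"(\||\s-n\s*\d+\b|--max-count=\d+)", command) is not None

def route(tool_name: str, tool_input: dict) -> str:
    """Return 'pass' or the name of the sandboxed tool to redirect to."""
    if tool_name == "Bash":
        command = tool_input.get("command", "")
        if is_bounded(command):
            return "pass"
        if any(p.search(command) for p in HEAVY_BASH):
            return "ctx_execute"
    elif tool_name == "WebFetch":
        return "ctx_fetch_and_index"
    elif tool_name == "Read":
        # size_bytes is a hypothetical field; a real hook would stat the file.
        if tool_input.get("size_bytes", 0) > LARGE_FILE_BYTES:
            return "ctx_execute"
    return "pass"

print(route("Bash", {"command": "git log"}))       # ctx_execute
print(route("Bash", {"command": "git log -n 5"}))  # pass
```

Keeping the decision in a pure function separates the routing rules from the hook's I/O, which makes the rules easy to unit-test.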
SQLite event tracking schema. The knowledge base uses SQLite FTS5 with two virtual tables: a Porter Stemmer FTS5 table (BM25, title fields weighted 5x) and a Trigram FTS5 table (substring matching). Results merge via Reciprocal Rank Fusion (K=60) with proximity reranking. Per-session DB is context-mode-{pid}.db in tmpdir, deleted on session end. TTL eviction at 60 minutes per entry [12].
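The merge step can be illustrated with a minimal Reciprocal Rank Fusion sketch using K=60, where each document scores the sum of 1/(K + rank) across the input rankings. The two hit lists are hypothetical stand-ins for the Porter and trigram results, and the proximity reranking mentioned above is omitted.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked lists: each doc scores sum(1 / (k + rank)) over the lists it appears in."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by doc id for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

porter_hits = ["doc_a", "doc_b", "doc_c"]   # hypothetical BM25 (Porter) results
trigram_hits = ["doc_b", "doc_d", "doc_a"]  # hypothetical trigram results

fused = reciprocal_rank_fusion([porter_hits, trigram_hits])
print([doc_id for doc_id, _ in fused])  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Documents that appear in both lists (doc_a, doc_b) outrank those found by only one table, which is the point of fusing stemmed and substring matches.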
3-stage compression pipeline (scottconverse):
1. Deterministic ANSI/terminal stripping.
2. Pattern-based tool-output compression: 10 formatters for jest, pytest, git log, cargo, npm install, etc. Passing tests and progress bars collapse; failures are preserved verbatim.
3. Session-aware relevance filtering: keeps content related to current work [12].
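The first two stages can be sketched as follows. This is a simplified illustration assuming pytest-style `PASSED`/`FAILED` lines, not the plugin's formatters; the function names are hypothetical, and stage 3 (relevance filtering) is left out.

```python
import re

# Matches CSI escape sequences such as colour codes (e.g. "\x1b[32m").
ANSI_ESCAPE = re.compile(r"\x1b\[[0-9;?]*[ -/]*[@-~]")

def strip_ansi(text: str) -> str:
    """Stage 1: deterministic removal of terminal escape sequences."""
    return ANSI_ESCAPE.sub("", text)

def compress_test_output(text: str) -> str:
    """Stage 2 (simplified): collapse passing-test lines, keep everything else verbatim."""
    kept, passed = [], 0
    for line in strip_ansi(text).splitlines():
        if re.search(r"\bPASSED\b", line):
            passed += 1
        else:
            kept.append(line)
    if passed:
        kept.insert(0, f"[{passed} passing tests collapsed]")
    return "\n".join(kept)

raw = (
    "\x1b[32mtests/test_a.py::test_one PASSED\x1b[0m\n"
    "\x1b[32mtests/test_a.py::test_two PASSED\x1b[0m\n"
    "\x1b[31mtests/test_b.py::test_three FAILED\x1b[0m\n"
    "AssertionError: expected 3, got 4"
)
print(compress_test_output(raw))
# [2 passing tests collapsed]
# tests/test_b.py::test_three FAILED
# AssertionError: expected 3, got 4
```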
The self-learning loop tracks what compressed content Claude later searches for and raises retention for frequently-retrieved tool patterns [12].
/context-mode:ctx-stats output. Reports token savings per session: number of intercepted tool calls, estimated tokens that would have entered context, tokens actually returned, percentage reduction, and learner accuracy [12].
Forge has nothing equivalent. Forge has /recall for semantic search over static files, but zero tool-output compression. Every Bash: git log, large Read:, or WebFetch: call dumps raw output into context. In long coordinator-bot dev sessions (reading 15+ files, running tests, fetching docs), this inflates context by 20-60k tokens and accelerates compaction. Context Mode addresses exactly this gap [12].
Known risk. The scottconverse fork is very new (1 star, recently created as a port). The maintenance status of the upstream mksglu/context-mode project still needs checking. The kianwoon fork is simpler but lacks the compression pipeline. Under Elastic License 2.0, commercial use in a self-hosted product is allowed provided license attribution is preserved; personal forge use is unambiguously permitted.
3. ClaudeMem (claude-mem)
Canonical repo: thedotmack/claude-mem [13].
Stars: 71k.
npm: claude-mem@12.5.0 (AGPL-3.0) [14].
Author: Alex Newman.
Install: npx claude-mem install
Mechanism. ClaudeMem runs a persistent Bun worker service on port 37777 with a web viewer UI. 5 lifecycle hooks wire into Claude Code [15]:
| Hook | Trigger | Action |
|---|---|---|
| SessionStart | startup/clear/compact | starts worker service, then calls hook claude-code context to inject prior session summaries |
| UserPromptSubmit | every prompt | hook claude-code session-init |
| PostToolUse | every tool | hook claude-code observation (120s timeout, fires on every tool call) |
| PreToolUse (Read) | file reads | hook claude-code file-context |
| Stop | session end | hook claude-code summarize (120s timeout) |
The worker calls the Claude Agent SDK (runs on subscription quota, no separate API billing) to: (a) compress tool observations into semantic summaries, (b) extract decisions and lessons, and (c) on session end write a cross-session handoff [13].
Storage. SQLite DB for sessions, observations, summaries. Optional Chroma vector DB (via uvx chroma-mcp subprocess) for semantic search; Chroma is opt-in and adds a background MCP subprocess. Default local path: ~/.claude-mem/ [16].
Embedding model. When Chroma is enabled, embeddings are managed by chromadb's default embedding function (sentence-transformers/all-MiniLM-L6-v2, 384-dim). No custom embedding model is hardcoded in the source; the Chroma MCP subprocess handles embedding [16]. Without Chroma, search is FTS5 BM25 keyword-only.
3-layer retrieval API (MCP tools) [13]:
1. search - compact index, ~50-100 tokens per result, returns IDs.
2. timeline - chronological context around specific results.
3. get_observations - full details fetched by IDs (~500-1000 tokens per result).
This "index first, details by ID" pattern delivers ~10x token savings vs fetching full results immediately [13]. The same pattern applies to the mem-search skill.
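The three layers can be sketched with an in-memory store. This is a minimal illustration of the index-first pattern, not claude-mem's implementation: `MemoryStore` and `Observation` are hypothetical names, substring matching stands in for FTS5/Chroma search, and the numeric ID doubles as chronological order.

```python
from dataclasses import dataclass

@dataclass
class Observation:
    id: int
    title: str
    body: str

class MemoryStore:
    """In-memory stand-in for the SQLite-backed store (hypothetical)."""

    def __init__(self, observations: list[Observation]):
        self._by_id = {o.id: o for o in observations}
        self._ordered = sorted(observations, key=lambda o: o.id)  # id doubles as time order

    def search(self, query: str) -> list[dict]:
        """Layer 1: compact index entries (id + title only)."""
        q = query.lower()
        return [{"id": o.id, "title": o.title}
                for o in self._ordered
                if q in o.title.lower() or q in o.body.lower()]

    def timeline(self, obs_id: int, window: int = 1) -> list[dict]:
        """Layer 2: compact entries chronologically around one result."""
        ids = [o.id for o in self._ordered]
        pos = ids.index(obs_id)
        lo, hi = max(0, pos - window), pos + window + 1
        return [{"id": o.id, "title": o.title} for o in self._ordered[lo:hi]]

    def get_observations(self, ids: list[int]) -> list[Observation]:
        """Layer 3: full details, fetched only for explicitly requested IDs."""
        return [self._by_id[i] for i in ids if i in self._by_id]

store = MemoryStore([
    Observation(1, "Chose SQLite", "Picked SQLite over Postgres for local-first storage."),
    Observation(2, "Added FTS5 index", "Porter stemmer table plus trigram table."),
    Observation(3, "Fixed hook timeout", "Raised the PostToolUse timeout to 120s."),
])
hits = store.search("fts5")                            # compact index only
context = store.timeline(hits[0]["id"])                # neighbours of the hit
details = store.get_observations([hits[0]["id"]])      # full body on demand
print(hits, [c["id"] for c in context], details[0].body)
```

Only the last call returns full bodies, so the caller pays the large per-result cost only for the IDs it actually chooses to expand.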
Token/cost profile. PostToolUse fires on every tool call with a 120s timeout; this is the largest overhead. The Stop hook spawns a Claude Agent SDK call which bills subscription quota to summarize the session. In a heavy 50-tool session, this is 2-3 Sonnet calls. The SessionStart context injection pulls the last session's summaries (~500-2000 tokens).
Auto-generated folder-level CLAUDE.md. ClaudeMem does not generate folder-level CLAUDE.md files. It generates a session handoff injected at SessionStart, not CLAUDE.md. The KimYx0207/claude-memory-3layer tool [17] does generate per-project .claude/memory/MEMORY.md with lifecycle management; that is a separate project, not the canonical claude-mem.
Synthesis
The three tools address different problems: Superpowers is a development process enforcer; Context Mode is a token budget manager; ClaudeMem is a session memory accumulator. Forge already has strong coverage of the memory accumulator problem and partial coverage of process enforcement. It has zero coverage of token budget management.
Design patterns worth importing from ClaudeMem. The 3-layer retrieval pattern (index with IDs first, timeline for chronological context, details by explicit ID) is more token-efficient than forge's current /recall design, which returns full chunks immediately. For any future /recall v2 work, having search return IDs and fetching details by ID would reduce context consumption in multi-query research sessions.
Design patterns worth importing from Superpowers. The hard-gate pattern (HARD-GATE block in brainstorming skill that literally says "do not proceed without approval") is more reliable than text instructions. The writing-plans format (exact file paths, complete code, verification step, commit per task) is a higher-quality briefing structure than forge's current /spawn prompts. Consider adding a canonical worker-briefing template to forge/.claude/skills/ that follows this structure.
Context Mode fills a real gap. No current forge tool prevents raw tool output from inflating context. On a 4-hour coordinator bot dev session reading 20+ files and running tests, uncompressed output could push context to compaction 2x sooner. Context Mode's hook-based interception is zero-config once installed.
Forge-stack comparison
Superpowers
| Capability | Superpowers provides | Forge equivalent | Verdict |
|---|---|---|---|
| Pre-code design gate | Hard HARD-GATE block, spec must be approved before impl | feedback_robust_over_quick.md (text instruction only) | Superpowers stronger |
| Subagent dispatch per task | subagent-driven-development skill, fresh context per task | /spawn pattern + dispatcher workers | Equivalent |
| Two-stage code review | Spec compliance reviewer + code quality reviewer subagents | Eval harness (post-commit, doctrine-only) | Superpowers stronger for code quality |
| TDD enforcement | RED-GREEN-REFACTOR hard gates, delete code written before tests | No equivalent; TDD optional | Superpowers only |
| Git worktree isolation | using-git-worktrees skill auto-creates isolated branch | forge_worktree.sh (manual invocation) | Superpowers more automatic |
| Plan documentation | docs/superpowers/plans/ with task checkboxes | /spawn prompt informal | Superpowers stronger |
| Token overhead | ~1.4k startup + 2-5k per skill lazy-loaded | N/A | Low overhead |
| Doctrine alignment | Skills override defaults, user CLAUDE.md has priority | CLAUDE.md is canonical | Compatible |
Context Mode (scottconverse variant)
| Capability | Context Mode provides | Forge equivalent | Verdict |
|---|---|---|---|
| Tool output compression | 3-stage pipeline, 10 format-aware compressors | None | Context Mode only |
| Bash interception (git log/diff, test runners) | PreToolUse hook denies/redirects 18 patterns | None | Context Mode only |
| WebFetch interception | Redirects to fetch_and_index with 24h TTL cache | None | Context Mode only |
| FTS5 knowledge base per session | BM25 + trigram + RRF fusion search over session content | /recall (cross-session, permanent) | Different scope, both useful |
| Session continuity after compaction | PreCompact snapshot + structured Session Guide rebuild | Auto-dream nightly consolidation | Context Mode handles intra-session; forge handles cross-session |
| Context savings reporting | /context-mode:ctx-stats with token and dollar estimates | None | Context Mode only |
| Self-learning retention | Feedback loop raises retention for frequently-retrieved patterns | None | Context Mode only |
| Process isolation | Subprocess with stdout cap, no filesystem restriction | Dispatcher workers (full isolation) | Different scope |
ClaudeMem
| Capability | ClaudeMem provides | Forge equivalent | Verdict |
|---|---|---|---|
| Session observation capture | PostToolUse hook, every tool, Claude Agent SDK compression | auto-memory (Stop hook, confidence-gated) | Forge more conservative (confidence gate), ClaudeMem more aggressive |
| Cross-session memory injection | SessionStart context injection, last-session summaries | auto-memory writes to MEMORY.md topic files, auto-loaded | Equivalent, different format |
| Semantic search | Optional Chroma vector search (MiniLM-L6-v2, 384-dim) | /recall (BGE-small-en-v1.5, 384-dim, sqlite-vec) | Equivalent embedding dim; forge integrated, ClaudeMem optional |
| 3-layer retrieval (index/timeline/details) | MCP tools: search, timeline, get_observations | /recall returns full chunks immediately | ClaudeMem design more token-efficient |
| Nightly consolidation | None (worker processes continuously) | auto-dream (dedup, stale prune, promote staged) | Forge stronger |
| Memory audit trail | None | LESSONS.md append-only audit, forge_memory_revert.py | Forge only |
| Memory revert | None | forge_memory_revert.py --session | Forge only |
| Sensitive content protection | <private> tag manual exclusion | Pre-redaction of secrets before Sonnet, safe_path() guard | Forge stronger |
| Persistent background service | Bun worker on port 37777 | N/A (cron-based) | Different arch; ClaudeMem higher operational complexity |
| Web viewer UI | http://localhost:37777 | None | ClaudeMem only |
| AGPL-3.0 license | Yes | N/A | Copyleft; fine for personal forge use |
Recommendations
Superpowers: Steal-design-only.
Forge already covers most of this process through doctrine, /spawn, and the eval harness. Superpowers would add genuine value only for disciplined greenfield coding features, but Justin's workflow (Telegram-driven, operator-heavy, few greenfield CLI tools) doesn't match the brainstorm-plan-TDD pattern, and the install adds ~1.4k tokens of overhead to every session. Instead, steal the hard-gate and worker-briefing document patterns into forge's own skill system: add a forge/.claude/skills/feature-plan/SKILL.md that enforces spec-before-code and uses the writing-plans task structure.
Context Mode: Install. The scottconverse variant is recommended.
/plugin marketplace add scottconverse/context-mode
/plugin install context-mode@scottconverse-context-mode
/context-mode:ctx-doctor
This fills a genuine gap: zero tool-output compression exists in forge today. Long coordinator-bot dev sessions and research-heavy remote-bridge sessions will benefit immediately. The hook intercept is silent and zero-config; it does not interfere with forge's existing hooks or CLAUDE.md rules. The CLAUDE.md shipped with the plugin uses a "Think in Code" directive and tool-selection rules that complement forge doctrine. Risk: the scottconverse fork is very new. If stability is a concern, kianwoon/context-mode (claude plugin add kianwoon/context-mode) is a lighter alternative with 4 tools and 2 hooks.
ClaudeMem: Skip. The persistent Bun background worker, Chroma subprocess, AGPL license complication, and a PostToolUse hook firing on every single tool call (120s timeout budget) add operational complexity for something forge's auto-memory system already covers at lower cost. ClaudeMem's biggest risk is its PostToolUse hook competing with forge's existing PostToolUse hooks. Forge's confidence-gated auto-memory is more conservative (fewer false writes) and has audit/revert tooling that ClaudeMem lacks.
Design pieces to steal from ClaudeMem into forge /recall v2:
1. Index-first retrieval: change /recall to return a compact result index (title, date, 1-line summary, ID) in the first call, then offer --get-details <id> for full chunk text. This would cut /recall context consumption by ~70% on multi-result queries.
2. Timeline tool: given an ID, return the N entries before and after it chronologically. Useful for "what was the context around this decision" queries.
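The ~70% estimate in item 1 can be sanity-checked with simple arithmetic. This sketch borrows ClaudeMem's per-result ranges (50-100 tokens for an index entry, 500-1000 for full details) and takes their midpoints as defaults; applying those numbers to /recall is an assumption, and `index_first_savings` is a hypothetical helper.

```python
def index_first_savings(results: int, expanded: int,
                        index_tokens: int = 75, detail_tokens: int = 750) -> float:
    """Fraction of tokens saved versus returning full details for every result."""
    baseline = results * detail_tokens                        # full chunks for all results
    index_first = results * index_tokens + expanded * detail_tokens
    return 1 - index_first / baseline

# 10 results, 2 of them expanded to full detail:
# baseline = 7500 tokens, index-first = 750 + 1500 = 2250 tokens.
print(round(index_first_savings(results=10, expanded=2), 2))  # 0.7
```

The saving depends on how few results get expanded: if every result is expanded, index-first costs more than the baseline because the index entries are paid for on top of the details.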
Disagreements / open questions
- Context Mode maintainer. Both kianwoon/context-mode and scottconverse/context-mode are ports/derivatives of mksglu/context-mode. The upstream mksglu project was not found in the search results; its maintenance status is unverified. If the upstream is abandoned, the ports may diverge without coordination.
- Superpowers token overhead varies by platform. The README claims skills are lazy-loaded, but the using-superpowers SessionStart injection contains the bootstrap text unconditionally. Some community forks (jnMetaCode/superpowers-zh) inject all 14 skills at startup. The obra/superpowers canonical repo is lazy. Verify behavior after install with /context-mode:ctx-stats or by counting tokens in the first session turn.
- ClaudeMem version mismatch. npm shows claude-mem@12.5.0 and the GitHub repo plugin.json also shows 12.5.0, while the package.json README badge shows 6.5.0. The npm version appears to reflect a different numbering scheme. This is unresolved; npx claude-mem install should install the latest npm-published version regardless.
- Context Mode license (Elastic License 2.0). ELv2 prohibits providing the software as a managed service. Personal forge use is permitted; embedding it in a commercial SaaS product would require re-evaluation.
Sources
1. obra/superpowers GitHub repo, 177k stars, canonical source for install commands, skills list, mechanism.
2. obra/superpowers-marketplace README, marketplace structure and plugin catalog.
3. obra/superpowers session-start hook + using-superpowers SKILL.md, lazy-load confirmation and bootstrap injection mechanism.
4. brainstorming SKILL.md, HARD-GATE block, spec approval flow, 9-step checklist.
5. writing-plans SKILL.md, task structure, 2-5 min granularity, TDD steps.
6. subagent-driven-development SKILL.md, two-stage review loop, model selection guidance.
7. test-driven-development SKILL.md, RED-GREEN-REFACTOR hard gates, "Iron Law" delete requirement.
8. using-superpowers SKILL.md, skill priority order, platform adaptation, Red Flags list.
9. Forge reference_task_queue.md and /spawn pattern, forge dispatcher + spawn equivalent.
10. Forge reference_eval_harness.md, 12-check eval harness pre-commit and nightly.
11. kianwoon/context-mode GitHub repo, simpler 4-tool variant, README with FTS5 schema, hook table.
12. scottconverse/context-mode GitHub repo, full README with 9-tool MCP, 18-rule routing table, 3-stage compression, SQLite schema, security model, install command.
13. thedotmack/claude-mem GitHub repo, canonical claude-mem, 71k stars, README with 3-layer MCP workflow, hooks architecture.
14. claude-mem on npm, v12.5.0, AGPL-3.0, install command.
15. claude-mem plugin/hooks/hooks.json, 5 lifecycle hook registrations with exact commands and timeouts.
16. claude-mem chroma-flowcharts.md, Chroma vector DB integration, ChromaSync write/read paths, embedding via chroma-mcp subprocess.
17. KimYx0207/claude-memory-3layer, 3-layer JSON+MD+MD architecture, lifecycle management, git-trackable, token-efficient (~1500 tokens), one-line install; referenced for design comparison.
18. coleam00/claude-memory-compiler, 975 stars, Karpathy LLM knowledge base architecture adapted for Claude Code sessions, no-RAG design rationale (index beats cosine similarity at personal scale).
19. severity1/claude-code-auto-memory, 140 stars, auto-managed CLAUDE.md sections via PostToolUse + Stop hook pattern.
20. yoloshii/ClawMem, 150 stars, hybrid RAG memory (BM25 + vector + RRF + cross-encoder reranking), SAME/MAGMA/A-MEM architecture, for contrast with ClaudeMem.
Search trail
1. GitHub API search: superpowers claude code - found obra repos
2. obra/superpowers-marketplace README fetch - confirmed install command and plugin list
3. obra/superpowers README fetch - core mechanism, skills list, workflow steps
4. GitHub API search: context mode claude code plugin - found kianwoon and scottconverse
5. GitHub API search: claudemem claude memory - found fragmentary results
6. kianwoon/context-mode README fetch - simpler variant mechanism
7. scottconverse/context-mode README fetch - full variant with routing table, compression, hooks
8. obra/superpowers skills directory listing - confirmed 14 skills
9. obra/superpowers hooks/session-start fetch - confirmed lazy-load mechanism
10. obra/superpowers individual skill sizes - token overhead calculation
11. GitHub API search: claude-mem memory plugin code stars - found coleam00, ClawMem
12. npm registry: claude-mem - found thedotmack/claude-mem as canonical
13. thedotmack/claude-mem README fetch - 71k stars, 5 hooks, 3-layer MCP, Chroma
14. thedotmack/claude-mem plugin/hooks/hooks.json fetch - confirmed 5 hook registrations and timeouts
15. thedotmack/claude-mem chroma-flowcharts.md fetch - Chroma vector DB architecture
16. KimYx0207/claude-memory-3layer README fetch - 3-layer JSON+MD+MD architecture
17. coleam00/claude-memory-compiler README + AGENTS.md fetch - Karpathy architecture, no-RAG rationale
18. obra/superpowers using-superpowers SKILL.md + hooks.json fetch - confirmed single SessionStart hook, lazy skill loading
19. forge auto-memory and eval harness reference files read - accurate forge-stack comparison baseline