Pure Phoenix Phase 4.2, Bot Redesign (sub-design pass)

This is the design handoff that gates Phase 4.2 execution. Phase 4.2 in the master plan was originally scoped as rename-and-port; Justin signaled 2026-04-28 that the bot fleet needs a full redo. This document captures the inventory, recommended architecture, and seven open decisions that need Justin's sign-off before any code changes.

Plan: ~/.claude/plans/yes-lets-go-into-pure-phoenix.md Section 4.2. Doctrine: FORGE-DOCTRINE.md Sections 3, 5, 6, 7.

Current State Inventory

Bot identities (5 active)

| Bot | Role | Service | Code | Token at |
|---|---|---|---|---|
| @jw_inbox_bot | low-friction capture, voice + text | forge-telegram-inbox.service | forge_telegram_inbox_bot.py (407 lines) | ~/.forge-secrets/telegram-inbox.env |
| @Ava_JForgeBot | lifeos coordinator, deep synthesis | forge-telegram-ava.service | forge_telegram_lifeos_coordinator_bot.py (207) | ~/.forge-secrets/telegram-ava.env |
| (inbox webhook) | iOS Shortcut endpoint, port 7400 | forge-inbox-webhook.service | forge_telegram_inbox_webhook.py (172) | (uses inbox token) |
| @Manager_JForgeBot | push-only output for notify.sh | n/a (push only) | n/a | ~/.forge-secrets/telegram-manager.env |
| @jw_updates_bot | push-only output for heartbeat + morning-brief | n/a (push only) | n/a | ~/.forge-secrets/telegram-updates.env |

Brain layer (Sonnet 4.6)

| Brain | LOC | Model | Tools | Notes |
|---|---|---|---|---|
| forge_telegram_inbox_brain.py | 1120 | claude-sonnet-4-6 via claude -p | 28 (full surface) | The mothership. Notion (16 DBs), Calendar, Gmail (personal + business), wellness, knowledge, habits, scheduled nudges, push, spawn_worker, drafts. |
| forge_telegram_lifeos_coordinator_brain.py | 235 | claude-sonnet-4-6 via claude -p | imports inbox_brain.TOOLS_JSON, identical 28 | Same tool surface, different system prompt that emphasizes synthesis and spawn_worker for heavy work. |

Functional difference between the two brains today is prompt-only. Both have access to the identical tool registry. The "lifeos vs inbox" split is a tone choice, not a capability choice.

Supporting infrastructure

| Component | Role |
|---|---|
| forge_telegram_transcribe.py | faster-whisper base.en CPU int8, voice-to-text |
| forge_telegram_push.sh | unified push helper, takes chat\|updates\|inbox\|ava\|manager selector |
| forge_telegram_nudge_fire.py | every-minute cron, fires scheduled nudges as push messages to the updates bot |
| data/inbox-context.jsonl | rolling conversation context for inbox brain |
| data/ava-context.jsonl | rolling conversation context for lifeos brain |

Doctrine compliance issues today

| Section | Violation | Severity |
|---|---|---|
| 3 (no persona names for bots) | @Ava_JForgeBot, @Manager_JForgeBot | known, scheduled for retirement here |
| 5 (inbox is low-friction capture only, downstream workers do heavy lifting) | inbox brain has spawn_worker, full Gmail tools, Notion CRUD on 16 DBs. It is NOT low-friction; it does heavy lifting. | high, scope creep |
| 7 (per-bot API usage tracking) | zero tracking exists for any of the 5 bots | unimplemented from day one |

One brain module, two personas

Collapse inbox_brain and lifeos_brain into a single forge_telegram_brain.py parameterized by persona:

def handle(text: str, *, persona: str, prior_messages: list[dict] | None = None) -> str:
    """persona in {'capture', 'coordinator'}"""

Persona selects:
  • system prompt template
  • tool subset (capture: 8 tools; coordinator: 28+)
  • default done behavior (capture: done:true after action; coordinator: multi-pass allowed)
  • context log file (data/capture-context.jsonl vs data/coordinator-context.jsonl)
  • per-call cost budget guard

Each bot's polling layer becomes thin:

# forge_telegram_capture_bot.py
import forge_telegram_brain as brain
reply = brain.handle(text, persona="capture")

Eliminates ~200 lines of duplication. Future personas (e.g. "alert-triage", "morning-recap") drop in without new brains.
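A minimal sketch of how the persona parameter could resolve to per-persona settings, under stated assumptions: PersonaConfig, PERSONAS, and resolve_persona are hypothetical names, and the prompts and tool sets are placeholders rather than the real registry. The per-call cost budget guard is left out to keep the sketch short.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PersonaConfig:
    """Settings that differ between personas (illustrative fields only)."""
    system_prompt: str
    tools: frozenset[str]
    context_log: str
    multi_pass: bool  # False means "done:true after action"


# Hypothetical registry: prompts and tool sets are placeholders.
PERSONAS: dict[str, PersonaConfig] = {
    "capture": PersonaConfig(
        system_prompt="Capture the thought, confirm briefly, stop.",
        tools=frozenset({"save_to_inbox", "create_task"}),
        context_log="data/capture-context.jsonl",
        multi_pass=False,
    ),
    "coordinator": PersonaConfig(
        system_prompt="Synthesize across sources; delegate heavy work.",
        tools=frozenset({"save_to_inbox", "create_task", "query_notion", "spawn_worker"}),
        context_log="data/coordinator-context.jsonl",
        multi_pass=True,
    ),
}


def resolve_persona(persona: str) -> PersonaConfig:
    """Look up a persona, failing loudly on unknown names."""
    try:
        return PERSONAS[persona]
    except KeyError:
        known = ", ".join(sorted(PERSONAS))
        raise ValueError(f"unknown persona {persona!r}; expected one of: {known}") from None
```

With this shape, adding a future persona would be one more registry entry, and handle() would only need to call resolve_persona(persona) before building the model request.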

Capture vs coordinator tool split (per Section 5)

| Tool | capture | coordinator |
|---|---|---|
| save_to_inbox | yes | yes |
| create_task (quick) | yes | yes |
| save_knowledge (quick) | yes | yes |
| schedule_nudge | yes | yes |
| log_habit | yes | yes |
| wellness_now (read-only) | yes | yes |
| create_calendar_event (quick add) | yes | yes |
| voice transcribe | yes | yes |
| query_notion | NO (capture is write-only) | yes |
| update_task | NO | yes |
| list_calendar_events | NO | yes |
| update_calendar_event | NO | yes |
| delete_calendar_event | NO | yes |
| Gmail tools (search, read, draft, archive, label, etc.) | NO | yes |
| email_to_task | NO | yes |
| spawn_worker | NO | yes |
| push_updates | NO | yes |

Capture stays a fire-hose for "throw thoughts in"; coordinator handles "go do something with these thoughts." Aligns with Section 5.
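The split above could be enforced in two places: when building the tool list sent to the model, and again when dispatching a tool call. The sketch below assumes each registry entry is a dict with a "name" key (the actual TOOLS_JSON schema is not shown in this document), and "voice_transcribe" is a hypothetical identifier for the voice row.

```python
from __future__ import annotations

# The eight capture tools from the table above; "voice_transcribe" is an assumed name.
CAPTURE_TOOLS = frozenset({
    "save_to_inbox", "create_task", "save_knowledge", "schedule_nudge",
    "log_habit", "wellness_now", "create_calendar_event", "voice_transcribe",
})


def tools_for(persona: str, registry: list[dict]) -> list[dict]:
    """Return the tool definitions a persona is allowed to see."""
    if persona == "coordinator":
        return list(registry)  # full surface
    if persona == "capture":
        return [tool for tool in registry if tool["name"] in CAPTURE_TOOLS]
    raise ValueError(f"unknown persona {persona!r}")


def is_allowed(persona: str, tool_name: str, registry: list[dict]) -> bool:
    """Second line of defense: check a requested tool call at dispatch time."""
    return any(tool["name"] == tool_name for tool in tools_for(persona, registry))
```

Checking at dispatch time as well as at prompt-build time means a capture-persona reply that names spawn_worker is rejected even if the model somehow produces it.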

Cost discipline (Section 7)

New shared module forge_telegram_brain_metrics.py:
  • Wraps every claude -p call: records timestamp, persona, persona-call-id, prompt token estimate, response token estimate, latency_ms, success bool.
  • Writes JSONL at forge/data/telegram-cost/<persona>-YYYY-MM.jsonl.
  • Daily cron at 04:30 aggregates to summary.json (per-persona daily/weekly totals + cost estimates using Sonnet 4.6 pricing).
  • Optional notify when daily spend exceeds a threshold; default $5/day per persona, configurable in eval.json.

Token counting: use prompt char count / 4 as the conservative input-token estimate (until we wire the Anthropic SDK for exact counts; Phase 4.2b).
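The char-count estimate and the daily threshold check can be sketched as below. The function names are hypothetical, and the per-million-token rates are deliberately left as parameters because no pricing figures are given in this document.

```python
import math


def estimate_tokens(text: str) -> int:
    """Estimate tokens as characters / 4, rounded up so short texts never count as zero."""
    return math.ceil(len(text) / 4)


def estimate_cost_usd(prompt: str, response: str, *,
                      usd_per_mtok_in: float, usd_per_mtok_out: float) -> float:
    """Convert the two token estimates into dollars using caller-supplied rates."""
    cost_in = estimate_tokens(prompt) / 1_000_000 * usd_per_mtok_in
    cost_out = estimate_tokens(response) / 1_000_000 * usd_per_mtok_out
    return cost_in + cost_out


def over_budget(daily_spend_usd: float, threshold_usd: float = 5.0) -> bool:
    """True once a persona's daily spend exceeds the threshold ($5/day default above)."""
    return daily_spend_usd > threshold_usd
```

Rounding up keeps the estimate on the high side for short messages, which suits a budget guard better than rounding down.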

New bot identities (Apple-Dictation friendly + doctrine-compliant)

Doctrine Section 3 rules: no persona names for bots; Apple-Dictation friendly; cycle and destroy retired names. Long compound names like forge_inbox_capture_bot are doctrine-compliant but Apple stumbles on them.

| Old | New | Function |
|---|---|---|
| @jw_inbox_bot | @forge_capture_bot | capture persona (write-only Notion + calendar quick-add + voice) |
| @Ava_JForgeBot | @forge_assist_bot | coordinator persona (full tool surface, deep synthesis) |
| @Manager_JForgeBot | @forge_alert_bot | push-only (notify.sh output) |
| @jw_updates_bot | merge into @forge_alert_bot | retire as separate identity; alert handles both |

Result: 3 bot identities (down from 5). Each name is a single short function word after forge_. Apple Dictation handles "forge capture bot," "forge assist bot," "forge alert bot" cleanly.

If Justin wants to keep updates separate (e.g. wellness-flavored daily push doesn't pollute alert priority queue), he can keep @jw_updates_bot as @forge_status_bot for a fourth identity. Recommend NOT splitting; one alert bot keeps the surface narrow.

Service topology

| Service | Replaces | Runs |
|---|---|---|
| forge-capture-bot.service | forge-telegram-inbox.service | long-poller for @forge_capture_bot |
| forge-assist-bot.service | forge-telegram-ava.service | long-poller for @forge_assist_bot |
| forge-inbox-webhook.service | unchanged name (port 7400) | iOS Shortcut endpoint, calls capture persona |
| (no service for alert) | manager + updates retired | push-only, no listener needed |

Two long-pollers plus the webhook, the same service count as today; the alert bot is push-only and needs no listener. The new service names are doctrine-compliant.

Token strategy

Each new bot gets a fresh BotFather token. Old tokens revoked. Old bot identities deleted via @BotFather as the final step. Cost: Telegram chat history with the old bots is not migrated. Justin loses the visible chat log from @jw_inbox_bot and @Ava_JForgeBot. The forge inbox-context.jsonl and ava-context.jsonl log files survive (forge-side memory continuity).

Alternative: rotate tokens within existing identities (rename via @BotFather). Keeps chat history. Loses doctrine cleanliness. Recommend fresh-tokens path; doctrine Section 3 explicitly says "cycle and destroy old names."

Seven Open Decisions (need sign-off)

| # | Decision | Recommendation | Cost of choosing differently |
|---|---|---|---|
| 1 | Functional split: keep capture + coordinator as two bots, or consolidate into one with mode commands? | Keep two bots. Visual UX matters; inbox-as-low-friction is doctrine. | Consolidate = one less service, one less token, but you lose the visual separation in Telegram between "capture" and "act". |
| 2 | Brain unification: one shared brain module with persona param vs two distinct brain modules? | One module, two personas. Drops ~200 lines of duplication. | Two brains lets you diverge prompts/tools faster but pays maintenance tax. |
| 3 | Tool surface for capture: shrink to capture-only (8 tools), or keep full surface? | Shrink to capture-only. Aligns with Section 5. | Full surface lets you act fast from Telegram on anything but breaks the "downstream workers do heavy lifting" principle. |
| 4 | Token strategy: fresh BotFather tokens (lose chat history) or rotate within existing identities? | Fresh tokens. Doctrine "cycle and destroy" wins; chat history of voice notes isn't load-bearing. | Rotate keeps chat history but feels like rebrand-not-redo and conflicts with Section 3. |
| 5 | Bot names: long doctrine-strict (@forge_inbox_capture_bot) vs short Apple-Dictation-friendly (@forge_capture_bot)? | Short. Both are doctrine-compliant; Apple Dictation matters daily. | Long names are unambiguous but you'll fight your phone every time you say them. |
| 6 | Updates bot fate: merge into alert bot or keep separate as @forge_status_bot? | Merge. Two push-only bots is unnecessary surface. | Keep separate if you want morning-brief / heartbeat in a different chat thread from critical alerts. |
| 7 | Cost discipline scope for v1: ship with conservative char-count token estimates (cheap, fast), or wait to wire Anthropic SDK for exact token counts? | Ship char-count. Phase 4.2b can add exact counts. | Wait for SDK = no cost data for weeks; ship now = cost data within hours, accuracy ±20%. |

Execution Plan (after sign-off)

Wave 4.2.A: greenfield brain
  1. forge_telegram_brain.py with persona-aware handle(). Migrate inbox brain logic in.
  2. forge_telegram_brain_metrics.py shared cost wrapper.
  3. forge/data/telegram-cost/ directory + daily cron.
  4. Unit tests against transcripts (replay inbox-context.jsonl with new brain).

Wave 4.2.B: new bot identities
  5. Justin creates new bots via @BotFather, drops tokens at ~/.forge-secrets/telegram-{capture,assist,alert}.env.
  6. Update forge_telegram_push.sh selector mapping.
  7. Update notify.sh to point at @forge_alert_bot.

Wave 4.2.C: new pollers
  8. forge_telegram_capture_bot.py (thin polling wrapper, calls brain.handle(text, persona="capture")).
  9. forge_telegram_assist_bot.py (same shape, persona="coordinator").
  10. forge_telegram_inbox_webhook.py updated to call capture persona.
  11. New systemd units forge-capture-bot.service, forge-assist-bot.service. daemon-reload + start.

Wave 4.2.D: cutover
  12. Stop + disable old services (forge-telegram-inbox, forge-telegram-ava).
  13. Migrate any active context (inbox-context.jsonl to capture-context.jsonl, etc.).
  14. Justin deletes old bots via @BotFather (irreversible; final-confirm step).
  15. Update eval harness whitelists; remove forge-telegram-{ava,inbox}.service from forge_eval_check_service_names.sh; remove Ava_JForgeBot / Manager_JForgeBot references from forge_eval_check_persona_code.sh if any.
  16. Update MEMORY.md telegram section + system-map fleet.md + CLAUDE.md system mental model.
  17. Smoke test: voice note to capture bot lands in Notion Inbox; text question to assist bot returns coordinated answer; notify.sh warning ... arrives at alert bot.

Wave 4.2.E: post-soak
  18. After one week of clean operation, close the LESSONS.md persona-name violation; tighten eval check severity for service-name and persona-code from warning/error to fatal.

Risks

  • iOS Shortcut breakage. The Shortcut hits https://inbox.justinsforge.com/... which proxies to the inbox webhook on port 7400. If we change the webhook path or bot wiring, the Shortcut breaks. Plan: keep the webhook URL stable, change only the brain it calls.
  • Voice transcription latency. faster-whisper on CPU takes 2-5s for a 30s clip. Capture bot returns ack ("got it") immediately, then runs the brain async. Coordinator can wait inline since the user is in conversation mode.
  • Cost surprise. No tracking today means no baseline. First day of cost-discipline tracking might show the bots are spending more than expected. Set a generous initial threshold ($10/day per persona) and tighten after seeing real numbers.
  • Telegram rate limits. 30 messages/sec/bot, way above any plausible forge usage.
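The ack-then-async mitigation for transcription latency could look roughly like this. It is a sketch only: handle_voice_note is a hypothetical name, and send, transcribe, and run_brain are injected callables standing in for the real Telegram send, the faster-whisper wrapper, and the brain call.

```python
import threading
from typing import Callable


def handle_voice_note(
    audio_path: str,
    *,
    send: Callable[[str], None],
    transcribe: Callable[[str], str],
    run_brain: Callable[[str], str],
) -> threading.Thread:
    """Acknowledge immediately, then transcribe and run the brain off the polling thread."""
    send("got it")  # the user sees this before any slow work starts

    def work() -> None:
        try:
            reply = run_brain(transcribe(audio_path))
        except Exception as exc:  # report the failure instead of dying silently
            reply = f"capture failed: {exc}"
        send(reply)

    worker = threading.Thread(target=work, daemon=True)
    worker.start()
    return worker  # returned so callers (and tests) can join if they need to
```

A thread is used here because a simple long-poller is typically synchronous; an asyncio task would serve the same purpose if the poller is async.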

Decision Trigger

When Justin signs off on the seven decisions above, this design becomes execution-ready. The execution waves (4.2.A through 4.2.E) become a fresh sub-handoff pure-phoenix-phase-4-2-execution-2026-XX-XX.md covering the actual code changes.


Sign-off Locked 2026-04-28T22:55

| # | Decision | Locked answer |
|---|---|---|
| 1 | Functional split | Three bot identities: capture (long-poll), coordinator (long-poll), one merged push bot (combines today's @Manager + @jw_updates) |
| 2 | Brain unification | One shared forge_telegram_brain.py module with persona param |
| 3 | Capture tool surface | Shrink to capture-only (8 tools); grow capture-specific tools later if friction surfaces |
| 4 | Token strategy | Fresh BotFather tokens; retire old bots after cutover |
| 5 | Bot names | LONG doctrine-strict (override Apple-Dictation pragmatism). Final names: @forge_inbox_capture_bot, @forge_lifeos_coordinator_bot, @forge_notify_outbound_bot |
| 6 | Updates bot fate | Merged into @forge_notify_outbound_bot (drops separate @jw_updates_bot identity) |
| 7 | Cost discipline scope | System-wide quota observability (NOT just bots). See expanded scope below. |

Expanded scope: system-wide claude quota observability

Justin reframed Q7: the goal is to understand "what actions, automations, chats, development is costing me in my Pro Max quota." That's a forge-wide observability concern, not a bot-only metric. Phase 4.2 ships the metrics MODULE; full instrumentation is Phase 4.7 below.

Existing telemetry surfaces

| Surface | What it has | Loaded via |
|---|---|---|
| ~/.claude/stats-cache.json | Daily token counts per model, message counts, session counts, tool-call counts. Native Claude Code statistics. | populated by Claude Code on session end |
| scripts/prompt-counter.sh | Per-session prompt counter, fires every UserPromptSubmit hook. Currently only a checkpoint reminder; could record per-prompt metadata. | UserPromptSubmit hook |
| individual forge scripts | nothing today | |

Forge scripts that call claude -p (9 confirmed)

| Script | Frequency |
|---|---|
| forge_telegram_inbox_brain.py (the new capture brain) | per Telegram message |
| forge_telegram_lifeos_coordinator_brain.py (the new coordinator brain) | per Telegram message |
| forge_dispatcher.sh | per pipe-mode worker task |
| forge_memory_auto_capture.py | per session end (Stop hook) |
| forge_memory_auto_dream.py | nightly cron 04:00 |
| morning-brief.py | daily cron 07:00 (12:00 UTC) |
| forge_wellness_daily_summary.py | daily cron 03:00 |
| heartbeat.py | midday + evening + night-cap (3x/day) |
| forge_tmux_anchor_session.sh | once per boot (low frequency) |

Phase 4.2 scope (the metrics module)

Build forge/scripts/forge_quota_tracker.py. Single function:

def record(invoker: str, model: str, prompt_chars: int, response_chars: int,
           latency_ms: int, success: bool, extra: dict | None = None) -> None:
    """Append one record to forge/data/claude-quota/<YYYY-MM>.jsonl."""

Used by the new bot brains as part of Phase 4.2. Hourly + daily aggregator runs from cron, writes summary to forge/data/claude-quota/summary.json.
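One way the body of record() might look, as a sketch rather than the final module. The positional signature matches the one above; the keyword-only base_dir and now parameters are assumptions added so the function can be pointed at a temporary directory and a fixed clock in tests, and the row field names are illustrative.

```python
from __future__ import annotations

import json
from datetime import datetime, timezone
from pathlib import Path

DEFAULT_DIR = Path("forge/data/claude-quota")


def record(invoker: str, model: str, prompt_chars: int, response_chars: int,
           latency_ms: int, success: bool, extra: dict | None = None, *,
           base_dir: Path = DEFAULT_DIR, now: datetime | None = None) -> None:
    """Append one record to <base_dir>/<YYYY-MM>.jsonl."""
    ts = now or datetime.now(timezone.utc)
    row = {
        "ts": ts.isoformat(),
        "invoker": invoker,
        "model": model,
        "prompt_chars": prompt_chars,
        "response_chars": response_chars,
        "latency_ms": latency_ms,
        "success": success,
    }
    if extra:
        row["extra"] = extra
    base_dir = Path(base_dir)
    base_dir.mkdir(parents=True, exist_ok=True)
    path = base_dir / f"{ts:%Y-%m}.jsonl"
    # One JSON object per line; append mode keeps earlier records intact.
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(row, sort_keys=True) + "\n")
```

A call site would then be a single line after each claude -p invocation, for example record("capture", "claude-sonnet-4-6", len(prompt), len(reply), elapsed_ms, ok), where the invoker label is whatever naming the fleet settles on.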

Surface in the existing /recall index? No, the JSONL grows daily and would dominate embeddings. Kept as raw data behind a CLI.

Phase 4.7 (NEW, system-wide quota observability)

Sub-handoff pure-phoenix-phase-4-7-quota-observability-2026-XX-XX.md covers:

  1. Instrument the other 7 forge scripts (one-line forge_quota_tracker.record(...) per claude -p call site).
  2. Daily aggregator that MERGES ~/.claude/stats-cache.json (interactive sessions) + forge/data/claude-quota/*.jsonl (forge-script invocations) into one unified summary.
  3. Optional weekly digest pushed to @forge_notify_outbound_bot showing "this week your Pro Max quota was X% spent on bots, Y% on automations, Z% on interactive sessions."
  4. Auto-back-off behavior: if call rate spikes above threshold, bots queue messages or shrink prompt context. Not in scope for 4.2.

Phase 4.7 is ENABLED by Phase 4.2 (the metrics module is the foundation). Recommend executing them in series: 4.2 ships the new bot fleet + tracker module; 4.7 ships the system-wide instrumentation + aggregator.
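To make the weekly-digest idea concrete, here is a rough sketch of how the percentage split might be computed from tracker rows. Assumptions: rows carry invoker, prompt_chars, and response_chars fields; BOT_INVOKERS is a hypothetical set of invoker labels; and the interactive total is passed in as a number, because the layout of ~/.claude/stats-cache.json is not described in this document.

```python
from collections import Counter

# Hypothetical invoker labels for the two bot brains.
BOT_INVOKERS = frozenset({"capture", "coordinator"})


def category_of(invoker: str) -> str:
    """Bucket an invoker as a bot or as any other forge automation."""
    return "bots" if invoker in BOT_INVOKERS else "automations"


def weekly_shares(rows: list[dict], interactive_tokens: int = 0) -> dict[str, float]:
    """Return each category's percentage of estimated tokens, rounded to one decimal."""
    totals = Counter({"bots": 0, "automations": 0, "interactive": interactive_tokens})
    for row in rows:
        # Same chars / 4 estimate as the tracker, rounded down per row here.
        estimated = (row["prompt_chars"] + row["response_chars"]) // 4
        totals[category_of(row["invoker"])] += estimated
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {name: 0.0 for name in totals}
    return {name: round(100 * count / grand_total, 1) for name, count in totals.items()}
```

The shares are relative to tracked usage only; mapping them onto an actual Pro Max quota percentage would need a quota figure that this document does not give.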

Updated Execution Plan (locked, three bots, long names)

Wave 4.2.A: greenfield brain + tracker
  1. forge_telegram_brain.py with persona-aware handle(). Migrate inbox brain logic in.
  2. forge_quota_tracker.py shared metrics module.
  3. forge/data/claude-quota/ directory + daily aggregator cron at 04:30.
  4. Replay test: feed data/inbox-context.jsonl snippets through new brain, verify same tool calls.

Wave 4.2.B: new bot identities
  5. Justin creates new bots via @BotFather:
      • @forge_inbox_capture_bot
      • @forge_lifeos_coordinator_bot
      • @forge_notify_outbound_bot
  6. Justin drops tokens at:
      • ~/.forge-secrets/telegram-inbox-capture.env
      • ~/.forge-secrets/telegram-lifeos-coordinator.env
      • ~/.forge-secrets/telegram-notify-outbound.env
  7. Update forge_telegram_push.sh selector mapping to new env files.
  8. Update forge_notify.sh to point at @forge_notify_outbound_bot (token replaces the current Manager bot wiring).
  9. Update morning-brief.py, heartbeat.py, forge_telegram_nudge_fire.py to push to @forge_notify_outbound_bot (currently they push to @jw_updates_bot).

Wave 4.2.C: new pollers + webhook
  10. forge_telegram_inbox_capture_bot.py (thin polling wrapper, calls brain.handle(text, persona="capture")).
  11. forge_telegram_lifeos_coordinator_bot.py (same shape, persona="coordinator").
  12. forge_telegram_inbox_capture_webhook.py (renamed from inbox_webhook, now points at capture persona).
  13. New systemd units:
      • forge-inbox-capture.service (replaces forge-telegram-inbox.service)
      • forge-lifeos-coordinator.service (replaces forge-telegram-ava.service)
      • forge-inbox-capture-webhook.service (renamed from forge-inbox-webhook.service; iOS Shortcut endpoint stays at port 7400 to keep Justin's Shortcut working)
  14. sudo systemctl daemon-reload + start.

Wave 4.2.D: cutover
  15. Stop + disable old services (forge-telegram-inbox, forge-telegram-ava, forge-inbox-webhook).
  16. Migrate context: inbox-context.jsonl -> capture-context.jsonl, ava-context.jsonl -> coordinator-context.jsonl.
  17. Justin deletes old bots via @BotFather: @jw_inbox_bot, @Ava_JForgeBot, @Manager_JForgeBot, @jw_updates_bot (irreversible, final-confirm step).
  18. Update eval harness whitelists in forge_eval_check_service_names.sh and forge_eval_check_persona_code.sh. After 4.2 ships, those whitelists are EMPTY.
  19. Update MEMORY.md telegram section + system-map fleet.md + CLAUDE.md system mental model.
  20. Smoke tests:
      • voice note to @forge_inbox_capture_bot lands in Notion Inbox
      • text question to @forge_lifeos_coordinator_bot returns coordinated answer
      • forge_notify.sh warning ... arrives at @forge_notify_outbound_bot
      • morning-brief pushes to @forge_notify_outbound_bot
      • quota tracker records calls in forge/data/claude-quota/2026-04.jsonl

Wave 4.2.E: post-soak
  21. After one week clean, close LESSONS.md persona-name violation.
  22. Tighten eval check severity: naming-taxonomy-services from warning to error; no-persona-names-in-code allowlist removes Ava/Manager.
  23. Trigger Phase 4.7 sub-handoff for system-wide quota instrumentation.

[Claude Code, Pure Phoenix Phase 4.2 design pass; sign-off 2026-04-28T22:55]