
title: Pure Phoenix Phase 4.9, Troubleshoot-to-Guardrail Pipeline
date: 2026-04-30
status: design, awaiting sign-off
owner: Justin
phase: 4.9 (addendum to Pure Phoenix plan at ~/.claude/plans/yes-lets-go-into-pure-phoenix.md)


Pure Phoenix Phase 4.9: Troubleshoot-to-Guardrail Pipeline

URL: https://mkdocs.justinsforge.com/memory/handoffs/pure-phoenix-phase-4-9-troubleshoot-to-guardrail-2026-04-30/

Why this exists

Forge already has reactive learning surfaces (LESSONS.md, feedback_*.md topic files, the eval harness, auto-dream). What it lacks is the loop from "issue surfaced in a live session" to "guardrail merged". Today, in-session bugs get fixed, sometimes a memory gets written, occasionally a hook gets added, but there is no contract that says "when X breaks twice, a guard ships." Lessons accumulate, prevention does not.

Phase 4.9 closes that loop with three small additions: a structured incident schema in LESSONS.md, a tiny script to write entries, and a weekly auto-dream nag that surfaces recurring lessons without guards.

Design principle: seen-twice rule

A single bug does not justify a guardrail. Premature systematization creates abstraction sludge that is harder to evolve than the bug it prevents. The bar:

| Occurrences | Action |
| --- | --- |
| 1 | Log the lesson. No guard. |
| 2+ (or 1 with high blast radius: data loss, security, silent corruption) | Propose a concrete guard with file path, ship within one session. |
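The rule in the table can be sketched as a single predicate. This is a minimal illustration, not part of the proposed scripts: the function name `guard_required` and the string values for `blast_radius` are assumptions chosen to mirror the table.

```python
# Hypothetical helper; the name and the string values are illustrative only.
def guard_required(seen_count: int, blast_radius: str = "low") -> bool:
    """Apply the seen-twice rule: guard on recurrence or on high blast radius."""
    if seen_count < 1:
        raise ValueError("seen_count starts at 1 for a logged incident")
    # Two or more occurrences always qualify; one occurrence qualifies
    # only when the blast radius is high.
    return seen_count >= 2 or blast_radius == "high"


if __name__ == "__main__":
    for count, blast in [(1, "low"), (1, "high"), (2, "medium")]:
        print(count, blast, guard_required(count, blast))
```

Keeping the rule in one predicate means the script, the nag, and the eval check could all share the same threshold instead of each restating it.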

"High blast radius" is judgment-call territory. The doctrine clause below names the four trigger categories so it is not arbitrary.

The incident loop

When something breaks mid-session, the bot runs four steps before moving on:

  1. Root-cause one line. Format: "cause was X because Y." Not "I fixed it." If the bot cannot state the cause cleanly, it has not understood the bug yet.
  2. Log the lesson. Call forge_incident_log.py (new, see schema below). Writes to LESSONS.md and increments seen_count if a matching incident_id already exists.
  3. Decide guard tier. seen_count == 1 and not high-blast → stop here. Otherwise propose the guard inline (hook, eval check, doctrine clause, boundary sanitizer) with the exact file path. Ship it the same session.
  4. Link forward. Lesson entry includes a guard: field referencing the file/line of the guard. The guard's comment references the incident_id. Future audits trace the chain in either direction.
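Step 4 implies a two-way link that an audit could verify mechanically. The sketch below is one possible shape, under stated assumptions: guards are Python files that mention the incident_id in a `#` comment, and the lesson's guard field uses the `path:line` form. Both helper names are hypothetical.

```python
import re


def parse_guard_ref(guard_field: str):
    """Split 'path/to/guard.py:42' into (path, line); line is None when absent."""
    text = guard_field.strip()
    match = re.fullmatch(r"(?P<path>.+?):(?P<line>\d+)", text)
    if match:
        return match.group("path"), int(match.group("line"))
    # Values such as 'doctrine:Section-X' or a bare hook path carry no line number.
    return text, None


def guard_mentions_incident(guard_source: str, incident_id: str) -> bool:
    """True when a '#' comment in the guard source names the incident_id.

    Assumption: everything after the first '#' on a line is treated as a
    comment, which is approximate for lines containing '#' inside strings.
    """
    for line in guard_source.splitlines():
        _, sep, comment = line.partition("#")
        if sep and incident_id in comment:
            return True
    return False
```

An audit would read the guard field from the lesson, open the referenced path, and confirm the comment points back at the same incident_id.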

LESSONS.md schema extension

Current entries are free-form prose with **Doctrine:** / **Decision:** / **Owner:** headings. Phase 4.9 adds a structured frontmatter block per incident, parseable by the eval harness:

```markdown
## YYYY-MM-DDTHH:MM [incident_id] one-line title

- **doctrine:** Section X (rule name) | n/a
- **eval_check:** check-name | none
- **incident_id:** kebab-case-stable-key
- **seen_count:** N
- **first_seen:** YYYY-MM-DD
- **last_seen:** YYYY-MM-DD
- **blast_radius:** low | medium | high (data-loss, security, silent-corruption, customer-visible)
- **guard:** path/to/guard.py:LN | path/to/hook | doctrine:Section-X | none (single-occurrence)
- **guard_status:** shipped | proposed | not-needed | overdue

### Root cause
One line.

### Fix
What changed in code (file paths, what flipped).

### Recurrence prevention
If guard_status == shipped, what stops this from happening again. If guard_status == proposed or overdue, the proposed mechanism and target date.
```

Existing entries stay as-is; new entries follow the schema. The eval harness gets a new check lessons-md-schema-conformance that warns (does not block) when new entries omit fields.
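A conformance check along these lines could parse the bullet fields and report which ones are missing. This is a sketch only: the names `parse_fields` and `missing_fields`, and the choice to treat every schema field as expected, are assumptions rather than the harness's actual design. The sample entry is made up for illustration.

```python
import re

# Field names taken from the schema block; order is kept for stable reporting.
SCHEMA_FIELDS = (
    "doctrine", "eval_check", "incident_id", "seen_count", "first_seen",
    "last_seen", "blast_radius", "guard", "guard_status",
)

# Matches lines of the form "- **key:** value".
FIELD_RE = re.compile(r"^- \*\*(?P<key>[a-z_]+):\*\*\s*(?P<value>.*)$")


def parse_fields(entry_text: str) -> dict:
    """Collect the '- **key:** value' bullets of one lesson entry."""
    fields = {}
    for line in entry_text.splitlines():
        match = FIELD_RE.match(line.strip())
        if match:
            fields[match.group("key")] = match.group("value").strip()
    return fields


def missing_fields(entry_text: str) -> list:
    """Return schema fields that are absent or empty, in schema order."""
    fields = parse_fields(entry_text)
    return [name for name in SCHEMA_FIELDS if not fields.get(name)]


# Illustrative entry with several fields deliberately left out.
SAMPLE = """## 2026-04-30T10:15 [demo-incident] demo title

- **doctrine:** n/a
- **incident_id:** demo-incident
- **seen_count:** 1
- **blast_radius:** medium
"""

if __name__ == "__main__":
    print(missing_fields(SAMPLE))
```

Because the check only warns, the returned list would feed a warning message rather than a non-zero exit code.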

New script: forge_incident_log.py

Path: forge/scripts/forge_incident_log.py. Single-purpose: append or upsert a lesson entry.

```shell
forge_incident_log.py \
  --id "telegram-bot-subprocess-thread-race" \
  --title "Errno 8 from threading.Thread + subprocess.run race" \
  --doctrine "n/a" \
  --blast medium \
  --root-cause "threading.Thread + requests.post raced with main-thread subprocess.run, fork() in subprocess saw inconsistent fd state" \
  --fix "scripts/forge_telegram_*.py: replaced Thread heartbeat with multiprocessing.Process" \
  --guard "scripts/forge_text_sanitize.py" \
  --guard-status shipped
```

Behavior:

  • If incident_id already exists in LESSONS.md, increment seen_count, update last_seen, and append a sub-bullet under "Recurrence prevention" with the new occurrence date. Do not duplicate the entry.
  • If new, write a fresh block following the schema.
  • All flags are optional except --id and --title; missing fields render as tbd.
  • Emits the entry id to stdout so the caller can reference it in commit messages.
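The upsert branch might look like the following sketch. It covers only the counter and date update for an existing entry. Appending the recurrence sub-bullet and writing a fresh block are left out. The function name `bump_existing` and the assumption that each entry starts at a `## ` heading are illustrative, not confirmed design.

```python
import re


def bump_existing(lessons_text: str, incident_id: str, today: str):
    """Increment seen_count and refresh last_seen for one incident.

    Returns the updated text, or None when no entry carries that id
    (the caller would then write a fresh block instead).
    """
    # Split before each level-2 heading so every chunk holds one entry.
    chunks = re.split(r"(?m)^(?=## )", lessons_text)
    id_line = f"- **incident_id:** {incident_id}"
    for index, chunk in enumerate(chunks):
        if id_line not in [line.strip() for line in chunk.splitlines()]:
            continue
        # Bump the counter in place; only the first match inside the entry.
        chunk = re.sub(
            r"(?m)^(- \*\*seen_count:\*\* )(\d+)[ \t]*$",
            lambda m: f"{m.group(1)}{int(m.group(2)) + 1}",
            chunk,
            count=1,
        )
        # Overwrite last_seen with the new occurrence date.
        chunk = re.sub(
            r"(?m)^(- \*\*last_seen:\*\* ).*$",
            lambda m: f"{m.group(1)}{today}",
            chunk,
            count=1,
        )
        chunks[index] = chunk
        return "".join(chunks)
    return None
```

Matching on the exact `incident_id` line rather than on the heading keeps the lookup stable even if a title is later reworded.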

Auto-dream weekly nag

Auto-dream (Phase 4.4 nightly consolidation) gets a new pass: scan LESSONS.md for entries where seen_count >= 2 and guard_status in (proposed, none). Once per week (Sunday consolidation), compile a list and route it to coordinator chat as a single notify:

3 lessons recurring without guards: [incident-id-1] (seen 4x, last 2026-04-27), [incident-id-2] (seen 2x, last 2026-04-29), [incident-id-3] (seen 3x, last 2026-04-30). Want me to draft guards?

The bot does not auto-build guards. It surfaces the backlog so Justin (or a worker session) can act. Silent automated guard-building violates the seen-twice judgment requirement: the human keeps the call on whether the recurrence justifies prevention infrastructure.

Implementation: extend forge/scripts/forge_auto_dream.py (or wherever weekly consolidation lives, TBD by Phase 4.4 owner) with a lessons_recurrence_scan() function. Output channels through /notify warning.
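As a rough sketch of what lessons_recurrence_scan() could do, the function below filters already-parsed entries and builds the single notification string. The input shape (a list of dicts keyed by schema field names) is an assumption, and the status tuple is copied from the scan rule as written above.

```python
# Statuses the scan treats as "no guard yet", copied from the rule in the text.
UNGUARDED_STATUSES = ("proposed", "none")


def lessons_recurrence_scan(entries):
    """Return the weekly nag text, or None when nothing needs surfacing.

    `entries` is assumed to be a list of dicts with the schema's field
    names as keys; parsing LESSONS.md into that shape is out of scope here.
    """
    flagged = [
        entry for entry in entries
        if int(entry["seen_count"]) >= 2
        and entry["guard_status"] in UNGUARDED_STATUSES
    ]
    if not flagged:
        return None
    parts = [
        f"[{e['incident_id']}] (seen {e['seen_count']}x, last {e['last_seen']})"
        for e in flagged
    ]
    return (
        f"{len(flagged)} lessons recurring without guards: "
        f"{', '.join(parts)}. Want me to draft guards?"
    )
```

Returning None on an empty backlog lets the Sunday pass skip the notify call entirely instead of sending a "0 lessons" message.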

Doctrine amendment: Section 10 addendum

Section 10 currently covers self-iteration via eval harness and LESSONS.md. Phase 4.9 adds a sub-clause:

10.4 Troubleshoot-to-Guardrail Loop. When an issue surfaces in a live session, the four-step loop runs before moving on: state the root cause in one line, log via forge_incident_log.py, decide guard tier per the seen-twice rule, link forward. High-blast-radius categories trigger a guard on first occurrence: data loss, security, silent corruption, customer-visible regression. Single low-blast occurrences log only. The eval harness check lessons-md-schema-conformance warns on schema drift but does not block commits.

The sub-clause goes into FORGE-DOCTRINE.md Section 10 alongside existing self-iteration protocol. This is the contract; without it, the loop becomes optional and decays.

Eval harness check: orphan-lessons

New check, lands in Phase 4.9 alongside the script:

  • name: lessons-orphan-recurrence
  • rule: No entry in LESSONS.md with seen_count >= 3 and guard_status in (proposed, none) older than 14 days.
  • severity: warning (initial), tightens to error after a clean week per the same policy as no-em-dashes.
  • rationale: Three recurrences over two weeks is a clear signal that a guard is overdue, regardless of blast radius. This is the eval-harness teeth behind the auto-dream nag.
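The rule could be expressed as a small filter over parsed entries. This is a sketch under assumptions: entries arrive as dicts keyed by schema field names, and "older than 14 days" is measured from first_seen, which the rule does not pin down. The thresholds are parameters so they can follow whatever values are finally locked.

```python
from datetime import date, timedelta


def orphan_recurrences(entries, today, min_seen=3, max_age_days=14):
    """Return incident ids that recur without a guard past the age limit.

    Assumption: age is counted from `first_seen` (ISO date string).
    """
    cutoff = today - timedelta(days=max_age_days)
    return [
        entry["incident_id"]
        for entry in entries
        if int(entry["seen_count"]) >= min_seen
        and entry["guard_status"] in ("proposed", "none")
        and date.fromisoformat(entry["first_seen"]) < cutoff
    ]
```

At warning severity, a non-empty result would be reported without failing the run; tightening to error would only change how the caller treats that list.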

Files this phase touches

| File | Change |
| --- | --- |
| forge/FORGE-DOCTRINE.md | Add Section 10.4 |
| forge/LESSONS.md | Schema applies to new entries only; existing entries grandfathered |
| forge/scripts/forge_auto_memory.py | Add log_incident() function (single owner of LESSONS.md writes) |
| forge/scripts/forge_incident_log | New thin CLI shim that calls forge_auto_memory.log_incident() |
| forge/scripts/forge_auto_dream.py | Add lessons_recurrence_scan() |
| forge/scripts/forge_eval_check_lessons_orphan_recurrence.py | New |
| forge/scripts/forge_eval_check_lessons_md_schema.py | New |
| forge/eval.json | Register two new checks |
| forge/MEMORY.md index | Add [Incident loop](memory/general/reference_incident_loop.md) entry |
| forge/memory/general/reference_incident_loop.md | New, topic file documenting the script + schema for future sessions |

Sequencing and gates

Phase 4.9 lands after Phase 4.5 (eval harness, already shipped) and Phase 4.4 (auto-memory + auto-dream, in progress). It does not depend on the bot redesign (Phase 4.2) or Drive redesign (Phase 4.6); those are orthogonal.

Sign-off gates:

  1. Justin approves the schema and seen-twice rule (this doc).
  2. forge_incident_log.py ships, three real lessons rewritten in the new schema as a smoke test.
  3. Section 10.4 added to FORGE-DOCTRINE.md.
  4. Eval checks register; first nightly run shows them passing or producing a sane backlog.
  5. Auto-dream weekly nag fires once successfully (likely first Sunday after ship).

What I am NOT proposing

  • Auto-building guards from lessons. Premature codification risk; humans keep the judgment call.
  • Migrating the existing 793 lines of LESSONS.md to the new schema. Grandfather them; only new entries follow.
  • A separate "incidents" database. LESSONS.md is the single store; the schema is structured prose, not a sqlite table. Easier to read, grep, and recover.
  • Replacing feedback_*.md topic files. Those capture user-preference corrections (style, voice, workflow). Lessons capture system-failure recurrences. Different surfaces, different cadence.

Risks

| Risk | Mitigation |
| --- | --- |
| Schema overhead discourages logging | Keep all fields except id + title optional; the script fills tbd for the rest. |
| Lessons backlog grows faster than guards | Auto-dream nag + orphan-recurrence eval check; the human gets surfaced backlog, not silent rot. |
| Bot logs every minor hiccup as an incident | Doctrine wording: "issue surfaced" means user-visible failure, exception, wrong output, or regression. Not "I tried two approaches and the first did not compile." |
| seen_count upserts go wrong | Script writes to a tempfile + atomic rename. Pre-commit eval check validates LESSONS.md parses cleanly. |
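The tempfile-plus-atomic-rename mitigation can be sketched with the standard library alone. The helper name `atomic_write` and the temp-file prefix are assumptions; the pattern itself relies on `os.replace`, which swaps the file in a single step when source and target sit on the same filesystem.

```python
import os
import tempfile


def atomic_write(path: str, text: str) -> None:
    """Write text to path so readers see either the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".lessons-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())  # push bytes to disk before the swap
        os.replace(tmp_path, path)
    except BaseException:
        # Leave the original file untouched and clean up the partial temp file.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

A failed write then leaves LESSONS.md exactly as it was, so a botched upsert would cost one lost update, not a corrupted log.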

Decisions locked (2026-04-30)

Justin delegated the backend judgment calls; defaults below are final unless the implementing worker hits a blocker.

  1. High-blast "customer-visible regression" narrowed to "customer-visible regression with no quick rollback." Broad version was too inclusive.
  2. Incident logging folded into forge_auto_memory.py as a log_incident() sub-module plus a thin forge_incident_log CLI shim. Single owner of LESSONS.md writes prevents two-writer races. No standalone forge_incident_log.py as originally drafted.
  3. Orphan-recurrence threshold: 2 occurrences over 7 days (was 3 over 14). Matches the seen-twice doctrine; longer windows let recurrences stew without action.
  4. Implementation order: auto-memory log_incident() extension → CLI shim → Section 10.4 doctrine clause → eval checks → auto-dream nag.

[Claude Code]