feat: session recovery toolset (orphan detector + /recover)

Reconstructs session logs from Claude Code transcripts when a session crashes or is closed before /save. Two entry points:

- /recover <uuid|latest> : manual, Claude-reviewed reconstruction
- detect_orphaned_sessions.py : scheduled scan that auto-builds logs for substantive, unsaved, not-yet-recovered transcripts (banner-marked RECOVERED-UNVERIFIED), commits them, and posts a #bot-alerts FYI.

recover_session.py is the shared engine: Python extracts the verbatim command/config/reference timeline; Ollama drafts prose-only narrative. Machine-local ledger (.claude/state/) prevents reprocessing.

Reviewed: git add scoped to own files, ledger written only after successful push, per-uuid idempotency, --max cap for unattended runs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 18:33:07 -07:00
parent e8144a862e
commit eed3ece2c7
9 changed files with 1897 additions and 0 deletions
--- a/.claude/CLAUDE.md
+++ b/.claude/CLAUDE.md
@@ -274,6 +274,7 @@ Vault structure: `infrastructure/`, `clients/`, `services/`, `projects/`, `msp-t
 | `/shape-spec` | Pre-implementation spec for a GuruRMM feature — produces plan.md, shape.md, references.md, standards.md |
 | `/rmm-audit` | Full end-to-end audit of GuruRMM: API coverage, UI gaps, Rust/TS quality, security, data integrity. Produces timestamped report + updates UI_GAPS.md |
 | `/forum-post` | Post a technical article to community.azcomputerguru.com — drafts from context, shows preview, inserts via paramiko SSH to Flarum DB |
+| `/recover` | Reconstruct a session log from a Claude Code transcript after a crash/close-before-save. `/recover <uuid>`, `/recover latest`, or `/recover --list`. See `.claude/RECOVERY.md` |

 ---

--- a/.claude/RECOVERY.md
+++ b/.claude/RECOVERY.md
@@ -0,0 +1,76 @@
+# Session Recovery
+
+Never lose work again when a Claude Code session crashes or is closed before `/save`.
+
+Claude Code writes every session live to a transcript JSONL. This toolset distills those transcripts back into normal session logs in the `.claude/commands/save.md` format.
+
+---
+
+## The three pieces
+
+| Piece | File | Role |
+|---|---|---|
+| Engine | `.claude/scripts/recover_session.py` | Parses one transcript, classifies it, and reconstructs a full session log. CLI: `--uuid` / `--latest` / `--path` with `--print` (default), `--auto`, or `--json`. |
+| Detector | `.claude/scripts/detect_orphaned_sessions.py` | Scans all idle transcripts, auto-recovers the orphans (substantive + unsaved), updates the ledger, commits + pushes, and posts an FYI to `#bot-alerts`. CLI: `--dry-run`, `--idle-min N` (default 90), `--max N` (default 25, oldest-first), `--no-commit`, `--no-alert`. |
+| Command | `.claude/commands/recover.md` | `/recover <uuid>` / `/recover latest` / `/recover --list` — the **manual, reviewed** path where Claude edits the draft before writing. |
+
+The scheduled-task registration script `.claude/scripts/register-orphan-detector.ps1` wires the detector into the Windows Task Scheduler (Windows only).
+
+---
+
+## Where things live
+
+- **Transcripts:** `~/.claude/projects/<slug>/<uuid>.jsonl`, where `<slug>` is the claudetools repo root with `/`, `\`, and `:` each replaced by `-`. On a `D:\claudetools` machine the slug is `D--claudetools`, so the transcripts are at `C:\Users\<you>\.claude\projects\D--claudetools\*.jsonl`. The slug is computed portably from `claudetools_root` in `.claude/identity.json`. Sibling `<uuid>/` dirs hold subagent transcripts and are ignored for the main narrative.
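+- **Ledger:** `.claude/state/recovered-sessions.json` (machine-local, gitignored). Records every processed uuid with its verdict (`recovered` / `skipped-saved` / `skipped-trivial` / `error`) so it is never re-scanned. An orphan is recorded as `recovered` only after its log has been committed and pushed; orphans deferred by `--max`, or whose commit/push failed, stay out of the ledger and are retried on the next run. Transcripts are per-machine, so the ledger is too.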
+
+---
+
+## How to run
+
+```bash
+# See candidate orphans without writing anything:
+py .claude/scripts/detect_orphaned_sessions.py --dry-run
+
+# Inspect one transcript's verdict as JSON (writes nothing):
+py .claude/scripts/recover_session.py --json --uuid <uuid>
+
+# Print a reconstructed log to stdout (writes nothing):
+py .claude/scripts/recover_session.py --uuid <uuid> --print
+
+# Full unattended run (writes logs, updates ledger, commits, pushes, alerts):
+py .claude/scripts/detect_orphaned_sessions.py
+```
+
+### Register the scheduled task (Windows)
+
+```powershell
+powershell -ExecutionPolicy Bypass -File D:\claudetools\.claude\scripts\register-orphan-detector.ps1
+```
+
+Registers `ClaudeTools - Orphaned Session Detector`: runs at logon and every 4 hours. The 4-hour cadence pairs with the detector's 90-minute idle gate so an active session is never grabbed mid-flight.
+
+---
+
+## Accuracy split: Ollama prose vs Python verbatim
+
+This is the core design principle.
+
+- **Ollama drafts prose only** — Session Summary, Key Decisions, Problems Encountered, Pending / Incomplete Tasks. It never sees and never emits commands, IPs, credentials, file paths, commit SHAs, or ticket IDs. If Ollama is unreachable the log is still produced with a placeholder note in the prose sections.
+- **Python extracts the verbatim evidence** — Configuration Changes (Write/Edit/NotebookEdit targets), Commands & Outputs (mutating Bash/PowerShell with truncated results), Reference Information (regex-extracted SHAs, URLs, IPs, ticket numbers, coord message ids), and Infrastructure & Servers. This is the high-value, accuracy-critical part and it comes straight from the transcript.
+
+Trust the verbatim sections for facts; treat the prose as a draft.
+
+---
+
+## Classification
+
+- **substantive** — the session did real work: a Write/Edit/NotebookEdit, a mutating Bash/PowerShell command (git commit/push/add, ssh, schtasks, New-Item, Set-Content, Remove-Item, Out-File, a POST/PUT/DELETE/PATCH curl, an `/api/` call, `vault.sh`, a mutating Invoke-RestMethod), or a mutating Skill (syncro, rmm, remediation-tool, mailbox, forum-post, syncro-emergency-billing).
+- **saved** — the session was already saved: a save/scc/checkpoint Skill, or a Write/Edit into a `session-logs/` path.
+- **orphan** = substantive AND not saved. Only orphans are auto-recovered.
+- **scope** — client / project / general, decided by Python from the transcript text, `cwd`, and `gitBranch` against the known client and project slugs. Conservative: ambiguous resolves to `general`.
+
+---
+
+## Banner discipline
+
+Auto-recovered logs are written with a `[RECOVERED -- UNVERIFIED]` banner. **The banner stays until a human reviews the log** and removes it. The manual `/recover` path lets Claude review and correct the draft before writing, and drops the banner once verified.
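
Note on the slug rule in "Where things live": the sketch below restates it as code so the `D--claudetools` example can be checked by hand. It is an illustration only. The function names `transcript_slug` and `transcript_dir` are made up here, and the real computation in `recover_session.py` (not shown in this diff) may handle more cases.

```python
import re
from pathlib import Path


def transcript_slug(repo_root: str) -> str:
    """Replace each '/', '\\' and ':' in the repo root with '-'.

    Hypothetical helper: it mirrors the rule stated in RECOVERY.md,
    not the engine's actual implementation.
    """
    return re.sub(r"[/\\:]", "-", repo_root)


def transcript_dir(repo_root: str) -> Path:
    """Directory that would hold <uuid>.jsonl transcripts for this repo root."""
    return Path.home() / ".claude" / "projects" / transcript_slug(repo_root)


if __name__ == "__main__":
    print(transcript_slug(r"D:\claudetools"))  # D--claudetools
    print(transcript_dir(r"D:\claudetools"))
```

The colon and the backslash in `D:\` each become one `-`, which is where the double dash in `D--claudetools` comes from.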
--- a/.claude/commands/recover.md
+++ b/.claude/commands/recover.md
@@ -0,0 +1,84 @@
+Reconstruct a session log from a Claude Code transcript when a session crashed or was closed before `/save`.
+
+Claude Code writes every session live to a transcript JSONL under `~/.claude/projects/<slug>/<uuid>.jsonl`. `/recover` distills one of those transcripts back into a normal session log in the `.claude/commands/save.md` format. This is the **manual, reviewed** path; the background detector (`detect_orphaned_sessions.py`) handles unattended auto-recovery.
+
+---
+
+## Usage
+
+| Invocation | Action |
+|---|---|
+| `/recover <uuid>` | Reconstruct the session with that transcript uuid |
+| `/recover latest` | Reconstruct the newest transcript by mtime |
+| `/recover --list` | Show candidate orphans (runs the detector `--dry-run`) |
+
+---
+
+## Flow: `/recover --list`
+
+Run the detector in scan-only mode and present the table to the user:
+
+```bash
+py .claude/scripts/detect_orphaned_sessions.py --dry-run
+```
+
+The table shows every past-idle, not-yet-processed transcript with its uuid, mtime, `substantive`/`saved`/`orphan` verdicts, classified scope, and the path a recovery would write to. Point the user at the rows where `orphan` is `YES` — those are unsaved substantive sessions. Nothing is written.
+
+---
+
+## Flow: `/recover <uuid>` or `/recover latest`
+
+This is a **reviewed** recovery. Claude is the editor, not a passive writer.
+
+1. **Generate the draft** (prints to stdout, writes nothing):
+
+   ```bash
+   py .claude/scripts/recover_session.py --uuid <uuid> --print
+   ```
+
+   (or `--latest`). The draft contains:
+   - Ollama-drafted prose: Session Summary, Key Decisions, Problems Encountered, Pending / Incomplete Tasks.
+   - Python-extracted verbatim evidence: Configuration Changes, Commands & Outputs, Reference Information, Infrastructure & Servers, Credentials & Secrets.
+   - A `[RECOVERED -- UNVERIFIED]` banner and the canonical User block (from `whoami-block.sh`).
+
+2. **Review the draft.** This is the point of the manual path:
+   - Verify the **Commands / Config / Reference** appendix matches what actually happened and what the user intended. These are machine-extracted verbatim — confirm they are complete and not misleading.
+   - Correct the **scope and slug**: the classifier is conservative and may land on `general` (or the wrong project/client) when work spanned several areas. Fix the target `session-logs/` directory accordingly.
+   - Tighten the **topic** in the filename and the title.
+   - Correct or rewrite the **Ollama prose** where it is imprecise. If Ollama was unreachable, write the prose sections yourself from the verbatim evidence.
+
+3. **Write the final log.** Once verified, write the corrected markdown to the correct `session-logs/` path (client -> `clients/<slug>/session-logs/`, project -> `projects/<project>/session-logs/`, general -> root `session-logs/`), using the transcript's first-timestamp date: `YYYY-MM-DD-recovered-<topic>.md`. **Drop the UNVERIFIED banner** — by writing it yourself you have verified it.
+
+4. **Sync:**
+
+   ```bash
+   bash .claude/scripts/sync.sh
+   ```
+
+5. **Unseeded wiki check.** If the scope is a client or project with no `wiki/<type>/<slug>.md` article yet, suggest:
+
+   ```
+   [INFO] No wiki article for '<slug>' yet. Run /wiki-compile <type>:<slug> to seed it.
+   ```
+
+---
+
+## Difference from the automatic detector
+
+| | `/recover` (this command) | `detect_orphaned_sessions.py` (background) |
+|---|---|---|
+| Trigger | Manual, on demand | Scheduled task (every 4 hours + at logon) |
+| Review | Claude reviews and corrects before writing | None — auto-writes unreviewed |
+| Banner | Removed once verified | Kept (`[RECOVERED -- UNVERIFIED]`) until a human reviews |
+| Scope/topic | Corrected by Claude | Whatever the classifier decided |
+| Output | Final, clean session log | Banner-marked draft committed for later review |
+
+Use `/recover` when you know a specific session was lost and want a clean log. Let the detector catch the ones you forget.
+
+---
+
+## Notes
+
+- `--auto` and `--json` modes on `recover_session.py` exist for the detector and for scripting; `/recover` uses `--print` so Claude always reviews before anything lands on disk.
+- The prose is Ollama-drafted from the transcript; the Commands/Config/Reference sections are extracted verbatim by Python. Never trust the prose for exact commands, IPs, credentials, paths, SHAs, or ticket IDs — read those from the verbatim sections.
+- Transcripts are per-machine. You can only recover sessions that ran on the machine you are on.
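
Note on step 3 of the `/recover` flow: the scope-to-directory routing and the `YYYY-MM-DD-recovered-<topic>.md` filename can be sketched as below. This is a minimal illustration under assumptions: `session_log_path` is a hypothetical name, the scope dict shape (`type`, `slug`) is borrowed from `_scope_str` in the detector, and the engine's real `compute_output_path` (which also handles filename collisions) is not shown in this diff.

```python
from pathlib import Path


def session_log_path(root: Path, scope: dict, date: str, topic: str) -> Path:
    """Map a classified scope to the session-logs/ path described in recover.md.

    Hypothetical helper for illustration; collision handling is omitted.
    """
    kind = scope.get("type", "general")
    if kind == "client":
        base = root / "clients" / scope["slug"] / "session-logs"
    elif kind == "project":
        base = root / "projects" / scope["slug"] / "session-logs"
    else:
        # Ambiguous or general work lands in the root session-logs/ directory.
        base = root / "session-logs"
    return base / f"{date}-recovered-{topic}.md"


if __name__ == "__main__":
    repo = Path("repo")
    print(session_log_path(repo, {"type": "client", "slug": "acme"}, "2026-06-01", "vpn-fix").as_posix())
    print(session_log_path(repo, {"type": "general"}, "2026-06-01", "misc").as_posix())
```

The `acme`, `vpn-fix`, and `misc` values are invented sample inputs.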
--- a/.claude/memory/MEMORY.md
+++ b/.claude/memory/MEMORY.md
@@ -52,6 +52,7 @@
 - [Add Mike as owner on all Entra apps](feedback_entra_app_owner.md) — Apps created via management SP have no user owner — must add Mike manually or publisher verification fails.
 - [No TOML/config file approach for endpoints](feedback_no_toml_config_endpoints.md) — User explicitly prohibits TOML or config-file-based endpoint configuration — this will never be approved.
 - [Python on Windows — use py launcher](feedback_python_windows.md) — Windows Store python/python3 aliases disabled; always use py or jq on DESKTOP-0O8A1RL.
+- [Unsaved sessions are recoverable from transcripts](feedback_session_recovery.md) — Crashed/closed-before-save sessions live in `~/.claude/projects/<slug>/*.jsonl`; the detector auto-recovers orphans, `/recover <uuid>` does it manually. Ollama prose + Python verbatim. See `.claude/RECOVERY.md`.

 ### Syncro
 - [Syncro API plumbing](feedback_syncro_api.md) — Content-Type required on all POST/PUT; NO idempotency anywhere — always GET before retrying; response wrappers (`.ticket.id`, `.comment.id`); add_line_item shape (internal ID, flat response, required fields); HTML uses `<br>` not `<ul>/<li>`; timer_entry response is FLAT but SUPERSEDED (use add_line_item).
--- a/.claude/memory/feedback_session_recovery.md
+++ b/.claude/memory/feedback_session_recovery.md
@@ -0,0 +1,19 @@
+---
+name: Unsaved sessions are recoverable from transcripts
+description: Claude Code transcripts let you rebuild a session log after a crash/close-before-save; a detector auto-recovers orphans and /recover does it manually
+type: feedback
+---
+
+Claude Code writes every session live to a transcript JSONL at `~/.claude/projects/<slug>/<uuid>.jsonl` (slug = the claudetools repo root with `/`, `\`, and `:` each replaced by `-`; computed from `claudetools_root` in identity.json). A session closed or crashed before `/save` is NOT lost — the work is fully recorded in that transcript and can be distilled back into a normal session log.
+
+Toolset (`.claude/RECOVERY.md`):
+- `.claude/scripts/recover_session.py` — engine. `--uuid`/`--latest`/`--path` with `--print`/`--auto`/`--json`.
+- `.claude/scripts/detect_orphaned_sessions.py` — scans idle transcripts, auto-recovers orphans (substantive AND not saved), commits + pushes, FYIs `#bot-alerts`. `--dry-run` to scan only. Ledger at `.claude/state/recovered-sessions.json` (machine-local).
+- `/recover <uuid>` — manual reviewed path; Claude corrects the draft before writing.
+- `.claude/scripts/register-orphan-detector.ps1` — registers the scheduled task (Windows).
+
+Accuracy split: Ollama drafts ONLY prose (summary/decisions/problems/pending); Python extracts commands, file paths, IPs, SHAs, tickets verbatim. Auto-recovered logs carry a `[RECOVERED -- UNVERIFIED]` banner until a human reviews them.
+
+**Why:** Mike wanted to never lose work to a crashed/unclosed session again. Manual `/save` is the only thing that wrote logs before; the transcript is a complete fallback record.
+
+**How to apply:** If a user says a session crashed or work was lost, run `py .claude/scripts/detect_orphaned_sessions.py --dry-run` to find candidate orphans, then `/recover <uuid>` to reconstruct and review a clean log. Don't assume work is gone — check the transcripts first.
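
Note on "orphans (substantive AND not saved)": the two verdict flags give three outcomes, which the sketch below spells out. The function name `triage` is hypothetical; the branch order follows the detector's main loop in this commit (orphan first, then saved, then trivial).

```python
def triage(substantive: bool, saved: bool) -> str:
    """Return what the detector would do with one eligible transcript.

    Illustrative only: "recover" means a log is built; the ledger verdict
    `recovered` is written later, after a successful commit and push.
    """
    if substantive and not saved:
        return "recover"          # orphan: real work, never saved
    if saved:
        return "skipped-saved"    # already saved, nothing to rebuild
    return "skipped-trivial"      # no mutating work, nothing worth saving


if __name__ == "__main__":
    for substantive in (True, False):
        for saved in (True, False):
            print(substantive, saved, triage(substantive, saved))
```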
--- a/.claude/scripts/detect_orphaned_sessions.py
+++ b/.claude/scripts/detect_orphaned_sessions.py
@@ -0,0 +1,431 @@
+#!/usr/bin/env python3
+"""detect_orphaned_sessions.py -- find and auto-recover unsaved Claude Code sessions.
+
+A session is "orphaned" when its transcript records substantive (mutating) work
+but the session was never saved (no /save, /scc, or /checkpoint, and no write into
+a session-logs/ path). This script scans the per-machine transcript directory,
+classifies each idle transcript via the recover_session engine, auto-builds a
+banner-marked recovery log for each orphan, records every processed uuid in a
+machine-local ledger so it is never re-scanned, commits + pushes the recovered
+logs, and posts an FYI to #bot-alerts.
+
+Modes:
+  (default)        full run: build up to --max N logs (default 25, oldest-first), update ledger, commit, push, alert
+  --dry-run        scan + print a report table; write/commit/alert nothing
+  --idle-min N     minutes of mtime-idle before a transcript is eligible (default 90)
+  --no-commit      build logs, but skip git commit/push (recovered uuids stay unledgered and are retried)
+  --no-alert       build + ledger + commit, but skip the Discord alert
+
+The detector NEVER touches sync.sh; it does its own git add/commit/push so it has
+no surprising side effects. Soft-fails on git/alert errors (work is already saved
+to disk -- those are best-effort).
+
+stdlib only; targets Python 3.11+.
+"""
+
+from __future__ import annotations
+
+import argparse
+import json
+import shutil
+import subprocess
+import sys
+from datetime import datetime, timezone
+from pathlib import Path
+
+# Import the shared engine (same directory).
+sys.path.insert(0, str(Path(__file__).resolve().parent))
+import recover_session as engine  # noqa: E402
+
+
+LEDGER_REL = Path(".claude") / "state" / "recovered-sessions.json"
+
+
+def _now_iso() -> str:
+    return datetime.now(timezone.utc).isoformat()
+
+
+def ledger_path() -> Path:
+    return engine.repo_root() / LEDGER_REL
+
+
+def load_ledger() -> dict:
+    p = ledger_path()
+    if p.exists():
+        try:
+            return json.loads(p.read_text(encoding="utf-8"))
+        except (OSError, ValueError):
+            return {}
+    return {}
+
+
+def save_ledger(ledger: dict) -> None:
+    p = ledger_path()
+    p.parent.mkdir(parents=True, exist_ok=True)
+    p.write_text(json.dumps(ledger, indent=2, ensure_ascii=False) + "\n", encoding="utf-8")
+
+
+def _scope_str(scope: dict) -> str:
+    t = scope.get("type", "general")
+    if t == "general":
+        return "general"
+    return f"{t}:{scope.get('slug', '?')}"
+
+
+def scan(idle_min: int, ledger: dict) -> tuple[list[dict], list[dict]]:
+    """Scan transcripts.
+
+    Returns (eligible, recoverable):
+      eligible    -- every transcript that is past idle and not already in ledger
+                     (each a dict with parsed metadata + verdict fields)
+      recoverable -- the subset that are orphans (substantive and not saved)
+    """
+    base = engine.transcript_base_dir()
+    now = datetime.now().timestamp()
+    idle_secs = idle_min * 60
+
+    eligible: list[dict] = []
+    recoverable: list[dict] = []
+
+    if not base.is_dir():
+        return eligible, recoverable
+
+    for jf in sorted(base.glob("*.jsonl")):
+        uuid = jf.stem
+        try:
+            mtime = jf.stat().st_mtime
+        except OSError:
+            continue
+        # Skip recently-active sessions.
+        if (now - mtime) < idle_secs:
+            continue
+        # Skip anything already processed.
+        if uuid in ledger:
+            continue
+
+        parsed = engine.parse_transcript(jf)
+        verdict = engine.classify(parsed)
+        orphan = bool(verdict["substantive"] and not verdict["saved"])
+        rec = {
+            "uuid": uuid,
+            "path": jf,
+            "mtime": mtime,
+            "substantive": verdict["substantive"],
+            "saved": verdict["saved"],
+            "orphan": orphan,
+            "scope": verdict["scope"],
+            "title": verdict["title"],
+            "parsed": parsed,
+        }
+        # would-write path (metadata-cheap; no Ollama)
+        rec["would_write"] = str(
+            engine.compute_output_path(parsed, verdict["scope"], verdict["title"])
+        )
+        eligible.append(rec)
+        if orphan:
+            recoverable.append(rec)
+
+    # Process OLDEST-FIRST so a capped run drains the longest-waiting orphans
+    # first. Prefer the transcript's first_ts when available; fall back to mtime.
+    def _age_key(r: dict):
+        ts = (r.get("parsed").first_ts if r.get("parsed") else "") or ""
+        if ts:
+            try:
+                return datetime.fromisoformat(ts.replace("Z", "+00:00")).timestamp()
+            except ValueError:
+                pass
+        return r.get("mtime", 0.0)
+
+    eligible.sort(key=_age_key)
+    recoverable.sort(key=_age_key)
+
+    return eligible, recoverable
+
+
+def print_dry_run_table(eligible: list[dict]) -> None:
+    if not eligible:
+        print("[INFO] No eligible (past-idle, unprocessed) transcripts found.")
+        return
+    headers = ["uuid", "mtime", "subst", "saved", "orphan", "scope", "would-write-path"]
+    rows = []
+    for r in eligible:
+        mt = datetime.fromtimestamp(r["mtime"]).strftime("%Y-%m-%d %H:%M")
+        rows.append(
+            [
+                r["uuid"][:8],
+                mt,
+                "yes" if r["substantive"] else "no",
+                "yes" if r["saved"] else "no",
+                "YES" if r["orphan"] else "no",
+                _scope_str(r["scope"]),
+                r["would_write"],
+            ]
+        )
+    widths = [len(h) for h in headers]
+    for row in rows:
+        for i, cell in enumerate(row):
+            widths[i] = max(widths[i], len(str(cell)))
+    fmt = "  ".join("{:<" + str(w) + "}" for w in widths)
+    print(fmt.format(*headers))
+    print(fmt.format(*["-" * w for w in widths]))
+    for row in rows:
+        print(fmt.format(*[str(c) for c in row]))
+    n_orphan = sum(1 for r in eligible if r["orphan"])
+    print()
+    print(f"[INFO] {len(eligible)} eligible, {n_orphan} orphan(s) would be recovered.")
+
+
+def _existing_recovered_for_uuid(out_dir: Path, uuid: str) -> Path | None:
+    """Return a prior recovered log for THIS uuid in ``out_dir``, if one exists.
+
+    The tool's own collision filename embeds the 8-char uuid prefix as a trailing
+    ``-recovered-...-<short>.md`` suffix (see ``compute_output_path``). Matching on
+    that prefix lets a re-run overwrite its OWN prior draft for the same uuid in
+    place -- the one safe overwrite -- instead of minting a second suffixed copy.
+
+    Only files that are clearly recovered drafts (``-recovered-`` in the name AND
+    ending in ``-<short>.md``) are considered. A genuine non-recovered human log
+    will never match, so its suffix protection is preserved.
+    """
+    if not out_dir.is_dir():
+        return None
+    short = uuid[:8]
+    suffix = f"-{short}.md"
+    for f in out_dir.glob(f"*-recovered-*{suffix}"):
+        if f.is_file() and f.name.endswith(suffix):
+            return f
+    return None
+
+
+def recover_one(rec: dict) -> str:
+    """Build + write the recovery log for one orphan. Returns the written path.
+
+    Idempotent per-uuid: if a prior recovered draft for THIS uuid already exists
+    in the target directory (a run that died after writing but before the ledger
+    was updated), overwrite that same file in place rather than creating a new
+    suffixed copy. Never overwrites a non-recovered human log.
+    """
+    parsed = rec["parsed"]
+    markdown, meta = engine.build_log(parsed)
+    out_path = Path(meta["path_would_be"])
+    prior = _existing_recovered_for_uuid(out_path.parent, rec["uuid"])
+    if prior is not None:
+        out_path = prior
+    out_path.parent.mkdir(parents=True, exist_ok=True)
+    out_path.write_text(markdown, encoding="utf-8")
+    rec["written"] = str(out_path)
+    rec["date"] = meta["date"]
+    return str(out_path)
+
+
+def git(*args: str) -> subprocess.CompletedProcess:
+    return subprocess.run(
+        ["git", *args],
+        cwd=str(engine.repo_root()),
+        capture_output=True,
+        text=True,
+        timeout=120,
+    )
+
+
+def _current_branch() -> str:
+    """Return the current git branch name, or empty string if undeterminable."""
+    res = git("rev-parse", "--abbrev-ref", "HEAD")
+    if res.returncode == 0:
+        name = res.stdout.strip()
+        if name and name != "HEAD":
+            return name
+    return ""
+
+
+def commit_and_push(written_paths: list[str], count: int) -> bool:
+    """Stage only the recovered logs, commit, push. Soft-fail on errors.
+
+    NEVER stages the ledger -- it is machine-local and correctly gitignored;
+    appending it to ``git add`` aborts the whole add (exit 1) and stages nothing.
+
+    Returns True only when BOTH the commit AND the push succeed. On any failure
+    returns False so the caller knows not to mark these uuids ``recovered`` (the
+    next run must re-attempt them).
+    """
+    root = engine.repo_root()
+    rel_paths = []
+    for p in written_paths:
+        try:
+            rel_paths.append(str(Path(p).resolve().relative_to(root)))
+        except ValueError:
+            rel_paths.append(p)
+
+    add = git("add", "--", *rel_paths)
+    if add.returncode != 0:
+        print(f"[WARNING] git add failed; logs are on disk but uncommitted: {add.stderr.strip()}", file=sys.stderr)
+        return False
+
+    msg = (
+        f"chore: auto-recover {count} unsaved session log(s)\n\n"
+        f"{engine._COMMIT_FOOTER}"
+    )
+    commit = git("commit", "-m", msg)
+    if commit.returncode != 0:
+        # Nothing to commit, or hook failure -- soft-fail.
+        print(f"[WARNING] git commit returned non-zero: {commit.stdout.strip()} {commit.stderr.strip()}", file=sys.stderr)
+        return False
+    print(f"[OK] committed {count} recovered log(s).")
+
+    branch = _current_branch()
+    if branch:
+        push = git("push", "origin", branch)
+    else:
+        push = git("push")
+    if push.returncode != 0:
+        target = f"origin {branch}" if branch else "origin"
+        print(
+            f"[WARNING] git push to {target} failed (commit is local): {push.stderr.strip()}",
+            file=sys.stderr,
+        )
+        return False
+    print(f"[OK] pushed to origin{(' ' + branch) if branch else ''}.")
+    return True
+
+
+def post_alert(recovered: list[dict]) -> None:
+    """Post an FYI to #bot-alerts via post-bot-alert.sh. Soft-fail."""
+    script = engine.repo_root() / ".claude" / "scripts" / "post-bot-alert.sh"
+    if not script.exists():
+        print("[WARNING] post-bot-alert.sh not found; alert skipped.", file=sys.stderr)
+        return
+    bash = shutil.which("bash")
+    if not bash:
+        print(
+            "[WARNING] 'bash' not found on PATH (restricted scheduler env?); "
+            "#bot-alerts FYI skipped. Recovered logs are already on disk.",
+            file=sys.stderr,
+        )
+        return
+    lines = [
+        f"[INFO] Auto-recovered {len(recovered)} unsaved session log(s) -- "
+        f"already saved to the repo; FYI, please review and remove the UNVERIFIED banner:"
+    ]
+    for r in recovered:
+        lines.append(
+            f"- {r['uuid'][:8]} | {r.get('date', '?')} | {_scope_str(r['scope'])} | {r.get('written', '?')}"
+        )
+    message = "\n".join(lines)
+    try:
+        res = subprocess.run(
+            [bash, str(script), message, "bot"],
+            cwd=str(engine.repo_root()),
+            capture_output=True,
+            text=True,
+            timeout=30,
+        )
+        out = (res.stdout or "").strip() or (res.stderr or "").strip()
+        if out:
+            print(out)
+    except (OSError, subprocess.SubprocessError) as e:
+        print(f"[WARNING] alert post failed: {e}", file=sys.stderr)
+
+
+def main(argv: list[str] | None = None) -> int:
+    # Force UTF-8 stdout (Windows console defaults to cp1252; titles/paths in
+    # the dry-run table can contain characters outside that codepage).
+    try:
+        sys.stdout.reconfigure(encoding="utf-8", errors="replace")
+    except (AttributeError, ValueError):
+        pass
+
+    parser = argparse.ArgumentParser(
+        description="Detect and auto-recover unsaved Claude Code sessions."
+    )
+    parser.add_argument("--dry-run", action="store_true", help="scan + print report; no writes/commit/alert")
+    parser.add_argument("--idle-min", type=int, default=90, help="minutes of mtime-idle before eligible (default 90)")
+    parser.add_argument("--max", type=int, default=25, dest="max_recover", help="max orphan logs to build per run, oldest-first (default 25)")
+    parser.add_argument("--no-commit", action="store_true", help="skip git commit/push")
+    parser.add_argument("--no-alert", action="store_true", help="skip the Discord alert")
+    args = parser.parse_args(argv)
+
+    # Respect the ledger in both modes (dry-run still skips already-processed).
+    ledger = load_ledger()
+
+    eligible, recoverable = scan(args.idle_min, ledger)
+
+    if args.dry_run:
+        print_dry_run_table(eligible)
+        return 0
+
+    if not eligible:
+        print("[INFO] No eligible transcripts to process.")
+        return 0
+
+    written_paths: list[str] = []
+    recovered_recs: list[dict] = []
+    deferred = 0
+    built = 0
+
+    for rec in eligible:
+        uuid = rec["uuid"]
+        if rec["orphan"]:
+            # Cap actual log-builds per run (oldest-first). Remaining orphans are
+            # left OUT of the ledger so the next run re-attempts them.
+            if built >= args.max_recover:
+                deferred += 1
+                continue
+            try:
+                path = recover_one(rec)
+            except Exception as e:  # noqa: BLE001 -- never let one bad transcript abort the run
+                print(f"[WARNING] failed to recover {uuid[:8]}: {e}", file=sys.stderr)
+                # No on-disk artifact -> safe to mark immediately.
+                ledger[uuid] = {"verdict": "error", "at": _now_iso(), "path": None, "error": str(e)}
+                continue
+            built += 1
+            written_paths.append(path)
+            recovered_recs.append(rec)
+            print(f"[OK] recovered {uuid[:8]} -> {path}")
+        elif rec["saved"]:
+            # No on-disk artifact -> safe to mark immediately.
+            ledger[uuid] = {"verdict": "skipped-saved", "at": _now_iso(), "path": None}
+        else:
+            ledger[uuid] = {"verdict": "skipped-trivial", "at": _now_iso(), "path": None}
+
+    if deferred:
+        print(f"[INFO] {deferred} more orphan(s) deferred to next run (--max {args.max_recover}).")
+
+    # Persist the skipped/error verdicts now (they have no artifact, so they are
+    # safe regardless of the commit/push outcome below).
+    save_ledger(ledger)
+
+    if not recovered_recs:
+        print("[INFO] No orphans recovered (all eligible sessions were saved or trivial).")
+        return 0
+
+    if not args.no_commit:
+        pushed = commit_and_push(written_paths, len(recovered_recs))
+        if pushed:
+            # H1: only mark uuids 'recovered' AFTER a successful commit+push, so a
+            # push failure leaves them out of the ledger for the next run to retry.
+            for rec in recovered_recs:
+                ledger[rec["uuid"]] = {
+                    "verdict": "recovered",
+                    "at": _now_iso(),
+                    "path": rec.get("written"),
+                }
+            save_ledger(ledger)
+        else:
+            print(
+                "[WARNING] commit/push did not succeed; recovered uuids left UNLEDGERED "
+                "so the next run re-attempts them (logs are on disk).",
+                file=sys.stderr,
+            )
+    else:
+        print("[INFO] --no-commit set; recovered logs left unstaged and UNLEDGERED (next run will re-attempt).")
+
+    if not args.no_alert:
+        post_alert(recovered_recs)
+    else:
+        print("[INFO] --no-alert set; Discord alert skipped.")
+
+    return 0
+
+
+if __name__ == "__main__":
+    raise SystemExit(main())
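
Note on the per-uuid idempotency in `_existing_recovered_for_uuid`: the sketch below reproduces the matching rule against a throwaway directory, to show that only a `-recovered-` draft ending in the 8-character uuid prefix is picked up and that a human-written log with the same suffix is not. The function name `find_prior_draft`, the uuids, and the filenames are made up for the demonstration.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_prior_draft(out_dir: Path, uuid: str) -> Optional[Path]:
    """Return an earlier recovered draft for this uuid, or None."""
    suffix = f"-{uuid[:8]}.md"
    for candidate in sorted(out_dir.glob(f"*-recovered-*{suffix}")):
        if candidate.is_file():
            return candidate
    return None


def demo() -> tuple:
    uuid = "1a2b3c4d-aaaa-bbbb-cccc-000000000000"  # invented uuid
    with tempfile.TemporaryDirectory() as tmp:
        out_dir = Path(tmp)
        # A prior auto-recovered draft for this uuid.
        (out_dir / "2026-06-01-recovered-vpn-fix-1a2b3c4d.md").write_text("draft", encoding="utf-8")
        # A human log that happens to share the suffix but lacks "-recovered-".
        (out_dir / "2026-06-01-vpn-fix-1a2b3c4d.md").write_text("human log", encoding="utf-8")
        hit = find_prior_draft(out_dir, uuid)
        miss = find_prior_draft(out_dir, "ffffffff-0000-0000-0000-000000000000")
        return (hit.name if hit else None, miss.name if miss else None)


if __name__ == "__main__":
    print(demo())
```

A re-run that finds the draft can overwrite it in place; the human log is never a candidate because the glob requires `-recovered-` in the name.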
--- a/.claude/scripts/recover_session.py
+++ b/.claude/scripts/recover_session.py
--- a/.claude/scripts/register-orphan-detector.ps1
+++ b/.claude/scripts/register-orphan-detector.ps1
@@ -0,0 +1,95 @@
+# register-orphan-detector.ps1
+# Register the "ClaudeTools - Orphaned Session Detector" scheduled task on this
+# Windows machine. The task runs detect_orphaned_sessions.py, which scans the
+# per-machine Claude Code transcript directory for unsaved substantive sessions,
+# auto-builds banner-marked recovery logs, commits + pushes them, and posts an
+# FYI to #bot-alerts.
+#
+# Mirrors the GrepAI watcher registration pattern in .claude/OLLAMA.md.
+#
+# Triggers:
+#   - AtLogOn (catch sessions lost since the last logon)
+#   - Daily, repeating every 4 hours (catch crashes during a long workday;
+#     4h cadence pairs with the detector's 90-minute idle gate so an active
+#     session is never grabbed mid-flight)
+#
+# Idempotent: -Force replaces any existing task with the same name.
+# This script only REGISTERS the task. It does not run the detector now.
+#
+# Run from an ordinary (non-admin) PowerShell:
+#   powershell -ExecutionPolicy Bypass -File D:\claudetools\.claude\scripts\register-orphan-detector.ps1
+
+$ErrorActionPreference = "Stop"
+
+$TaskName    = "ClaudeTools - Orphaned Session Detector"
+
+# Resolve the repo root portably. Prefer claudetools_root from identity.json
+# (per-machine, gitignored); fall back to two levels up from this script
+# (.claude/scripts/ -> repo root), resolved to a full path.
+$ScriptDir   = $PSScriptRoot
+$FallbackRoot = (Resolve-Path (Join-Path $ScriptDir "..\..")).Path
+$IdentityPath = Join-Path $FallbackRoot ".claude\identity.json"
+$RepoRoot    = $FallbackRoot
+if (Test-Path $IdentityPath) {
+    try {
+        $identity = Get-Content -Raw -Path $IdentityPath | ConvertFrom-Json
+        if ($identity.claudetools_root -and (Test-Path $identity.claudetools_root)) {
+            $RepoRoot = (Resolve-Path $identity.claudetools_root).Path
+        }
+    } catch {
+        Write-Host "[WARNING] Could not parse $IdentityPath; using $FallbackRoot" -ForegroundColor Yellow
+    }
+}
+$Script      = Join-Path $RepoRoot ".claude\scripts\detect_orphaned_sessions.py"
+
+if (-not (Test-Path $Script)) {
+    Write-Host "[ERROR] Detector not found at $Script" -ForegroundColor Red
+    exit 1
+}
+
+# Resolve the py launcher's full path (the action's Execute wants an absolute
+# path; "py" alone usually resolves but we pin it for reliability under the
+# Task Scheduler's environment).
+$PyCmd = Get-Command py -ErrorAction SilentlyContinue
+if ($null -ne $PyCmd) {
+    $PyPath = $PyCmd.Source
+} else {
+    $PyPath = "py"  # fall back to PATH resolution at run time
+}
+
+$Action = New-ScheduledTaskAction `
+    -Execute $PyPath `
+    -Argument "`"$Script`"" `
+    -WorkingDirectory $RepoRoot
+
+# Trigger 1: at logon for the current user.
+$TriggerLogon = New-ScheduledTaskTrigger -AtLogOn -User $env:USERNAME
+
+# Trigger 2: daily at a fixed start, repeating every 4 hours all day.
+$TriggerDaily = New-ScheduledTaskTrigger -Daily -At 9am
+$TriggerDaily.Repetition = (New-ScheduledTaskTrigger `
+    -Once -At 9am `
+    -RepetitionInterval (New-TimeSpan -Hours 4) `
+    -RepetitionDuration (New-TimeSpan -Hours 24)).Repetition
+
+$Settings = New-ScheduledTaskSettingsSet `
+    -ExecutionTimeLimit (New-TimeSpan -Minutes 30) `
+    -MultipleInstances IgnoreNew `
+    -StartWhenAvailable `
+    -DontStopOnIdleEnd
+
+Register-ScheduledTask `
+    -TaskName $TaskName `
+    -Action $Action `
+    -Trigger $TriggerLogon, $TriggerDaily `
+    -Settings $Settings `
+    -Description "Scans Claude Code transcripts for unsaved substantive sessions and auto-recovers them into session logs." `
+    -Force | Out-Null
+
+Write-Host "[OK] Registered scheduled task '$TaskName'."
+Write-Host "[INFO] Action:   $PyPath `"$Script`""
+Write-Host "[INFO] WorkDir:  $RepoRoot"
+Write-Host "[INFO] Triggers: AtLogOn ($env:USERNAME) + daily every 4h"
+Write-Host "[INFO] To inspect:  Get-ScheduledTask -TaskName '$TaskName' | Format-List"
+Write-Host "[INFO] To run now:  Start-ScheduledTask -TaskName '$TaskName'"
+Write-Host "[INFO] To remove:   Unregister-ScheduledTask -TaskName '$TaskName' -Confirm:`$false"
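
Note on the trigger arithmetic: with a 9am start, a 4-hour repetition interval, and a 24-hour repetition duration, the nominal daily run times can be listed as below, along with a rough upper bound on how long an orphan waits. This is a back-of-the-envelope sketch, not a model of Task Scheduler: it ignores the logon trigger, `-StartWhenAvailable` catch-up runs, a powered-off machine, and `--max` deferrals. The function names are hypothetical.

```python
from datetime import datetime, timedelta

IDLE_GATE = timedelta(minutes=90)  # detector default (--idle-min 90)
INTERVAL = timedelta(hours=4)      # repetition interval from the script


def nominal_run_times(start: datetime, interval: timedelta, duration: timedelta) -> list:
    """List HH:MM start times for repetitions that begin inside the duration window."""
    times = []
    t = start
    while t < start + duration:
        times.append(t.strftime("%H:%M"))
        t += interval
    return times


def worst_case_delay(idle_gate: timedelta = IDLE_GATE, interval: timedelta = INTERVAL) -> timedelta:
    """A transcript becomes eligible after the idle gate, then waits at most one interval."""
    return idle_gate + interval


if __name__ == "__main__":
    print(nominal_run_times(datetime(2026, 6, 1, 9, 0), INTERVAL, timedelta(hours=24)))
    print(worst_case_delay())
```

Under those assumptions an unsaved session is picked up no later than about five and a half hours after its last transcript write.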