feat: add wiki knowledge layer (Phase 0 + Phase 1 seed)

Implements LLM-compiled wiki layer between raw session logs and live CONTEXT.md, inspired by Karpathy's knowledge base workflow. Adds wiki/ directory structure, article templates, spec docs, and seeds first two articles (Cascades of Tucson, GuruRMM) from 60+ session logs. Updates CLAUDE.md to check wiki first on all context-loading triggers. Captures verified ACG IP/hostname map and Neptune physical-location clarification (Dataforth D2, subnet overlap TODO) in memory.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-24 15:42:38 -07:00
parent 435e921300
commit cd80f5e447
19 changed files with 1533 additions and 6 deletions
--- a/.claude/specs/wiki-layer/shape.md
+++ b/.claude/specs/wiki-layer/shape.md
@@ -0,0 +1,70 @@
+# Wiki Layer — Shape Spec
+
+**Project:** ClaudeTools Internal Tooling
+**Author:** Mike Swanson
+**Date:** 2026-05-24
+**Status:** Specced, awaiting implementation
+
+---
+
+## Problem
+
+Session logs accumulate as raw chronological records but are never synthesized. Every session start requires re-reading 5-15 logs to reconstruct institutional knowledge about a client, project, or system. This is:
+
+1. Slow — GrepAI helps but synthesis still happens in-session
+2. Lossy — patterns and connections that span multiple logs don't surface naturally
+3. Non-durable — synthesized understanding lives only in the current context window
+4. Howard-unfriendly — he can't benefit from what Mike's sessions discovered without reading raw logs
+
+The coord API handles live state (locks, messages). CONTEXT.md handles live project state. Memory handles isolated facts. Nothing handles **synthesized knowledge** — the compiled, connected, queryable layer between raw logs and live state.
+
+## Solution: A Wiki Layer
+
+A `wiki/` directory of LLM-compiled Markdown articles organized by topic (clients, projects, systems, patterns). The LLM writes and maintains these articles from session logs. Humans rarely touch them directly.
+
+Key properties:
+- Plain `.md` files — no new tooling, syncs via Git, indexed by GrepAI automatically
+- LLM-maintained — compiled from session logs, not written by hand
+- Cross-linked — backlinks between clients, projects, systems
+- Queryable — `/context` checks the wiki first, which surfaces relevant knowledge much faster than scanning raw logs
+- Auditable — compilation metadata tracks what was read and when
+
+## Analogy to Current System
+
+| Layer | What It Is | Purpose |
+|---|---|---|
+| `session-logs/` | Raw chronological records | Audit trail, raw source of truth |
+| `wiki/` | **[NEW]** LLM-compiled knowledge articles | Synthesized, queryable knowledge |
+| `.claude/memory/` | Isolated discrete facts | Fast-access facts (links into wiki) |
+| `CONTEXT.md` | Live project state | Current state, not knowledge |
+| Coord API | Real-time inter-session comms | Locks, messages, component state |
+
+## Scope
+
+### In Scope (Phase 1)
+
+- `wiki/` directory structure and templates
+- `/wiki-compile` command — compiles session logs into wiki articles for a given scope
+- `/context` update — checks wiki before raw logs
+- Seed pass: GuruRMM + Cascades as pilot (most logs, highest value)
+
+### In Scope (Phase 2)
+
+- `/wiki-lint` command — health-checks wiki for stale IPs, broken backlinks, orphaned articles
+- `/save` integration — prompt after save: "Update wiki for [detected topics]?"
+- Additional seeds: all active clients, all known systems
+
+### Out of Scope
+
+- Obsidian or any special viewer — editors + GrepAI are sufficient
+- RAG pipelines — flat files suffice at our scale, and GrepAI handles search
+- Automated compilation without review — human spot-check required
+- Fine-tuning on wiki content — not worth the complexity
+- Replacing session logs, CONTEXT.md, memory, or coord API — additive only
+
+## Success Criteria
+
+- Starting work on GuruRMM or Cascades takes one wiki read instead of 10 log reads
+- Howard can get full context on any active client by reading one file
+- A stale IP or rate change gets caught by lint before causing a session failure
+- `/context` returns synthesized answers in seconds, not minutes of log spelunking
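
As a hedged illustration of the broken-backlink check that the spec lists for `/wiki-lint` in Phase 2, the sketch below scans a wiki tree and reports links whose target article is missing. It is not the project's implementation: the function name `find_broken_links`, the regular expression, and the assumption that articles cross-link with relative Markdown links such as `[GuruRMM](../systems/gururmm.md)` are all hypothetical, since the spec does not define a link syntax.

```python
import re
from pathlib import Path

# Assumption: articles cross-link with relative Markdown links ending in .md,
# optionally followed by a #fragment. The spec does not fix a link syntax.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)#\s]+\.md)(?:#[^)]*)?\)")


def find_broken_links(wiki_root):
    """Return (article, target) pairs whose link target file does not exist.

    `article` is the linking file's path relative to wiki_root (POSIX style);
    `target` is the link path exactly as written in that article.
    """
    root = Path(wiki_root)
    broken = []
    for article in sorted(root.rglob("*.md")):
        text = article.read_text(encoding="utf-8")
        for target in LINK_RE.findall(text):
            if "://" in target:
                continue  # external URL, not a wiki backlink
            # Resolve the link relative to the directory of the linking article.
            if not (article.parent / target).resolve().exists():
                broken.append((article.relative_to(root).as_posix(), target))
    return broken
```

Calling `find_broken_links("wiki")` from the repository root would return an empty list when every cross-link resolves. The stale-IP and orphaned-article checks named in the spec would need separate passes.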