claudetools/.claude/specs/wiki-layer/shape.md
Mike Swanson cd80f5e447 feat: add wiki knowledge layer (Phase 0 + Phase 1 seed)
Implements LLM-compiled wiki layer between raw session logs and live
CONTEXT.md, inspired by Karpathy's knowledge base workflow. Adds wiki/
directory structure, article templates, spec docs, and seeds first two
articles (Cascades of Tucson, GuruRMM) from 60+ session logs.

Updates CLAUDE.md to check wiki first on all context-loading triggers.
Captures verified ACG IP/hostname map and Neptune physical-location
clarification (Dataforth D2, subnet overlap TODO) in memory.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-24 15:42:38 -07:00


# Wiki Layer — Shape Spec
**Project:** ClaudeTools Internal Tooling
**Author:** Mike Swanson
**Date:** 2026-05-24
**Status:** Specced, awaiting implementation

---
## Problem
Session logs accumulate as raw chronological records but are never synthesized. Every session start requires re-reading 5-15 logs to reconstruct institutional knowledge about a client, project, or system. This is:
1. Slow — GrepAI helps but synthesis still happens in-session
2. Lossy — patterns and connections that span multiple logs don't surface naturally
3. Non-durable — synthesized understanding lives only in the current context window
4. Howard-unfriendly — he can't benefit from what Mike's sessions discovered without reading raw logs
The coord API handles real-time coordination (locks, messages). CONTEXT.md handles live project state. Memory handles isolated facts. Nothing handles **synthesized knowledge**: the compiled, connected, queryable layer between raw logs and live state.
## Solution: A Wiki Layer
A `wiki/` directory of LLM-compiled Markdown articles organized by topic (clients, projects, systems, patterns). The LLM writes and maintains these articles from session logs. Humans rarely touch them directly.
Key properties:
- Plain `.md` files — no new tooling, syncs via Git, indexed by GrepAI automatically
- LLM-maintained — compiled from session logs, not written by hand
- Cross-linked — backlinks between clients, projects, systems
- Queryable — `/context` checks wiki first; much faster signal than raw logs
- Auditable — compilation metadata tracks what was read and when
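As an illustration of the "auditable" property, compilation metadata could be kept as a small header at the top of each article. This is a minimal sketch only: the `CompilationMeta` class, its field names, the front-matter format, and the example log paths are all hypothetical and are not defined by this spec.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class CompilationMeta:
    """Hypothetical record of what one compile pass read, and when."""
    topic: str
    compiled_on: date
    sources: list[str] = field(default_factory=list)  # session-log paths read

    def to_front_matter(self) -> str:
        """Render the record as a YAML-style header for the top of an article."""
        lines = [
            "---",
            f"topic: {self.topic}",
            f"compiled_on: {self.compiled_on.isoformat()}",
            "sources:",
        ]
        # Sort so that recompiling from the same logs yields a stable diff in Git
        lines += [f"  - {src}" for src in sorted(self.sources)]
        lines.append("---")
        return "\n".join(lines)


# Placeholder paths, not real log names
meta = CompilationMeta(
    topic="GuruRMM",
    compiled_on=date(2026, 5, 24),
    sources=["session-logs/example-b.md", "session-logs/example-a.md"],
)
print(meta.to_front_matter())
```

Keeping the metadata inside the article itself means it travels with the file through Git and stays visible to plain-text search.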
## Analogy to Current System
| Layer | What It Is | Purpose |
|---|---|---|
| `session-logs/` | Raw chronological records | Audit trail, raw source of truth |
| `wiki/` | **[NEW]** LLM-compiled knowledge articles | Synthesized, queryable knowledge |
| `.claude/memory/` | Isolated discrete facts | Fast-access facts (links into wiki) |
| `CONTEXT.md` | Live project state | Current state, not knowledge |
| Coord API | Real-time inter-session comms | Locks, messages, component state |
## Scope
### In Scope (Phase 1)
- `wiki/` directory structure and templates
- `/wiki-compile` command — compiles session logs into wiki articles for a given scope
- `/context` update — checks wiki before raw logs
- Seed pass: GuruRMM + Cascades as pilot (most logs, highest value)
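The wiki-first behavior planned for `/context` can be sketched as a lookup that returns a compiled article when one exists and falls back to scanning raw logs otherwise. The `wiki/<category>/<slug>.md` layout, the slug rule, and the `load_context` function are assumptions made for illustration, not the command's actual implementation.

```python
from pathlib import Path


def load_context(topic: str, root: Path) -> list[Path]:
    """Return the files to read for a topic, preferring the wiki article."""
    slug = topic.lower().replace(" ", "-")
    # Assumed layout: wiki/<category>/<slug>.md
    articles = sorted((root / "wiki").glob(f"*/{slug}.md"))
    if articles:
        return articles
    # Fallback: every session log whose text mentions the topic
    logs = sorted((root / "session-logs").glob("*.md"))
    needle = topic.lower()
    return [p for p in logs if needle in p.read_text(encoding="utf-8").lower()]
```

With a compiled article in place the caller reads one file; without one it gets the list of matching logs, which is the behavior the spec describes today.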
### In Scope (Phase 2)
- `/wiki-lint` command — health-checks wiki for stale IPs, broken backlinks, orphaned articles
- `/save` integration — prompt after save: "Update wiki for [detected topics]?"
- Additional seeds: all active clients, all known systems
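Two of the `/wiki-lint` checks, broken backlinks and orphaned articles, can be sketched as a single pass over the wiki directory. The sketch assumes articles cross-link with relative Markdown links such as `[GuruRMM](../systems/gururmm.md)`; the spec does not fix a link syntax, and `lint_links` is a hypothetical name. The stale-IP check is omitted because it depends on a verified IP map whose format is not specified here.

```python
import re
from pathlib import Path

# Matches [text](target.md) and [text](target.md#anchor); captures the target path
LINK = re.compile(r"\[[^\]]*\]\(([^)#\s]+\.md)(?:#[^)]*)?\)")


def lint_links(wiki_root: Path) -> tuple[list[tuple[str, str]], list[str]]:
    """Return (broken links, orphaned articles) for a wiki directory."""
    articles = {p.resolve() for p in wiki_root.rglob("*.md")}
    linked: set[Path] = set()
    broken: list[tuple[str, str]] = []
    for article in sorted(articles):
        for target in LINK.findall(article.read_text(encoding="utf-8")):
            if "://" in target:
                continue  # external URL, not a wiki backlink
            resolved = (article.parent / target).resolve()
            if resolved not in articles:
                broken.append((article.name, target))
            elif resolved != article:
                linked.add(resolved)  # self-links do not count as inbound
    # An orphan is any article that no other article links to
    orphans = sorted(p.name for p in articles - linked)
    return broken, orphans
```

A top-level index page would be reported as an orphan under this rule, so a real lint pass would likely need an allow-list for entry points.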
### Out of Scope
- Obsidian or any special viewer — editors + GrepAI are sufficient
- RAG pipelines — flat files at our scale, GrepAI handles search
- Automated compilation without review — human spot-check required
- Finetuning on wiki content — not worth the complexity
- Replacing session logs, CONTEXT.md, memory, or coord API — additive only
## Success Criteria
- Starting work on GuruRMM or Cascades takes one wiki read instead of 5-15 log reads
- Howard can get full context on any active client by reading one file
- A stale IP or rate change gets caught by lint before causing a session failure
- `/context` returns synthesized answers in seconds, not minutes of log spelunking