diff --git a/.claude/CLAUDE.md b/.claude/CLAUDE.md index 23997cc..8c305b9 100644 --- a/.claude/CLAUDE.md +++ b/.claude/CLAUDE.md @@ -241,11 +241,13 @@ Vault structure: `infrastructure/`, `clients/`, `services/`, `projects/`, `msp-t ## Local AI (Ollama) -Tier 0 — use before spawning Haiku agents for low-stakes tasks (free, fast, private): +Tier 0 — **Ollama is the documentation engine.** Route all prose generation through it: session log narratives, commit messages, ticket comments, client notes, code docs. Claude reviews output, owns credentials/facts/execution. + - **DESKTOP-0O8A1RL:** `http://localhost:11434` - **Other machines:** `http://100.92.127.64:11434` (Tailscale required) -- **Models:** `qwen3:14b` (summarize/classify/draft), `codestral:22b` (code suggestions — always review) -- **Full reference:** `.claude/OLLAMA.md` (connection examples, model selection, review policy) +- **Models:** `qwen3:14b` (all documentation/prose), `codestral:22b` (code suggestions — always review) +- **Warm-start:** GrepAI keeps the Ollama service running; qwen3 VRAM swap is ~5s worst case, not 50s +- **Full reference:** `.claude/OLLAMA.md` (documentation engine scope, model selection, review policy) ### GrepAI (Semantic Code Search) diff --git a/.claude/OLLAMA.md b/.claude/OLLAMA.md index 22dddaa..f2cb7e5 100644 --- a/.claude/OLLAMA.md +++ b/.claude/OLLAMA.md @@ -70,19 +70,61 @@ For code suggestions, swap `qwen3:14b` for `codestral:22b`. Codestral doesn't ne Cold-start is ~30-50s on first call per model per session. Warm calls are 1-5s. +## Documentation Engine + +**Ollama is the default documentation engine for all prose output.** Any time stored text needs to be generated — session logs, commit messages, ticket comments, client notes, code docs — route it through Ollama first. Claude reviews, corrects if needed, then writes or posts. + +This keeps Claude tokens focused on reasoning, decisions, and execution. Ollama handles the writing. + +### What Ollama owns + +| Output | Model | Claude's role | +|--------|-------|---------------| +| Session log narrative (summary, decisions, problems) | qwen3:14b | Review + assemble with factual sections | +| Commit message body | qwen3:14b | Review + execute git commit | +| Syncro comment bodies + billing descriptions | qwen3:14b | Review checklist + post via API | +| Ticket initial issue / description text | qwen3:14b | Review + post | +| Client-facing notes and summaries | qwen3:14b | Review for accuracy | +| Code comments and docstrings | codestral:22b | Review before applying | +| Refactor suggestions | codestral:22b | Review before applying | + +### What Claude always owns (never Ollama) + +- Credentials, passwords, API keys — must be verbatim accurate +- Infrastructure details, IPs, hostnames — must be verbatim accurate +- Command outputs and error messages — verbatim from actual output +- Security decisions, auth review, production migrations +- Final field values on API payloads (rates, IDs, quantities) + +### Warm-start and GrepAI + +GrepAI uses `nomic-embed-text` for context lookups, which keeps the Ollama **service** running continuously. The 30-50s service cold-start is effectively eliminated in normal workflow. `qwen3:14b` may take ~5s to swap into VRAM if it hasn't been called recently, but that's the worst case — not 50s. + +If the first Ollama call of a session needs to be fast, send a throwaway warm-up ping: +```bash +py -c " +import urllib.request, json +body = json.dumps({'model':'qwen3:14b','messages':[{'role':'user','content':'ok'}],'stream':False,'think':False}).encode() +urllib.request.urlopen(urllib.request.Request('$OLLAMA/api/chat', body), timeout=60).read() +print('warm') +" +``` + ## When to Use Which Model | Task | Model | |------|-------| -| Summarize logs, diffs, session notes | qwen3:14b | +| Session log narrative sections | qwen3:14b | +| Commit message body | qwen3:14b | +| Ticket / client comment drafting | qwen3:14b | +| Summarize logs, diffs, incident notes | qwen3:14b | | Classify bug type, severity, category | qwen3:14b | | Extract structured data from text | qwen3:14b | -| Draft commit message from diff | qwen3:14b | -| Suggest refactor for a function | codestral:22b | -| Docstring / comment generation | codestral:22b | +| Code comment / docstring generation | codestral:22b | +| Refactor suggestions | codestral:22b | ## Review Policy -- Low-stakes output (summary, classification, draft) — use directly +- Documentation output (session logs, commit messages, comments) — Claude reviews before writing/posting - Code suggestions from codestral — always review before applying -- Never use Ollama for: auth decisions, credential handling, production migrations, security review +- Never use Ollama for: credentials, auth decisions, production migrations, security review, API payload field values diff --git a/.claude/commands/checkpoint.md b/.claude/commands/checkpoint.md index 4b27da2..991845a 100644 --- a/.claude/commands/checkpoint.md +++ b/.claude/commands/checkpoint.md @@ -20,17 +20,34 @@ Please create a comprehensive git checkpoint with the following steps: - Add ALL untracked files (new files) - Use `git add -A` or `git add .` to stage everything -4. **Create a detailed commit message**: +4. **Draft commit message body via Ollama** (documentation engine): - - **First line**: Write a clear, concise summary (50-72 chars) describing the primary change - - Use imperative mood (e.g., "Add feature" not "Added feature") - - Examples: "feat: add user authentication", "fix: resolve database connection issue", "refactor: improve API route structure" - - **Body**: Provide a detailed description including: - - What changes were made (list of key modifications) - - Why these changes were made (purpose/motivation) - - Any important technical details or decisions - - Breaking changes or migration notes if applicable - - **Footer**: Include co-author attribution as shown in the Git Safety Protocol + ```bash + # Resolve Ollama + if curl -s -m 2 http://localhost:11434/api/tags >/dev/null 2>&1; then OLLAMA="http://localhost:11434" + elif curl -s -m 3 http://100.92.127.64:11434/api/tags >/dev/null 2>&1; then OLLAMA="http://100.92.127.64:11434" + else OLLAMA=""; fi + + # Capture diff summary for Ollama prompt + { git diff --stat HEAD; echo "---"; git diff HEAD | head -200; } \ + > "C:/Users/guru/AppData/Local/Temp/checkpoint_diff.txt" + + # Ollama drafts the body; fallback to Claude if unavailable + if [ -n "$OLLAMA" ]; then + BODY=$(py -c " +import urllib.request, json +diff = open('C:/Users/guru/AppData/Local/Temp/checkpoint_diff.txt', encoding='utf-8').read() +prompt = 'Write a git commit message BODY only (not the summary line). Imperative mood. What changed and why. No filler. Under 150 words.\n\nDIFF:\n' + diff +body = json.dumps({'model':'qwen3:14b','messages':[{'role':'user','content':prompt}],'stream':False,'think':False}).encode() +res = json.loads(urllib.request.urlopen(urllib.request.Request('$OLLAMA/api/chat', body), timeout=60).read()) +print(res['message']['content']) +") + fi + ``` + + - **Summary line** (first line): Claude writes — 50-72 chars, imperative mood, from `git diff --stat` + - **Body**: Ollama draft (Claude reviews); Claude writes directly if Ollama unavailable + - **Footer**: `Co-Authored-By: Claude Sonnet 4.6 ` 5. **Execute the commit**: Create the commit with the properly formatted message following this repository's conventions. diff --git a/.claude/commands/save.md b/.claude/commands/save.md index e41c43e..a141dae 100644 --- a/.claude/commands/save.md +++ b/.claude/commands/save.md @@ -1,5 +1,56 @@ Save a COMPREHENSIVE session log to appropriate session-logs/ directory. This is critical for context recovery. +## Ollama drafting (documentation engine) + +Narrative sections are drafted by Ollama (qwen3:14b), then assembled with Claude-generated factual sections. Claude reviews the full document before writing. + +**Ollama drafts:** Session Summary, Key Decisions, Problems Encountered +**Claude owns (verbatim, never delegated):** Credentials, infrastructure IPs/hostnames, command outputs, file paths, pending tasks + +### Draft call + +```bash +# Check Ollama (reuse $OLLAMA across the save operation) +if curl -s -m 2 http://localhost:11434/api/tags >/dev/null 2>&1; then OLLAMA="http://localhost:11434" +elif curl -s -m 3 http://100.92.127.64:11434/api/tags >/dev/null 2>&1; then OLLAMA="http://100.92.127.64:11434" +else OLLAMA=""; fi + +# Write narrative prompt to temp file +cat > "C:/Users/guru/AppData/Local/Temp/save_narrative_prompt.txt" << 'ENDPROMPT' +You are a technical session log writer for an MSP (managed service provider). +Write three sections of a session log in markdown. Be concise, factual, and technical. +No filler phrases. Use past tense. + +WORK DONE THIS SESSION: + + +Write these three sections only: + +## Session Summary +<2-4 paragraph narrative: what was accomplished, in what order, why> + +## Key Decisions + + +## Problems Encountered + +ENDPROMPT + +NARRATIVE=$(py -c " +import urllib.request, json +prompt = open('C:/Users/guru/AppData/Local/Temp/save_narrative_prompt.txt', encoding='utf-8').read() +body = json.dumps({'model':'qwen3:14b','messages':[{'role':'user','content':prompt}],'stream':False,'think':False}).encode() +res = json.loads(urllib.request.urlopen(urllib.request.Request('$OLLAMA/api/chat', body), timeout=120).read()) +print(res['message']['content']) +") + +# Fallback: if OLLAMA empty, Claude writes narrative directly +``` + +Claude reviews the narrative output before assembling the final document. + +--- + ## Determine Correct Location **IMPORTANT: Save to project-specific or general session-logs based on work context**