docs: establish Ollama as the documentation engine

Route all prose generation (session logs, commit messages, Syncro comments, client notes, code docs) through Ollama qwen3:14b by default. Claude reviews output and owns verbatim-accuracy sections (credentials, IPs, command outputs). GrepAI context lookups keep the Ollama service warm, eliminating the 30-50s cold-start in normal workflow.

Updates: OLLAMA.md (documentation engine scope + warm-start note), CLAUDE.md (Ollama section), save.md (narrative drafting), checkpoint.md (commit message body drafting).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -241,11 +241,13 @@ Vault structure: `infrastructure/`, `clients/`, `services/`, `projects/`, `msp-t
## Local AI (Ollama)

Tier 0 — use before spawning Haiku agents for low-stakes tasks (free, fast, private):
Tier 0 — **Ollama is the documentation engine.** Route all prose generation through it: session log narratives, commit messages, ticket comments, client notes, code docs. Claude reviews output, owns credentials/facts/execution.

- **DESKTOP-0O8A1RL:** `http://localhost:11434`
- **Other machines:** `http://100.92.127.64:11434` (Tailscale required)
- **Models:** `qwen3:14b` (summarize/classify/draft), `codestral:22b` (code suggestions — always review)
- **Full reference:** `.claude/OLLAMA.md` (connection examples, model selection, review policy)
- **Models:** `qwen3:14b` (all documentation/prose), `codestral:22b` (code suggestions — always review)
- **Warm-start:** GrepAI keeps the Ollama service running; qwen3 VRAM swap is ~5s worst case, not 50s
- **Full reference:** `.claude/OLLAMA.md` (documentation engine scope, model selection, review policy)
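
The two endpoints listed above imply a resolution order: try the local service first, then the Tailscale address. A minimal Python sketch of that order follows. It assumes `/api/tags` is used as a cheap liveness probe, as the shell snippets in this commit do; the names `resolve_ollama` and `http_probe` are illustrative and not part of the repository.

```python
import urllib.request
from typing import Callable, Iterable, Optional

# Candidate base URLs in priority order, taken from the list above.
CANDIDATES = ("http://localhost:11434", "http://100.92.127.64:11434")


def http_probe(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if GET <base_url>/api/tags answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # URLError and socket timeouts are OSError subclasses
        return False


def resolve_ollama(
    candidates: Iterable[str] = CANDIDATES,
    probe: Callable[[str], bool] = http_probe,
) -> Optional[str]:
    """Return the first reachable base URL, or None if none respond."""
    for base_url in candidates:
        if probe(base_url):
            return base_url
    return None
```

Passing the probe as a parameter keeps the ordering logic testable without a running Ollama service; a `None` result corresponds to the case where Claude writes the text directly.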

### GrepAI (Semantic Code Search)

@@ -70,19 +70,61 @@ For code suggestions, swap `qwen3:14b` for `codestral:22b`. Codestral doesn't ne
Cold-start is ~30-50s on first call per model per session. Warm calls are 1-5s.

## Documentation Engine

**Ollama is the default documentation engine for all prose output.** Any time stored text needs to be generated — session logs, commit messages, ticket comments, client notes, code docs — route it through Ollama first. Claude reviews, corrects if needed, then writes or posts.

This keeps Claude tokens focused on reasoning, decisions, and execution. Ollama handles the writing.

### What Ollama owns

| Output | Model | Claude's role |
|--------|-------|---------------|
| Session log narrative (summary, decisions, problems) | qwen3:14b | Review + assemble with factual sections |
| Commit message body | qwen3:14b | Review + execute git commit |
| Syncro comment bodies + billing descriptions | qwen3:14b | Review checklist + post via API |
| Ticket initial issue / description text | qwen3:14b | Review + post |
| Client-facing notes and summaries | qwen3:14b | Review for accuracy |
| Code comments and docstrings | codestral:22b | Review before applying |
| Refactor suggestions | codestral:22b | Review before applying |

### What Claude always owns (never Ollama)

- Credentials, passwords, API keys — must be verbatim accurate
- Infrastructure details, IPs, hostnames — must be verbatim accurate
- Command outputs and error messages — verbatim from actual output
- Security decisions, auth review, production migrations
- Final field values on API payloads (rates, IDs, quantities)
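
One way to back this ownership rule during review is to flag verbatim-sensitive tokens that appear in an Ollama draft, so Claude can check each one against the real source or replace it. The sketch below is an assumption, not something the repository contains: the function name `flag_verbatim_tokens` and the two patterns (IPv4 addresses and `key=value` style secrets) are illustrative and far from exhaustive.

```python
import re

# Illustrative patterns only; real credentials take many more shapes.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
SECRET = re.compile(r"(?i)\b(?:password|passwd|api[_-]?key|token)\s*[:=]\s*\S+")


def flag_verbatim_tokens(draft: str) -> list[str]:
    """Return substrings of a draft that a reviewer should verify verbatim."""
    hits = IPV4.findall(draft) + SECRET.findall(draft)
    # Preserve first-seen order while dropping duplicates.
    return list(dict.fromkeys(hits))
```

An empty result does not prove a draft is safe; it only means these two patterns found nothing, so the manual review still applies.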

### Warm-start and GrepAI

GrepAI uses `nomic-embed-text` for context lookups, which keeps the Ollama **service** running continuously. The 30-50s service cold-start is effectively eliminated in normal workflow. `qwen3:14b` may take ~5s to swap into VRAM if it hasn't been called recently, but that's the worst case — not 50s.

If the first Ollama call of a session needs to be fast, send a throwaway warm-up ping:
```bash
py -c "
import urllib.request, json
body = json.dumps({'model':'qwen3:14b','messages':[{'role':'user','content':'ok'}],'stream':False,'think':False}).encode()
urllib.request.urlopen(urllib.request.Request('$OLLAMA/api/chat', body), timeout=60).read()
print('warm')
"
```

## When to Use Which Model

| Task | Model |
|------|-------|
| Summarize logs, diffs, session notes | qwen3:14b |
| Session log narrative sections | qwen3:14b |
| Commit message body | qwen3:14b |
| Ticket / client comment drafting | qwen3:14b |
| Summarize logs, diffs, incident notes | qwen3:14b |
| Classify bug type, severity, category | qwen3:14b |
| Extract structured data from text | qwen3:14b |
| Draft commit message from diff | qwen3:14b |
| Suggest refactor for a function | codestral:22b |
| Docstring / comment generation | codestral:22b |
| Code comment / docstring generation | codestral:22b |
| Refactor suggestions | codestral:22b |

## Review Policy

- Low-stakes output (summary, classification, draft) — use directly
- Documentation output (session logs, commit messages, comments) — Claude reviews before writing/posting
- Code suggestions from codestral — always review before applying
- Never use Ollama for: auth decisions, credential handling, production migrations, security review
- Never use Ollama for: credentials, auth decisions, production migrations, security review, API payload field values
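
The model table and review policy above amount to a small routing decision. A hedged Python sketch follows. The task labels, the `route` function, and the `Route` shape are hypothetical names chosen for illustration; the model names and the Claude-only categories come from the text above.

```python
from typing import NamedTuple, Optional

# Task label -> model, mirroring the "When to Use Which Model" table.
MODEL_FOR_TASK = {
    "summarize": "qwen3:14b",
    "session_log": "qwen3:14b",
    "commit_body": "qwen3:14b",
    "ticket_comment": "qwen3:14b",
    "classify": "qwen3:14b",
    "extract": "qwen3:14b",
    "docstring": "codestral:22b",
    "refactor": "codestral:22b",
}

# Categories the review policy reserves for Claude.
CLAUDE_ONLY = {"credentials", "auth_decision", "production_migration",
               "security_review", "api_payload_values"}

# Labels treated as low-stakes here (an assumption about how the policy maps).
LOW_STAKES = {"summarize", "classify"}


class Route(NamedTuple):
    model: Optional[str]  # None means Claude handles the task directly
    needs_review: bool    # True when Claude must review the Ollama output


def route(task: str) -> Route:
    """Pick a model and review requirement for a task label."""
    if task in CLAUDE_ONLY:
        return Route(model=None, needs_review=False)
    if task not in MODEL_FOR_TASK:
        raise ValueError(f"unknown task label: {task!r}")
    return Route(MODEL_FOR_TASK[task], needs_review=task not in LOW_STAKES)
```

Unknown labels raise instead of defaulting to a model, so a new kind of task has to be classified explicitly before anything is delegated.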
@@ -20,17 +20,34 @@ Please create a comprehensive git checkpoint with the following steps:
- Add ALL untracked files (new files)
- Use `git add -A` or `git add .` to stage everything

4. **Create a detailed commit message**:
4. **Draft commit message body via Ollama** (documentation engine):

- **First line**: Write a clear, concise summary (50-72 chars) describing the primary change
  - Use imperative mood (e.g., "Add feature" not "Added feature")
  - Examples: "feat: add user authentication", "fix: resolve database connection issue", "refactor: improve API route structure"
- **Body**: Provide a detailed description including:
  - What changes were made (list of key modifications)
  - Why these changes were made (purpose/motivation)
  - Any important technical details or decisions
  - Breaking changes or migration notes if applicable
- **Footer**: Include co-author attribution as shown in the Git Safety Protocol
```bash
# Resolve Ollama
if curl -s -m 2 http://localhost:11434/api/tags >/dev/null 2>&1; then OLLAMA="http://localhost:11434"
elif curl -s -m 3 http://100.92.127.64:11434/api/tags >/dev/null 2>&1; then OLLAMA="http://100.92.127.64:11434"
else OLLAMA=""; fi

# Capture diff summary for Ollama prompt
{ git diff --stat HEAD; echo "---"; git diff HEAD | head -200; } \
  > "C:/Users/guru/AppData/Local/Temp/checkpoint_diff.txt"

# Ollama drafts the body; fallback to Claude if unavailable
if [ -n "$OLLAMA" ]; then
BODY=$(py -c "
import urllib.request, json
diff = open('C:/Users/guru/AppData/Local/Temp/checkpoint_diff.txt', encoding='utf-8').read()
prompt = 'Write a git commit message BODY only (not the summary line). Imperative mood. What changed and why. No filler. Under 150 words.\n\nDIFF:\n' + diff
body = json.dumps({'model':'qwen3:14b','messages':[{'role':'user','content':prompt}],'stream':False,'think':False}).encode()
res = json.loads(urllib.request.urlopen(urllib.request.Request('$OLLAMA/api/chat', body), timeout=60).read())
print(res['message']['content'])
")
fi
```

- **Summary line** (first line): Claude writes — 50-72 chars, imperative mood, from `git diff --stat`
- **Body**: Ollama draft (Claude reviews); Claude writes directly if Ollama unavailable
- **Footer**: `Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>`

5. **Execute the commit**: Create the commit with the properly formatted message following this repository's conventions.
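
The three-part message from step 4 can be assembled mechanically once the pieces exist. Below is a minimal Python sketch, assuming the summary line is already written and the Ollama body may be empty. `build_commit_message` and its parameters are hypothetical; the 72-character limit and the footer text are taken from step 4.

```python
FOOTER = "Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>"


def build_commit_message(summary: str, ollama_body: str, fallback_body: str) -> str:
    """Join summary, body, and footer, separated by blank lines."""
    summary = summary.strip()
    if not summary or len(summary) > 72 or "\n" in summary:
        raise ValueError("summary must be one non-empty line of at most 72 characters")
    # Prefer the reviewed Ollama draft; use the fallback when the draft is empty.
    body = ollama_body.strip() or fallback_body.strip()
    parts = [summary, body, FOOTER] if body else [summary, FOOTER]
    return "\n\n".join(parts)
```

The returned string could then be piped to `git commit -F -`, which reads the message from standard input.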
@@ -1,5 +1,56 @@
Save a COMPREHENSIVE session log to the appropriate session-logs/ directory. This is critical for context recovery.

## Ollama drafting (documentation engine)

Narrative sections are drafted by Ollama (qwen3:14b), then assembled with Claude-generated factual sections. Claude reviews the full document before writing.

**Ollama drafts:** Session Summary, Key Decisions, Problems Encountered
**Claude owns (verbatim, never delegated):** Credentials, infrastructure IPs/hostnames, command outputs, file paths, pending tasks

### Draft call

```bash
# Check Ollama (reuse $OLLAMA across the save operation)
if curl -s -m 2 http://localhost:11434/api/tags >/dev/null 2>&1; then OLLAMA="http://localhost:11434"
elif curl -s -m 3 http://100.92.127.64:11434/api/tags >/dev/null 2>&1; then OLLAMA="http://100.92.127.64:11434"
else OLLAMA=""; fi

# Write narrative prompt to temp file
cat > "C:/Users/guru/AppData/Local/Temp/save_narrative_prompt.txt" << 'ENDPROMPT'
You are a technical session log writer for an MSP (managed service provider).
Write three sections of a session log in markdown. Be concise, factual, and technical.
No filler phrases. Use past tense.

WORK DONE THIS SESSION:
<paste bullet list of what happened>

Write these three sections only:

## Session Summary
<2-4 paragraph narrative: what was accomplished, in what order, why>

## Key Decisions
<bullet list of non-obvious decisions made and their rationale>

## Problems Encountered
<bullet list of problems hit and how each was resolved; omit if none>
ENDPROMPT

# Draft the narrative only when an endpoint was resolved; otherwise leave it empty
NARRATIVE=""
if [ -n "$OLLAMA" ]; then
NARRATIVE=$(py -c "
import urllib.request, json
prompt = open('C:/Users/guru/AppData/Local/Temp/save_narrative_prompt.txt', encoding='utf-8').read()
body = json.dumps({'model':'qwen3:14b','messages':[{'role':'user','content':prompt}],'stream':False,'think':False}).encode()
res = json.loads(urllib.request.urlopen(urllib.request.Request('$OLLAMA/api/chat', body), timeout=120).read())
print(res['message']['content'])
")
fi

# Fallback: if NARRATIVE is empty (no endpoint, or the call failed), Claude writes the narrative directly
```

Claude reviews the narrative output before assembling the final document.
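
Part of that review can be automated: before assembly, confirm that the draft actually contains the headings the prompt asked for. The following Python sketch is an assumption about how such a check might look. `split_sections` and `missing_sections` are hypothetical helpers, and "Problems Encountered" is left out of the required set because the prompt allows omitting it.

```python
import re

# Headings the prompt always asks for; "Problems Encountered" may be omitted.
REQUIRED = ("Session Summary", "Key Decisions")


def split_sections(markdown: str) -> dict[str, str]:
    """Map each level-2 heading title to the text beneath it."""
    sections: dict[str, str] = {}
    # With one capture group, re.split yields [preamble, title1, body1, title2, body2, ...].
    parts = re.split(r"^##[ \t]+(.+?)[ \t]*$", markdown, flags=re.MULTILINE)
    for title, body in zip(parts[1::2], parts[2::2]):
        sections[title] = body.strip()
    return sections


def missing_sections(markdown: str) -> list[str]:
    """Return required headings that are absent or empty in the draft."""
    found = split_sections(markdown)
    return [name for name in REQUIRED if not found.get(name)]
```

A non-empty result is a signal to re-run the draft or have Claude write the missing section; it says nothing about whether the content of the sections present is accurate.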

---

## Determine Correct Location

**IMPORTANT: Save to project-specific or general session-logs based on work context**