diff --git a/.claude/CLAUDE.md b/.claude/CLAUDE.md index a1194fe..f204997 100644 --- a/.claude/CLAUDE.md +++ b/.claude/CLAUDE.md @@ -104,17 +104,20 @@ You are NOT an executor. You coordinate specialized agents and preserve your con ### Model Routing (Complexity-Based) -Before spawning an agent, pick a tier from `.claude/COMPLEXITY_ROUTING.md`: +Before spawning an agent, pick a tier: | Tier | Model | When | |------|-------|------| -| 1 | `haiku` | Lookup, format, summarize, doc — no code changes | +| 0 | **Ollama** (local) | Low-stakes: summarize, classify, extract, draft, format — no code changes, output reviewed before use | +| 1 | `haiku` | Ollama unavailable, or task needs tool use / file access an agent provides | | 2 | (inherit) | Standard code, DB, tests, git — most work | | 3 | `opus` | Architecture, security, ambiguous failures, production risk | -**Bump rule:** if the request involves `security`, `auth`, `credential`, `migration`, `production`, or `data loss` — bump one tier up. +**Tier 0 rule:** Always try Ollama first for low-stakes work. It's free, fast, and private. Use `qwen3:14b` for general tasks; `codestral:22b` for code suggestions. Fall back to Haiku only if Ollama is unreachable or the task requires agent tool use. -Pass `model: "haiku"` or `model: "opus"` explicitly. Omit for Tier 2 (inherits session model). +**Bump rule:** if the request involves `security`, `auth`, `credential`, `migration`, `production`, or `data loss` — bump one tier up (Ollama → Haiku, Haiku → inherit, inherit → opus). + +Pass `model: "haiku"` or `model: "opus"` explicitly. Omit for Tier 2 (inherits session model). Tier 0 is a direct Bash call, not an agent spawn — see Ollama section below. ### Coordination Flow @@ -412,6 +415,52 @@ If it fails: verify Tailscale is connected (`tailscale status`), and that Mike's - NOT exposed to LAN, VPN, or internet - Binding: `OLLAMA_HOST=0.0.0.0:11434` (all interfaces, firewall restricts) +### Delegation pattern (Tier 0 — use instead of spawning a Haiku agent) + +Determine the endpoint from identity.json, then call directly with the Bash tool: + +```bash +# Resolve endpoint once per session +OLLAMA=$([ "$(jq -r .machine ~/.claude/identity.json 2>/dev/null)" = "DESKTOP-0O8A1RL" ] \ + && echo "http://localhost:11434" || echo "http://100.92.127.64:11434") + +# General task (summarize, classify, extract, draft) +curl -s "$OLLAMA/api/generate" \ + -d "{\"model\":\"qwen3:14b\",\"prompt\":\"$(echo "$PROMPT" | python3 -c 'import sys,json; print(json.dumps(sys.stdin.read()))'| tr -d '\"')\",\"stream\":false}" \ + | python3 -c "import sys,json; print(json.load(sys.stdin).get('response',''))" + +# Code suggestion (refactor ideas, docstrings — NOT production code) +# Same call, model: "codestral:22b" +``` + +**Practical shorthand** — for one-off inline prompts, use python3 to avoid escaping issues: + +```bash +python3 -c " +import urllib.request, json, sys +url = 'http://localhost:11434/api/generate' # or 100.92.127.64 +body = json.dumps({'model':'qwen3:14b','prompt': sys.argv[1],'stream':False}).encode() +res = json.loads(urllib.request.urlopen(urllib.request.Request(url, body)).read()) +print(res['response']) +" "Summarize these changes in one sentence: ..." +``` + +**When to use which model:** + +| Task | Model | +|------|-------| +| Summarize logs, diffs, session notes | qwen3:14b | +| Classify bug type, severity, category | qwen3:14b | +| Extract structured data from text output | qwen3:14b | +| Draft commit message from diff | qwen3:14b | +| Suggest refactor for a function (review output) | codestral:22b | +| Docstring / comment generation | codestral:22b | + +**Review policy:** +- Low-stakes output (summary, label, draft) — use directly, no review needed +- Code suggestions from codestral — always review before applying +- Never use Ollama for: auth decisions, credential handling, production migrations, security review + **Review policy:** Always review Critical/High impact Ollama outputs (auth, security, migrations, production). Trust Low impact (classification, formatting). Flag uncertainty to user. ### GrepAI (Semantic Code Search)