sync: auto-sync from GURU-KALI at 2026-05-26 20:08:37

Author: Mike Swanson Machine: GURU-KALI Timestamp: 2026-05-26 20:08:37
2026-05-26 20:08:38 -07:00
parent 251bb3546b
commit 262fd8de62
3 changed files with 15 additions and 28 deletions
--- a/.claude/memory/feedback_ollama_tier0_routing.md
+++ b/.claude/memory/feedback_ollama_tier0_routing.md
@@ -15,19 +15,12 @@ Route Tier-0 tasks (summaries, classifications, drafts, extractions) through Oll
 - Suggesting refactors / generating docstrings → codestral:22b (then review)
 - NEVER for: auth decisions, credential handling, production migrations, security review, citation work, production-change scripts

-**Endpoint resolution — the remote fallback is a PER-MACHINE choice in `.claude/identity.json` `ollama_fallback`, never hardcoded:**
+**Endpoint resolution — machine config is centralized in `.claude/identity.json` `ollama` (Phase 2, 2026-05-26). Read the declared endpoint; no curl probe per call:**
 ```bash
-LOCAL="http://localhost:11434"
-FALLBACK=$(python3 -c "import json;print((json.load(open('.claude/identity.json')).get('ollama_fallback') or {}).get('endpoint',''))" 2>/dev/null)
-if curl -s -m 2 "$LOCAL/api/tags" >/dev/null 2>&1; then
-    OLLAMA="$LOCAL"                 # local Ollama is up — use it
-elif [ -n "$FALLBACK" ]; then
-    OLLAMA="$FALLBACK"              # per-machine fallback from identity.json
-else
-    OLLAMA="$LOCAL"                 # no fallback configured — local only
-fi
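+OLLAMA=$(jq -r '.ollama.endpoint // .ollama.fallback // "http://localhost:11434"' .claude/identity.json 2>/dev/null || echo "http://localhost:11434")  # default to local if identity.json is missing or unreadable
+MODEL=$(jq -r '.ollama.prose_model // "qwen3:14b"' .claude/identity.json 2>/dev/null || echo "qwen3:14b")  # same default when the file cannot be read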
 ```
-Each machine sets its own `ollama_fallback` in identity.json, e.g. `{"host":"GURU-BEAST-ROG","endpoint":"http://100.101.122.4:11434"}`. GURU-BEAST-ROG (RTX 4090, always on) is the usual choice; GURU-KALI is set to it (confirmed 2026-05-26). A machine with local models loaded (e.g. Howard-Home: qwen3:14b, codestral:22b, nomic-embed-text, qwen3-coder:30b) can leave `ollama_fallback` unset/local — zero Tailscale hop. Do NOT bake a fallback IP into shared files (memory, OLLAMA.md, CLAUDE.md) — read it from identity.json.
+`migrate-identity.sh` populates the `ollama` object per machine: `endpoint` (the one to use: localhost if a local Ollama was present at migration, otherwise the Tailscale host), `fallback` (the backup, usually GURU-BEAST-ROG at `http://100.101.122.4:11434`, an always-on RTX 4090), and `prose_model` (qwen3:8b on 12 GB boxes like GURU-5070, qwen3:14b elsewhere). On GURU-KALI, `endpoint` and `fallback` both point to Beast (no local Ollama). Re-run `migrate-identity.sh` to re-detect after an Ollama or network change. Don't hardcode endpoints or IPs in shared files; read them from identity.json. (Supersedes the interim `ollama_fallback` field from earlier on 2026-05-26.)

 **Call pattern for qwen3 — use `/api/chat` with `think:false`**, NOT `/api/generate`. On the generate endpoint, qwen3 dumps its reasoning into internal thinking tokens and returns an empty `response` field. The chat endpoint with `think:false` returns clean content in `message.content`: