Audited the Ollama reference (no wrapper script — it's the OLLAMA.md doc + inline
HTTP-API call pattern) against the live server (Ollama 0.30.8 on GURU-5070):
- /api/chat + think:false + res['message']['content'] confirmed working (clean
output, no thinking leak) -- the core call pattern is correct.
- All referenced models exist on the server (qwen3:8b, qwen3.6:latest, qwen3:14b,
codestral:22b, nomic-embed-text).
Real bug found + fixed: the "Preferred one-liner" auto-detected the endpoint with
`urlopen(...)` used as a truthiness test. urlopen RAISES URLError on a down host
(proven), so the ternary's fallback branch was dead code -- it crashed on a down
localhost instead of failing over to Beast, and it did a per-call probe that
contradicts the doc's own "read endpoint from identity.json, no probe" rule 30 lines
above. Replaced with the identity.json endpoint+model pattern (also swaps the
hardcoded qwen3:14b for the per-machine prose_model). Validated verbatim end-to-end.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>