ollama: fix broken endpoint auto-detect in OLLAMA.md one-liner (RTFM audit)
Audited the Ollama reference (no wrapper script — it's the OLLAMA.md doc + inline HTTP-API call pattern) against the live server (Ollama 0.30.8 on GURU-5070):

- /api/chat + think:false + res['message']['content'] confirmed working (clean output, no thinking leak) -- the core call pattern is correct.
- All referenced models exist on the server (qwen3:8b, qwen3.6:latest, qwen3:14b, codestral:22b, nomic-embed-text).

Real bug found + fixed: the "Preferred one-liner" auto-detected the endpoint with `urlopen(...)` used as a truthiness test. urlopen RAISES URLError on a down host (proven), so the ternary's fallback branch was dead code -- it crashed on a down localhost instead of failing over to Beast, and it did a per-call probe that contradicts the doc's own "read endpoint from identity.json, no probe" rule 30 lines above.

Replaced with the identity.json endpoint+model pattern (also swaps the hardcoded qwen3:14b for the per-machine prose_model). Validated verbatim end-to-end.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
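The failure mode described in the message can be reproduced without the real server. The sketch below is illustrative only and is not the pattern the commit adopts (the fix reads the endpoint from `identity.json` and does no probing). The helper names `is_up`, `first_reachable`, and `closed_local_url` are hypothetical. It shows that `urlopen` reports an unreachable host by raising, so a failover has to be written with `try`/`except` rather than as a truthiness test.

```python
import socket
import urllib.error
import urllib.request


def is_up(base_url, timeout=2.0):
    """Return True if GET <base_url>/api/tags succeeds, False on any error."""
    try:
        with urllib.request.urlopen(base_url + "/api/tags", timeout=timeout):
            return True
    except (urllib.error.URLError, OSError):
        # urlopen never returns a falsy value for a down host; it raises.
        # An HTTP error status (HTTPError) also lands here and counts as "not up".
        return False


def first_reachable(candidates, timeout=2.0):
    """Return the first base URL that answers, or None if none do."""
    for base_url in candidates:
        if is_up(base_url, timeout):
            return base_url
    return None


def closed_local_url():
    """Build a URL for a local port that nothing is listening on."""
    with socket.socket() as s:
        s.bind(("127.0.0.1", 0))   # let the OS pick a free port
        port = s.getsockname()[1]
    return f"http://127.0.0.1:{port}"  # socket is closed again at this point


if __name__ == "__main__":
    down = closed_local_url()
    try:
        urllib.request.urlopen(down + "/api/tags", timeout=2)
    except urllib.error.URLError as exc:
        print("urlopen raised:", type(exc).__name__)
    print("first_reachable:", first_reachable([down]))
```

With the old ternary, the exception escaped before the `else` branch could ever be evaluated, which is why the fallback was dead code.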
@@ -83,18 +83,23 @@ If neither endpoint responds: verify Tailscale (`tailscale status`) and whether
 
 Use the `/api/chat` endpoint with `think:false` for qwen3 models. The older `/api/generate` endpoint on qwen3 puts output into thinking tokens that don't appear in the `response` field — you'll get an empty response if you use `/api/generate`.
 
-Preferred one-liner:
+Preferred one-liner — endpoint **and** model come from `identity.json` (consistent with
+**Endpoints** above; no per-call probe). The old inline auto-detect was REMOVED: it called
+`urlopen()` as a truthiness test, which *raises* `URLError` on a down host instead of
+yielding the fallback — so it crashed on a down localhost rather than failing over to Beast,
+and it violated the "no per-call probe" rule.
 ```bash
-python -c "
+OLLAMA="${OLLAMA:-$(jq -r '.ollama.endpoint // .ollama.fallback // "http://localhost:11434"' .claude/identity.json)}"
+MODEL="${MODEL:-$(jq -r '.ollama.prose_model // "qwen3:14b"' .claude/identity.json)}"
+OLLAMA="$OLLAMA" MODEL="$MODEL" python -c "
 import urllib.request, json, sys, os
-OLLAMA = os.environ.get('OLLAMA') or ('http://localhost:11434' if __import__('urllib.request').request.urlopen(urllib.request.Request('http://localhost:11434/api/tags'),timeout=2) else 'http://100.101.122.4:11434')
 body = json.dumps({
-'model':'qwen3:14b',
+'model': os.environ['MODEL'],
 'messages':[{'role':'user','content': sys.argv[1]}],
 'stream':False,
 'think':False
 }).encode()
-res = json.loads(urllib.request.urlopen(urllib.request.Request(OLLAMA+'/api/chat', body), timeout=120).read())
+res = json.loads(urllib.request.urlopen(urllib.request.Request(os.environ['OLLAMA']+'/api/chat', body), timeout=120).read())
 print(res['message']['content'])
 " "Your prompt here"
 ```
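For callers that are already in Python, the two `jq` lookups in the new one-liner could be mirrored roughly as follows. This is a minimal sketch under stated assumptions, not part of the commit: the function name `resolve_ollama` is hypothetical, the key names (`ollama.endpoint`, `ollama.fallback`, `ollama.prose_model`) and defaults are taken from the `jq` expressions in the diff, and the file is assumed to hold a JSON object at the top level.

```python
import json
import os

# Defaults copied from the jq expressions in the one-liner.
DEFAULT_ENDPOINT = "http://localhost:11434"
DEFAULT_MODEL = "qwen3:14b"


def resolve_ollama(identity_path=".claude/identity.json", env=None):
    """Return (endpoint, model) from env overrides, then identity.json, then defaults."""
    env = os.environ if env is None else env
    try:
        with open(identity_path, encoding="utf-8") as fh:
            cfg = json.load(fh).get("ollama") or {}
    except (OSError, json.JSONDecodeError):
        # Assumption: a missing or unreadable file falls through to the defaults.
        cfg = {}
    # `or` skips None and empty strings; jq's `//` skips only null and false.
    endpoint = (env.get("OLLAMA") or cfg.get("endpoint")
                or cfg.get("fallback") or DEFAULT_ENDPOINT)
    model = env.get("MODEL") or cfg.get("prose_model") or DEFAULT_MODEL
    return endpoint, model


if __name__ == "__main__":
    print(resolve_ollama())
```

Like the shell version, this only reads configuration and never contacts the server, so it stays within the "no per-call probe" rule. One behavioral difference from the shell one-liner: there, a missing `identity.json` makes `jq` fail and leaves the variable empty, whereas this sketch falls back to the defaults.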