claudetools/.claude/OLLAMA.md
Mike Swanson 936ea49b33 fix: replace python3 with py/jq throughout scripts and docs
Windows Store python3 stub returns exit 49 instead of running Python.
Replace with: py (Windows launcher) for actual Python code, jq for
simple JSON extraction. Reorder fallback loops to try py first.
Add Bash(py:*) to settings.local.json allowlist.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-20 12:14:43 -07:00


Ollama — Local AI Reference

Ollama runs on Mike's workstation (DESKTOP-0O8A1RL) with GPU acceleration. It is available to all team members via Tailscale.

Models

| Model | Size | Use For |
|---|---|---|
| qwen3:14b | 9.3 GB | Summarization, classification, data extraction, drafting |
| codestral:22b | 12 GB | Code generation, refactoring suggestions, docstrings |
| nomic-embed-text | 274 MB | Embeddings only (used by GrepAI) |

Endpoints

  • DESKTOP-0O8A1RL (local): http://localhost:11434
  • Any other machine (Tailscale required): http://100.92.127.64:11434

Check reachability:

curl -s http://100.92.127.64:11434/api/tags | jq -r '.models[].name'

If it fails: verify Tailscale is connected (tailscale status) and Mike's workstation is online.
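
The same reachability check can be sketched in Python for machines without curl or jq. This is a minimal sketch, not part of the existing scripts: the function names `model_names` and `list_models` and the 5-second timeout are assumptions chosen for illustration. The `/api/tags` path and the `.models[].name` shape are taken from the curl command above.

```python
from __future__ import annotations

import json
import urllib.error
import urllib.request

TAILSCALE_URL = "http://100.92.127.64:11434"  # Tailscale endpoint from this document


def model_names(tags_payload: dict) -> list[str]:
    """Extract model names from an /api/tags response body (same as jq '.models[].name')."""
    return [model["name"] for model in tags_payload.get("models", [])]


def list_models(base_url: str = TAILSCALE_URL, timeout: float = 5.0) -> list[str] | None:
    """Return the installed model names, or None if the server cannot be reached.

    The timeout value is an assumption; adjust it to taste.
    """
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return model_names(json.load(resp))
    except OSError:  # covers URLError, connection refused, and socket timeouts
        return None
```

A `None` result from `list_models()` corresponds to the failure case above: check `tailscale status` and confirm the workstation is online.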

Access Control

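  • Port 11434 is allowed ONLY from the Tailscale subnet (firewall rule: 100.0.0.0/8; Tailscale itself assigns addresses from the narrower 100.64.0.0/10)
  • NOT exposed to LAN, other VPNs, or the internet
  • Binding: OLLAMA_HOST=0.0.0.0:11434 (Ollama listens on all interfaces; the firewall rule is what restricts access)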
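
To see what the firewall range actually admits, the check can be sketched with Python's standard `ipaddress` module. The 100.0.0.0/8 range is the one listed above; 100.64.0.0/10 is the CGNAT block Tailscale assigns addresses from. The helper name `allowed_by` and the sample addresses are illustrative only.

```python
import ipaddress

# Range named in this document's firewall rule.
DOCUMENTED_RULE = ipaddress.ip_network("100.0.0.0/8")
# CGNAT block that Tailscale assigns node addresses from.
TAILSCALE_CGNAT = ipaddress.ip_network("100.64.0.0/10")


def allowed_by(network: ipaddress.IPv4Network, address: str) -> bool:
    """Return True if the address falls inside the given network."""
    return ipaddress.ip_address(address) in network


# The workstation's Tailscale address is inside both ranges.
print(allowed_by(DOCUMENTED_RULE, "100.92.127.64"))  # True
print(allowed_by(TAILSCALE_CGNAT, "100.92.127.64"))  # True
# An address outside the CGNAT block still matches the wider /8 rule.
print(allowed_by(DOCUMENTED_RULE, "100.1.2.3"))      # True
print(allowed_by(TAILSCALE_CGNAT, "100.1.2.3"))      # False
```

Because the interface is bound to 0.0.0.0, the firewall rule is the only barrier, so the narrower /10 may be worth considering if the rule is ever revisited.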

Calling Ollama

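First resolve the endpoint from .claude/identity.json: use localhost on DESKTOP-0O8A1RL and the Tailscale address everywhere else, including when the file is missing: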

OLLAMA=$([ "$(jq -r .machine .claude/identity.json 2>/dev/null)" = "DESKTOP-0O8A1RL" ] \
  && echo "http://localhost:11434" || echo "http://100.92.127.64:11434")
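
For scripts already written in Python, the same resolution can be sketched without jq. This assumes, as the shell snippet above does, that identity.json carries a top-level `machine` field; the function names `endpoint_for` and `resolve_endpoint` are hypothetical.

```python
import json
from pathlib import Path

HOST_MACHINE = "DESKTOP-0O8A1RL"
LOCAL_URL = "http://localhost:11434"
TAILSCALE_URL = "http://100.92.127.64:11434"


def endpoint_for(machine):
    """Pick localhost on the Ollama host itself, the Tailscale address otherwise."""
    return LOCAL_URL if machine == HOST_MACHINE else TAILSCALE_URL


def resolve_endpoint(identity_path=".claude/identity.json"):
    """Read the machine name from identity.json and return the matching base URL.

    A missing or unreadable file falls back to the Tailscale address,
    mirroring the 2>/dev/null fallback in the shell version.
    """
    try:
        data = json.loads(Path(identity_path).read_text(encoding="utf-8"))
        machine = data.get("machine")
    except (OSError, ValueError, AttributeError):
        machine = None
    return endpoint_for(machine)
```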

Preferred invocation (the prompt and the resolved endpoint are passed as arguments, which avoids shell escaping):

py -c "
import urllib.request, json, sys
# sys.argv[1] is the prompt; sys.argv[2] is the base URL resolved above
url = sys.argv[2] + '/api/generate'
body = json.dumps({'model': 'qwen3:14b', 'prompt': sys.argv[1], 'stream': False}).encode()
res = json.loads(urllib.request.urlopen(urllib.request.Request(url, body)).read())
print(res['response'])
" "Your prompt here" "$OLLAMA"

For code suggestions, swap qwen3:14b for codestral:22b.
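
When the call is made from a Python script rather than the shell, the request can be wrapped in a small helper. This is a sketch under the same assumptions as the snippet above (non-streaming `/api/generate`, reply text in the `response` field); the names `build_request` and `generate` and the 120-second timeout are illustrative, not an existing module.

```python
import json
import urllib.request


def build_request(base_url, prompt, model="qwen3:14b"):
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(base_url, prompt, model="qwen3:14b", timeout=120.0):
    """Send the prompt and return the generated text.

    The timeout is an assumed value; large prompts may need more.
    """
    request = build_request(base_url, prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)["response"]
```

A call such as `generate("http://100.92.127.64:11434", "Suggest a refactor for ...", model="codestral:22b")` would cover the code-suggestion case.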

When to Use Which Model

| Task | Model |
|---|---|
| Summarize logs, diffs, session notes | qwen3:14b |
| Classify bug type, severity, category | qwen3:14b |
| Extract structured data from text | qwen3:14b |
| Draft commit message from diff | qwen3:14b |
| Suggest refactor for a function | codestral:22b |
| Docstring / comment generation | codestral:22b |
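
If scripts need to choose a model programmatically, the table above could be encoded as a lookup. The task keys below are hypothetical labels invented for this sketch; only the model names come from this document.

```python
# Task labels are illustrative; the model assignments follow the table above.
MODEL_FOR_TASK = {
    "summarize": "qwen3:14b",
    "classify": "qwen3:14b",
    "extract": "qwen3:14b",
    "commit_message": "qwen3:14b",
    "refactor": "codestral:22b",
    "docstring": "codestral:22b",
}


def pick_model(task):
    """Return the model for a known task label, or raise ValueError for an unknown one."""
    try:
        return MODEL_FOR_TASK[task]
    except KeyError:
        known = ", ".join(sorted(MODEL_FOR_TASK))
        raise ValueError(f"unknown task {task!r}; expected one of: {known}") from None
```

Raising on unknown labels, rather than defaulting to one model, keeps a typo from silently routing a code task to the general-purpose model.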

Review Policy

  • Low-stakes output (summary, classification, draft) — use directly
  • Code suggestions from codestral — always review before applying
  • Never use Ollama for: auth decisions, credential handling, production migrations, security review
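
The policy above could be enforced in a wrapper script with a guard along these lines. This is only a sketch: the task labels, the function name `review_requirement`, and the "direct"/"review" return values are assumptions made for illustration.

```python
# Hypothetical labels for the task categories this document rules out.
FORBIDDEN_TASKS = frozenset({
    "auth_decision",
    "credential_handling",
    "production_migration",
    "security_review",
})


def review_requirement(task, model):
    """Return 'review' or 'direct' for a task, or refuse forbidden categories.

    - Forbidden categories raise PermissionError before any request is made.
    - Output from a codestral model is always flagged for human review.
    - Everything else (summaries, classifications, drafts) may be used directly.
    """
    if task in FORBIDDEN_TASKS:
        raise PermissionError(f"Ollama must not be used for {task!r}")
    return "review" if model.startswith("codestral") else "direct"
```

Calling the guard before sending a prompt keeps a forbidden request from reaching the model at all.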