
Mike Swanson 936ea49b33 fix: replace python3 with py/jq throughout scripts and docs

Windows Store python3 stub returns exit 49 instead of running Python.
Replace with: py (Windows launcher) for actual Python code, jq for
simple JSON extraction. Reorder fallback loops to try py first.
Add Bash(py:*) to settings.local.json allowlist.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-04-20 12:14:43 -07:00

2.2 KiB


Ollama — Local AI Reference

Ollama runs on Mike's workstation (DESKTOP-0O8A1RL) with GPU acceleration. It is available to all team members via Tailscale.

Models

Model	Size	Use For
`qwen3:14b`	9.3 GB	Summarization, classification, data extraction, drafting
`codestral:22b`	12 GB	Code generation, refactoring suggestions, docstrings
`nomic-embed-text`	274 MB	Embeddings only (used by GrepAI)

Endpoints

DESKTOP-0O8A1RL (local): http://localhost:11434
Any other machine (Tailscale required): http://100.92.127.64:11434

Check reachability:

curl -s http://100.92.127.64:11434/api/tags | jq -r '.models[].name'

If the check fails, verify that Tailscale is connected (tailscale status) and that Mike's workstation is online.
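For scripts that already use py, the same reachability check can be sketched in Python. This is a minimal sketch, not part of the repo: the helper names model_names and list_models and the 5-second timeout are assumptions; the URL and the /api/tags path are the ones used in the curl command above.

```python
import json
import urllib.request

TAILSCALE_URL = "http://100.92.127.64:11434"  # endpoint listed in this document


def model_names(tags_payload: bytes) -> list[str]:
    """Extract model names from an /api/tags response body (hypothetical helper)."""
    data = json.loads(tags_payload)
    return [m["name"] for m in data.get("models", [])]


def list_models(base_url: str = TAILSCALE_URL, timeout: float = 5.0) -> list[str]:
    """Return the model names the server reports.

    Raises OSError (URLError and timeouts are subclasses) if the host is
    unreachable. The 5 s timeout is an assumption, not a documented value.
    """
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return model_names(resp.read())
```

Calling list_models() inside a try/except OSError gives the same signal as the curl check: an exception suggests Tailscale is disconnected or the workstation is offline.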

Access Control

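Port 11434 is allowed ONLY from 100.0.0.0/8, which covers the Tailscale address range (100.64.0.0/10)
NOT exposed to LAN, VPN, or internet
Binding: OLLAMA_HOST=0.0.0.0:11434 (Ollama listens on all interfaces; the firewall rule is what restricts access)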
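To see which client addresses the rule above would admit, the range can be checked with Python's ipaddress module. This is an illustrative sketch only: is_allowed is a hypothetical helper, and 100.64.0.0/10 is the range Tailscale assigns addresses from, included here for comparison with the broader 100.0.0.0/8 rule.

```python
import ipaddress

ALLOWED = ipaddress.ip_network("100.0.0.0/8")            # range named in the firewall rule
TAILSCALE_CGNAT = ipaddress.ip_network("100.64.0.0/10")  # range Tailscale assigns from


def is_allowed(addr: str) -> bool:
    """Return True if addr falls inside the allowed range (hypothetical helper)."""
    return ipaddress.ip_address(addr) in ALLOWED


# The Tailscale range is a strict subset of the allowed range.
assert TAILSCALE_CGNAT.subnet_of(ALLOWED)
```

For example, is_allowed("100.92.127.64") is true for the workstation's Tailscale address, while a typical LAN address such as 192.168.1.10 is rejected.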

Calling Ollama

Resolve endpoint from identity.json first:

OLLAMA=$([ "$(jq -r .machine .claude/identity.json 2>/dev/null)" = "DESKTOP-0O8A1RL" ] \
  && echo "http://localhost:11434" || echo "http://100.92.127.64:11434")
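The same resolution can be done in Python when jq is not available. This is a sketch under the assumption that .claude/identity.json is a JSON object with a "machine" key, as the jq expression above implies; resolve_endpoint is a hypothetical name.

```python
import json
from pathlib import Path

HOST_MACHINE = "DESKTOP-0O8A1RL"
LOCAL_URL = "http://localhost:11434"
TAILSCALE_URL = "http://100.92.127.64:11434"


def resolve_endpoint(identity_path: str = ".claude/identity.json") -> str:
    """Pick the local URL on the host machine, the Tailscale URL everywhere else."""
    try:
        text = Path(identity_path).read_text(encoding="utf-8")
        machine = json.loads(text).get("machine")
    except (OSError, ValueError, AttributeError):
        # Missing file, invalid JSON, or a non-object document: fall back to
        # the Tailscale URL, mirroring the shell version's behavior.
        machine = None
    return LOCAL_URL if machine == HOST_MACHINE else TAILSCALE_URL
```

Like the shell version, a missing or unreadable identity file falls through to the Tailscale address rather than raising.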

Preferred call (passing the prompt as an argument avoids shell escaping):

py -c "
import urllib.request, json, sys
url = '${OLLAMA:-http://localhost:11434}/api/generate'  # shell fills in OLLAMA if set, else localhost
body = json.dumps({'model':'qwen3:14b','prompt': sys.argv[1],'stream':False}).encode()
res = json.loads(urllib.request.urlopen(urllib.request.Request(url, body)).read())
print(res['response'])
" "Your prompt here"

For code suggestions, swap qwen3:14b for codestral:22b.
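For longer scripts, the inline call can be factored into a small module. The sketch below assumes the same /api/generate payload as the snippet above; build_request and generate are hypothetical names, and the explicit Content-Type header and 120-second timeout are assumptions rather than documented requirements.

```python
import json
import urllib.request


def build_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST to /api/generate (hypothetical helper)."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(base_url: str, prompt: str, model: str = "qwen3:14b",
             timeout: float = 120.0) -> str:
    """Send the prompt and return the text in the 'response' field.

    The timeout is an assumed value; large models can take a while to load.
    """
    req = build_request(base_url, model, prompt)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Usage would look like generate("http://100.92.127.64:11434", "Summarize this diff: ...") for qwen3:14b, or the same call with model="codestral:22b" for code suggestions.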

When to Use Which Model

Task	Model
Summarize logs, diffs, session notes	qwen3:14b
Classify bug type, severity, category	qwen3:14b
Extract structured data from text	qwen3:14b
Draft commit message from diff	qwen3:14b
Suggest refactor for a function	codestral:22b
Docstring / comment generation	codestral:22b

Review Policy

Low-stakes output (summary, classification, draft) — use directly
Code suggestions from codestral — always review before applying
Never use Ollama for: auth decisions, credential handling, production migrations, security review
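One way to keep the policy above from depending on memory is a small gate that scripts consult before calling a model. This is a hypothetical sketch: the topic labels and the three return values are invented for illustration and are not part of the repo.

```python
# Topic labels are illustrative; they correspond to the "never use" list above.
FORBIDDEN_TOPICS = {"auth", "credentials", "production_migration", "security_review"}


def review_requirement(model: str, topic: str) -> str:
    """Classify a planned call as 'forbidden', 'review', or 'direct'."""
    if topic in FORBIDDEN_TOPICS:
        return "forbidden"   # never send these to Ollama
    if model.startswith("codestral"):
        return "review"      # code suggestions are always reviewed before applying
    return "direct"          # low-stakes output can be used as-is
```

A caller might refuse to proceed on "forbidden" and print a reminder on "review"; how strictly to enforce this is a team decision the policy itself does not specify.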
