diff --git a/.claude/CLAUDE.md b/.claude/CLAUDE.md
index cd3a58b..56e0d95 100644
--- a/.claude/CLAUDE.md
+++ b/.claude/CLAUDE.md
@@ -290,7 +290,7 @@ Service account token in vault: `infrastructure/1password-service-account.sops.y
 
 ## Local AI (Ollama)
 
-Ollama runs locally with GPU acceleration for tasks that don't need Claude-level reasoning.
+Ollama runs on Mike's workstation (DESKTOP-0O8A1RL) with GPU acceleration. Available to all team members via Tailscale.
 
 | Model | Size | Use For |
 |-------|------|---------|
@@ -298,11 +298,37 @@ Ollama runs locally with GPU acceleration for tasks that don't need Claude-level
 | `codestral:22b` | 12 GB | Code generation, refactoring suggestions, docstrings |
 | `nomic-embed-text` | 274 MB | Embeddings only (used by GrepAI) |
 
+### How to connect
+
+**On Mike's workstation (local):**
 ```bash
-# Simple prompt
 curl -s http://localhost:11434/api/generate -d '{"model":"qwen3:14b","prompt":"...","stream":false}' | jq -r '.response'
 ```
 
+**On any other machine via Tailscale:**
+```bash
+curl -s http://100.92.127.64:11434/api/generate -d '{"model":"qwen3:14b","prompt":"...","stream":false}' | jq -r '.response'
+```
+
+### Per-machine setup
+
+Read `.claude/identity.json` to determine which machine you're on:
+- **DESKTOP-0O8A1RL** (Mike's workstation): Ollama runs locally. Use `localhost:11434`.
+- **Any other machine** (Howard's laptop, other workstations): Ollama is remote via Tailscale. Use `100.92.127.64:11434`. Requires Tailscale to be connected.
+
+**To check if Ollama is reachable:**
+```bash
+curl -s http://100.92.127.64:11434/api/tags | python -c "import sys,json; [print(m['name']) for m in json.load(sys.stdin).get('models',[])]"
+```
+
+If it fails: verify Tailscale is connected (`tailscale status`), and that Mike's workstation is online.
+
+### Access control
+
+- Firewall rule on Mike's workstation allows port 11434 ONLY from Tailscale subnet (100.0.0.0/8)
+- NOT exposed to LAN, VPN, or internet
+- Binding: `OLLAMA_HOST=0.0.0.0:11434` (all interfaces, firewall restricts)
+
 **Review policy:** Always review Critical/High impact Ollama outputs (auth, security, migrations, production). Trust Low impact (classification, formatting). Flag uncertainty to user.
 
 ### GrepAI (Semantic Code Search)
diff --git a/.claude/ONBOARDING.md b/.claude/ONBOARDING.md
index 96447d2..cf467d6 100644
--- a/.claude/ONBOARDING.md
+++ b/.claude/ONBOARDING.md
@@ -154,9 +154,16 @@ Ollama runs AI models locally on your GPU. Used for tasks that don't need Claude
 - `codestral:22b` — code generation assistance
 - `nomic-embed-text` — embeddings for semantic search
 
-**Status:** May not be installed/running on your machine yet. If Claude says "Ollama not available", it's fine — it falls back to doing the work itself.
+**Ollama runs on Mike's workstation** and is shared via Tailscale. You don't need to install it locally.
 
-**To check:** `curl http://localhost:11434/api/tags` — if it returns models, Ollama is running.
+**To use from your machine (Tailscale must be connected):**
+```bash
+curl -s http://100.92.127.64:11434/api/tags
+```
+
+If that returns models, you're connected. Claude automatically uses the right URL based on which machine you're on (reads from `identity.json`).
+
+If it fails: check that Tailscale is connected (`tailscale status`) and Mike's workstation is online.
 
 ### GrepAI (semantic code search)
 
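
Review note: the per-machine rule this change documents (use `localhost:11434` on DESKTOP-0O8A1RL, otherwise `100.92.127.64:11434`) could be sketched roughly as below. This is only an illustration, not part of the patch and not how Claude necessarily resolves the URL. The diff does not show the schema of `.claude/identity.json`, so the `"hostname"` key is an assumption, and the function names `ollama_base_url` and `model_names` are hypothetical. The `models`/`name` fields mirror the `python -c` one-liner in the diff.

```python
import json

# Values taken from the diff above.
LOCAL_HOSTNAME = "DESKTOP-0O8A1RL"
TAILSCALE_ADDR = "100.92.127.64"
OLLAMA_PORT = 11434


def ollama_base_url(identity: dict) -> str:
    """Return the Ollama base URL for the machine described by `identity`.

    ASSUMPTION: identity.json exposes the machine name under a "hostname"
    key. The diff does not show the file's schema, so this key is hypothetical.
    """
    hostname = str(identity.get("hostname", ""))
    # Mike's workstation talks to Ollama locally; everything else goes via Tailscale.
    host = "localhost" if hostname.upper() == LOCAL_HOSTNAME else TAILSCALE_ADDR
    return f"http://{host}:{OLLAMA_PORT}"


def model_names(tags_payload: str) -> list[str]:
    """Extract model names from the JSON body returned by /api/tags."""
    return [m["name"] for m in json.loads(tags_payload).get("models", [])]


if __name__ == "__main__":
    print(ollama_base_url({"hostname": "DESKTOP-0O8A1RL"}))
    print(ollama_base_url({"hostname": "some-other-machine"}))
```

An unknown or missing hostname falls through to the Tailscale address, which matches the "any other machine" wording in the patch; whether that default is desirable when Tailscale is disconnected is left open here.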