Ollama shared via Tailscale: per-machine URL detection + Howard access
CLAUDE.md: Ollama section rewritten. localhost for Mike's workstation, 100.92.127.64:11434 via Tailscale for all other machines. Claude reads identity.json hostname to determine which URL to use. Firewall rule restricts to Tailscale 100.0.0.0/8 subnet only.

ONBOARDING.md: updated Ollama section for remote access.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -290,7 +290,7 @@ Service account token in vault: `infrastructure/1password-service-account.sops.y

## Local AI (Ollama)

Ollama runs locally with GPU acceleration for tasks that don't need Claude-level reasoning.
Ollama runs on Mike's workstation (DESKTOP-0O8A1RL) with GPU acceleration. Available to all team members via Tailscale.

| Model | Size | Use For |
|-------|------|---------|
@@ -298,11 +298,37 @@ Ollama runs locally with GPU acceleration for tasks that don't need Claude-level
| `codestral:22b` | 12 GB | Code generation, refactoring suggestions, docstrings |
| `nomic-embed-text` | 274 MB | Embeddings only (used by GrepAI) |

### How to connect

**On Mike's workstation (local):**
```bash
# Simple prompt
curl -s http://localhost:11434/api/generate -d '{"model":"qwen3:14b","prompt":"...","stream":false}' | jq -r '.response'
```

**On any other machine via Tailscale:**
```bash
curl -s http://100.92.127.64:11434/api/generate -d '{"model":"qwen3:14b","prompt":"...","stream":false}' | jq -r '.response'
```
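
For scripted use, the same request can be made from Python. This is a minimal sketch using only the standard library. The function names `build_generate_request` and `generate` and the 120-second default timeout are illustrative assumptions, not something defined in this repo.

```python
import json
import urllib.request


def build_generate_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST for Ollama's /api/generate endpoint."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(base_url: str, model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt and return the text from the 'response' field."""
    request = build_generate_request(base_url, model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return json.loads(reply.read())["response"]


# Example call (requires a reachable Ollama server):
# print(generate("http://localhost:11434", "qwen3:14b", "Say hello in one word."))
```

Pass `http://localhost:11434` on Mike's workstation and `http://100.92.127.64:11434` on any other machine.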

### Per-machine setup

Read `.claude/identity.json` to determine which machine you're on:
- **DESKTOP-0O8A1RL** (Mike's workstation): Ollama runs locally. Use `localhost:11434`.
- **Any other machine** (Howard's laptop, other workstations): Ollama is remote via Tailscale. Use `100.92.127.64:11434`. Requires Tailscale to be connected.
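
The selection rule above could be sketched as follows. This assumes `identity.json` is a JSON object with a `hostname` key; the commit message mentions the hostname, but the exact schema is not shown here, so treat the key name and the function name `ollama_base_url` as assumptions.

```python
import json
from pathlib import Path

LOCAL_HOSTNAME = "DESKTOP-0O8A1RL"  # Mike's workstation
LOCAL_URL = "http://localhost:11434"
TAILSCALE_URL = "http://100.92.127.64:11434"


def ollama_base_url(identity_path: str = ".claude/identity.json") -> str:
    """Return the Ollama base URL for the machine described by identity.json."""
    # Assumption: the file holds a JSON object with a "hostname" key.
    identity = json.loads(Path(identity_path).read_text(encoding="utf-8"))
    hostname = str(identity.get("hostname", ""))
    # Compare case-insensitively, since Windows hostnames are not case-sensitive.
    if hostname.upper() == LOCAL_HOSTNAME:
        return LOCAL_URL
    # Every other machine goes through Tailscale.
    return TAILSCALE_URL
```

A missing or unrecognized hostname falls through to the Tailscale URL, which matches the "any other machine" rule.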

**To check if Ollama is reachable:**
```bash
curl -s http://100.92.127.64:11434/api/tags | python -c "import sys,json; [print(m['name']) for m in json.load(sys.stdin).get('models',[])]"
```

If it fails, verify that Tailscale is connected (`tailscale status`) and that Mike's workstation is online.
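
A reachability check with a timeout might look like the sketch below, so that a disconnected Tailscale does not hang the caller. The names `model_names` and `list_models` and the 5-second timeout are illustrative assumptions. The `models` and `name` fields are the same ones the one-liner above reads.

```python
import json
import urllib.request
from typing import Optional


def model_names(tags_payload: dict) -> list:
    """Extract model names from an /api/tags response body."""
    return [model["name"] for model in tags_payload.get("models", [])]


def list_models(base_url: str, timeout: float = 5.0) -> Optional[list]:
    """Return the model names served at base_url, or None if unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as reply:
            return model_names(json.loads(reply.read()))
    except (OSError, ValueError):
        # OSError covers refused connections and timeouts (URLError subclasses it);
        # ValueError covers a reply that is not valid JSON.
        return None
```

A `None` result means the server could not be reached or did not answer with JSON; an empty list means it answered but has no models installed.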

### Access control

- Firewall rule on Mike's workstation allows port 11434 ONLY from 100.0.0.0/8, which contains the Tailscale address range (100.64.0.0/10)
- NOT exposed to LAN, VPN, or internet
- Binding: `OLLAMA_HOST=0.0.0.0:11434` (all interfaces, firewall restricts)
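
To sanity-check which source addresses the rule admits, Python's standard `ipaddress` module is enough. In this sketch the `100.64.0.0/10` constant is not taken from this document: it is the CGNAT block that Tailscale assigns addresses from. The function names are illustrative.

```python
import ipaddress

FIREWALL_RANGE = ipaddress.ip_network("100.0.0.0/8")     # range named in the firewall rule
TAILSCALE_RANGE = ipaddress.ip_network("100.64.0.0/10")  # block Tailscale assigns from


def allowed_by_rule(address: str) -> bool:
    """True if the firewall range admits this source address."""
    return ipaddress.ip_address(address) in FIREWALL_RANGE


def is_tailscale_address(address: str) -> bool:
    """True if the address lies in the Tailscale (CGNAT) block."""
    return ipaddress.ip_address(address) in TAILSCALE_RANGE
```

Because `100.64.0.0/10` is a subset of `100.0.0.0/8`, every Tailscale peer passes the rule, but so does any other address in `100.0.0.0/8`. Narrowing the rule to `100.64.0.0/10` would be the stricter choice.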

**Review policy:** Always review Critical/High impact Ollama outputs (auth, security, migrations, production). Trust Low impact (classification, formatting). Flag uncertainty to user.

### GrepAI (Semantic Code Search)

@@ -154,9 +154,16 @@ Ollama runs AI models locally on your GPU. Used for tasks that don't need Claude
- `codestral:22b` — code generation assistance
- `nomic-embed-text` — embeddings for semantic search

**Status:** May not be installed/running on your machine yet. If Claude says "Ollama not available", it's fine — it falls back to doing the work itself.
**Ollama runs on Mike's workstation** and is shared via Tailscale. You don't need to install it locally.

**To check:** `curl http://localhost:11434/api/tags` — if it returns models, Ollama is running.
**To use from your machine (Tailscale must be connected):**
```bash
curl -s http://100.92.127.64:11434/api/tags
```

If that returns models, you're connected. Claude automatically uses the right URL based on which machine you're on (reads from `identity.json`).

If it fails: check that Tailscale is connected (`tailscale status`) and Mike's workstation is online.

### GrepAI (semantic code search)