sync: Add Wrightstown Solar and Smart Home projects

New projects from 2026-02-09 research session:

Wrightstown Solar:
- DIY 48V LiFePO4 battery storage (EVE C40 cells)
- Victron MultiPlus II whole-house UPS design
- BMS comparison (Victron CAN bus compatible)
- EV salvage analysis (new cells won)
- Full parts list and budget

Wrightstown Smart Home:
- Home Assistant Yellow setup (local voice, no cloud)
- Local LLM server build guide (Ollama + RTX 4090)
- Hybrid LLM bridge (LiteLLM + Claude API + Grok API)
- Network security (VLAN architecture, PII sanitization)

Machine: ACG-M-L5090
Timestamp: 2026-02-09

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

# Hybrid LLM Bridge - Local + Cloud Routing
**Created:** 2026-02-09
**Purpose:** Route queries intelligently between local Ollama, Claude API, and Grok API
---
## Architecture
```
User Query (voice, chat, HA automation)
                 |
          [LiteLLM Proxy]
          localhost:4000
                 |
         Routing Decision
         /       |       \
   [Ollama]  [Claude]   [Grok]
    Local    Anthropic   xAI
    Free     Reasoning   Search
   Private   $3/$15/1M  $3/$15/1M
```
---
## Recommended: LiteLLM Proxy
Unified API gateway that presents a single OpenAI-compatible endpoint. Everything talks to `localhost:4000` and LiteLLM routes to the right backend.
### Installation
```bash
pip install 'litellm[proxy]'
```
### Configuration (`config.yaml`)
```yaml
model_list:
  # Local models (free, private)
  - model_name: local-fast
    litellm_params:
      model: ollama/qwen2.5:7b
      api_base: http://localhost:11434
  - model_name: local-reasoning
    litellm_params:
      model: ollama/llama3.1:70b-q4
      api_base: http://localhost:11434

  # Cloud: Claude (complex reasoning)
  - model_name: cloud-reasoning
    litellm_params:
      model: anthropic/claude-sonnet-4-5-20250929
      api_key: os.environ/ANTHROPIC_API_KEY   # read from the environment, never hardcode keys
  - model_name: cloud-reasoning-cheap
    litellm_params:
      model: anthropic/claude-haiku-4-5-20251001
      api_key: os.environ/ANTHROPIC_API_KEY

  # Cloud: Grok (internet search)
  - model_name: cloud-search
    litellm_params:
      model: xai/grok-4
      api_key: os.environ/XAI_API_KEY
      api_base: https://api.x.ai/v1

router_settings:
  routing_strategy: simple-shuffle
  allowed_fails: 2
  num_retries: 3

# Per-route spending targets (aspirational; check the LiteLLM docs for the exact budget syntax)
budget_policy:
  local-fast: unlimited
  local-reasoning: unlimited
  cloud-reasoning: $50/month
  cloud-reasoning-cheap: $25/month
  cloud-search: $25/month
```
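The config above reads API keys from the environment via `os.environ/...` references, so export them in the shell that launches the proxy (the variable names just need to match the config):
```bash
# Placeholders, same as above -- substitute your real keys
export ANTHROPIC_API_KEY="sk-ant-XXXXX"
export XAI_API_KEY="xai-XXXXX"
```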
### Start the Proxy
```bash
litellm --config config.yaml --port 4000
```
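Once it's running, a quick sanity check against the OpenAI-compatible surface (list the configured models, then send a test completion routed to the local backend):
```bash
curl http://localhost:4000/v1/models

curl http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "local-fast", "messages": [{"role": "user", "content": "hello"}]}'
```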
### Usage
Everything talks to `http://localhost:4000` with OpenAI-compatible format:
```python
import openai

client = openai.OpenAI(
    api_key="anything",  # LiteLLM doesn't need this for local
    base_url="http://localhost:4000"
)

# Route to local
response = client.chat.completions.create(
    model="local-fast",
    messages=[{"role": "user", "content": "Turn on the lights"}]
)

# Route to Claude
response = client.chat.completions.create(
    model="cloud-reasoning",
    messages=[{"role": "user", "content": "Analyze my energy usage patterns"}]
)

# Route to Grok
response = client.chat.completions.create(
    model="cloud-search",
    messages=[{"role": "user", "content": "What's the current electricity rate in PA?"}]
)
```
---
## Routing Strategy
### What Goes Where
**Local (Ollama) -- Default for everything private:**
- Home automation commands ("turn on lights", "set thermostat to 72")
- Sensor data queries ("what's the temperature in the garage?")
- Camera-related queries (never send video to cloud)
- Personal information queries
- Simple Q&A
- Quick lookups from local knowledge
**Claude API -- Complex reasoning tasks:**
- Detailed analysis ("analyze my energy trends this month")
- Code generation ("write an HA automation for...")
- Long-form content creation
- Multi-step reasoning problems
- Function calling for HA service control
**Grok API -- Internet/real-time data:**
- Current events ("latest news on solar tariffs")
- Real-time pricing ("current electricity rates")
- Weather data (if not using local integration)
- Web searches
- Anything requiring information the local model doesn't have
### Manual vs Automatic Routing
**Phase 1 (Start here):** Manual model selection
- User picks "local-fast", "cloud-reasoning", or "cloud-search" in Open WebUI
- Simple, no mistakes, full control
- Good for learning which queries work best where
**Phase 2 (Later):** Keyword-based routing in LiteLLM
- Route based on keywords in the query
- "search", "latest", "current" --> Grok
- "analyze", "explain in detail", "write code" --> Claude
- Everything else --> local
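A minimal Phase 2 sketch, done in the client in front of the proxy rather than inside LiteLLM's own config (model names come from the config above; the keyword lists are illustrative, not exhaustive):
```python
import openai

client = openai.OpenAI(api_key="anything", base_url="http://localhost:4000")

SEARCH_HINTS = ("search", "latest", "current", "news", "price")
REASONING_HINTS = ("analyze", "explain in detail", "write code", "write an automation")

def pick_model(query: str) -> str:
    """Crude keyword routing: only go to the cloud when the query clearly needs it."""
    q = query.lower()
    if any(k in q for k in SEARCH_HINTS):
        return "cloud-search"
    if any(k in q for k in REASONING_HINTS):
        return "cloud-reasoning"
    return "local-fast"

def ask(query: str) -> str:
    response = client.chat.completions.create(
        model=pick_model(query),
        messages=[{"role": "user", "content": query}],
    )
    return response.choices[0].message.content
```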
**Phase 3 (Advanced):** Semantic routing
- Use sentence embeddings to classify query intent
- Small local model (all-MiniLM-L6-v2) classifies in 50-200ms
- Most intelligent routing, but requires Python development
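A rough Phase 3 sketch using sentence-transformers with `all-MiniLM-L6-v2`; the example queries per intent are placeholders you would tune against real usage:
```python
# pip install sentence-transformers
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # small local model; see the 50-200ms estimate above

# Illustrative examples; each intent maps to a LiteLLM model name from the config
INTENTS = {
    "local-fast": ["turn on the kitchen lights", "what is the garage temperature"],
    "cloud-reasoning": ["analyze my energy usage this month", "write a home assistant automation"],
    "cloud-search": ["latest news on solar tariffs", "current electricity rates in PA"],
}
intent_names = list(INTENTS)
intent_embeddings = [encoder.encode(examples, convert_to_tensor=True) for examples in INTENTS.values()]

def classify(query: str) -> str:
    """Return the model name whose example queries are most similar to this query."""
    q = encoder.encode(query, convert_to_tensor=True)
    scores = [util.cos_sim(q, emb).max().item() for emb in intent_embeddings]
    return intent_names[scores.index(max(scores))]

print(classify("any news about solar tariffs today?"))  # likely "cloud-search"
```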
---
## Cloud API Details
### Claude (Anthropic)
**Endpoint:** `https://api.anthropic.com/v1/messages`
**Get API key:** https://console.anthropic.com/
**Pricing (2025-2026):**
| Model | Input/1M tokens | Output/1M tokens | Best For |
|---|---|---|---|
| Claude Haiku 4.5 | $0.50 | $2.50 | Fast, cheap tasks |
| Claude Sonnet 4.5 | $3.00 | $15.00 | Best balance |
| Claude Opus 4.5 | $5.00 | $25.00 | Top quality |
**Cost optimization:**
- Prompt caching: 90% savings on repeated system prompts
- Use Haiku for simple tasks, Sonnet for complex ones
- Batch processing available for non-urgent tasks
**Features:**
- 200k context window
- Extended thinking mode
- Function calling (perfect for HA control)
- Vision support (could analyze charts, screenshots)
### Grok (xAI)
**Endpoint:** `https://api.x.ai/v1/chat/completions`
**Get API key:** https://console.x.ai/
**Format:** OpenAI SDK compatible
**Pricing:**
| Model | Input/1M tokens | Output/1M tokens | Best For |
|---|---|---|---|
| Grok 4.1 Fast | $0.20 | $1.00 | Budget queries |
| Grok 4 | $3.00 | $15.00 | Full capability |
**Free credits:** $25 for new users, plus $150/month for opting into the data-sharing program
**Features:**
- 2 million token context window (industry-leading)
- Real-time X (Twitter) integration
- Internet search capability
- OpenAI SDK compatibility
---
## Monthly Cost Estimates
### Conservative Use (80/15/5 Split, 1000 queries/month)
| Route | Queries | Model | Cost |
|---|---|---|---|
| Local (80%) | 800 | Ollama | $0 |
| Claude (15%) | 150 | Haiku 4.5 | ~$0.45 |
| Grok (5%) | 50 | Grok 4.1 Fast | ~$0.07 |
| **Total** | **1000** | | **~$0.52/month** |
### Heavy Use (60/25/15 Split, 3000 queries/month)
| Route | Queries | Model | Cost |
|---|---|---|---|
| Local (60%) | 1800 | Ollama | $0 |
| Claude (25%) | 750 | Sonnet 4.5 | ~$15 |
| Grok (15%) | 450 | Grok 4 | ~$9 |
| **Total** | **3000** | | **~$24/month** |
**Add electricity for LLM server:** ~$15-30/month (RTX 4090 build)
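For reference, the Claude line in the heavy-use table works out roughly as follows, assuming ~1,000 input and ~1,000 output tokens per query (an assumption, not a measurement):
```python
# Heavy-use Claude line: 750 queries at Sonnet 4.5 pricing ($3 in / $15 out per 1M tokens)
queries = 750
tokens_in, tokens_out = 1_000, 1_000           # assumed per-query token counts
cost = queries * (tokens_in * 3 + tokens_out * 15) / 1_000_000
print(f"~${cost:.2f}/month")                    # ~$13.50, shown as ~$15 in the table
```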
---
## Home Assistant Integration
### Connect HA to LiteLLM Proxy
**Option 1: Extended OpenAI Conversation (Recommended)**
Install via HACS, then configure:
- API Base URL: `http://<llm-server-ip>:4000/v1`
- API Key: (any string, LiteLLM doesn't validate for local)
- Model: `local-fast` (or any model name from your config)
This gives HA natural language control:
- "Turn off all lights downstairs" --> local LLM understands --> calls HA service
- "What's my battery charge level?" --> queries HA entities --> responds
**Option 2: Native Ollama Integration**
Settings > Integrations > Ollama:
- URL: `http://<llm-server-ip>:11434`
- Simpler but bypasses the routing layer
### Voice Assistant Pipeline
```
Wake word detected ("Hey Jarvis")
|
Whisper (speech-to-text, local)
|
Query text
|
Extended OpenAI Conversation
|
LiteLLM Proxy (routing)
|
Response text
|
Piper (text-to-speech, local)
|
Speaker output
```
---
## Sources
- https://docs.litellm.ai/
- https://github.com/open-webui/open-webui
- https://console.anthropic.com/
- https://docs.x.ai/developers/models
- https://github.com/jekalmin/extended_openai_conversation
- https://github.com/aurelio-labs/semantic-router