sync: Auto-sync from ACG-M-L5090 at 2026-01-22 19:22:24
Synced files:
- Grepai optimization documentation
- Ollama Assistant MCP server implementation
- Session logs and context updates

Machine: ACG-M-L5090
Timestamp: 2026-01-22 19:22:24
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
New file: mcp-servers/ollama-assistant/INSTALL.md (345 lines)

# Ollama MCP Server Installation Guide

Follow these steps to set up local AI assistance for Claude Code.

---

## Step 1: Install Ollama

**Option A: Using winget (Recommended)**
```powershell
winget install Ollama.Ollama
```

**Option B: Manual Download**
1. Go to https://ollama.ai/download
2. Download the Windows installer
3. Run the installer

**Verify Installation:**
```powershell
ollama --version
```

Expected output: `ollama version is X.Y.Z`
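
If `ollama` is not recognized, the installer's PATH change may not have reached the current shell; open a new terminal and try again. A minimal check of where the command resolves from (assuming a standard install):

```powershell
# Show the executable the shell resolves 'ollama' to
Get-Command ollama | Select-Object -ExpandProperty Source
```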

---

## Step 2: Start Ollama Server

**Start the server:**
```powershell
ollama serve
```

Leave this terminal open; Ollama needs to keep running in the background.

**Tip:** Ollama usually starts automatically after installation. Check the system tray for the Ollama icon.
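
To confirm the server is actually listening (whether you started it manually or it is running from the tray), you can query its local HTTP API. A minimal check against the default port:

```powershell
# Should return a JSON object listing locally installed models
Invoke-RestMethod http://localhost:11434/api/tags
```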

---

## Step 3: Pull a Model

**Open a NEW terminal** and pull a model:

**Recommended for most users:**
```powershell
ollama pull llama3.1:8b
```
Size: 4.7GB | Speed: Fast | Quality: Good

**Best for code:**
```powershell
ollama pull qwen2.5-coder:7b
```
Size: 4.7GB | Speed: Fast | Quality: Excellent for code

**Alternative options:**
```powershell
# Faster, smaller
ollama pull mistral:7b       # 4.1GB

# Better quality, larger
ollama pull llama3.1:70b     # 40GB (requires a good GPU)

# Code-focused
ollama pull codellama:13b    # 7.4GB
```

**Verify the model is available:**
```powershell
ollama list
```
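
To inspect a pulled model's details (parameter count, quantization, context length), recent Ollama versions can print its metadata with `ollama show`; for example:

```powershell
# Print model metadata such as parameters and context length
ollama show llama3.1:8b
```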

---

## Step 4: Test Ollama

```powershell
ollama run llama3.1:8b "Explain what MCP is in one sentence"
```

Expected: You should get a response from the model.

Press `Ctrl+D` or type `/bye` to exit the chat.
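
The MCP server talks to Ollama over its HTTP API (via httpx), so it is also worth confirming the API path works end to end. A minimal sketch using the standard `/api/generate` endpoint; the model name assumes you pulled `llama3.1:8b` above:

```powershell
# One-off, non-streaming generation request against the local Ollama API
$body = @{ model = "llama3.1:8b"; prompt = "Say hello in five words"; stream = $false } | ConvertTo-Json
Invoke-RestMethod -Uri http://localhost:11434/api/generate -Method Post -Body $body -ContentType "application/json"
```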

---

## Step 5: Setup MCP Server

**Run the setup script:**
```powershell
cd D:\ClaudeTools\mcp-servers\ollama-assistant
.\setup.ps1
```

This will:
- Create a Python virtual environment
- Install MCP dependencies (mcp, httpx)
- Check the Ollama installation
- Verify everything is configured

**Expected output:**
```
[OK] Python installed
[OK] Virtual environment created
[OK] Dependencies installed
[OK] Ollama installed
[OK] Ollama server is running
[OK] Found compatible models

Setup Complete!
```
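
If `setup.ps1` fails partway, roughly equivalent steps can be run by hand. This is a sketch based on the list above (a project-local virtual environment plus the mcp and httpx packages), not a substitute for the script's own checks:

```powershell
cd D:\ClaudeTools\mcp-servers\ollama-assistant

# Create a project-local virtual environment
python -m venv venv

# Install the MCP server dependencies into it
.\venv\Scripts\pip.exe install mcp httpx
```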

---

## Step 6: Configure Claude Code

The `.mcp.json` file has already been updated with the Ollama configuration.

**Verify configuration:**
```powershell
cat D:\ClaudeTools\.mcp.json
```

You should see an `ollama-assistant` entry.
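
If the file is long, a quick filter confirms the entry is present (a minimal check; it only looks for the name, not whether the command path inside it is valid):

```powershell
# Print any lines mentioning the ollama-assistant server entry
Select-String -Path D:\ClaudeTools\.mcp.json -Pattern "ollama-assistant"
```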

---

## Step 7: Restart Claude Code

**IMPORTANT:** You must completely restart Claude Code for MCP changes to take effect.

1. Close Claude Code completely
2. Reopen Claude Code
3. Navigate to the D:\ClaudeTools directory

---

## Step 8: Test Integration

Try these commands in Claude Code:

**Test 1: Check status**
```
Use the ollama_status tool to check if Ollama is running
```

**Test 2: Ask a question**
```
Use ask_ollama to ask: "What is the fastest sorting algorithm?"
```

**Test 3: Analyze code**
```
Use analyze_code_local to review this Python function for bugs:

def divide(a, b):
    return a / b
```

---

## Troubleshooting

### Ollama Not Running

**Error:** `Cannot connect to Ollama at http://localhost:11434`

**Fix:**
```powershell
# Start Ollama
ollama serve

# Or check if it's already running
netstat -ano | findstr :11434
```
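
If the port check shows nothing, you can also look for the Ollama process directly; a small check (process names can vary slightly between installer versions):

```powershell
# List any running processes whose name starts with 'ollama'
Get-Process ollama* -ErrorAction SilentlyContinue
```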

### Model Not Found

**Error:** `Model 'llama3.1:8b' not found`

**Fix:**
```powershell
# Pull the model
ollama pull llama3.1:8b

# Verify it's installed
ollama list
```

### Python Virtual Environment Issues

**Error:** `python: command not found`

**Fix:**
1. Install Python 3.8+ from python.org
2. Add Python to PATH
3. Rerun setup.ps1
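
On Windows, Python is sometimes installed with only the `py` launcher on PATH. A quick way to see what is actually available before rerunning the script (assuming the launcher was installed alongside Python):

```powershell
# Check the launcher and the default interpreter it points at
py --version
py -3 -c "import sys; print(sys.executable)"
```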

### MCP Server Not Loading

**Check Claude Code logs:**
```powershell
# Look for MCP-related errors
# Logs are typically in: %APPDATA%\Claude\logs\
```

**Verify Python path:**
```powershell
D:\ClaudeTools\mcp-servers\ollama-assistant\venv\Scripts\python.exe --version
```
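
You can also confirm the dependencies import cleanly inside the venv, and run the server script by hand to surface any startup errors. The `server.py` filename below is an assumption; use the actual entry-point file in the directory:

```powershell
# Verify the MCP dependencies are importable from the venv's interpreter
D:\ClaudeTools\mcp-servers\ollama-assistant\venv\Scripts\python.exe -c "import mcp, httpx; print('deps ok')"

# Run the server directly to see any startup errors (filename is an assumption)
D:\ClaudeTools\mcp-servers\ollama-assistant\venv\Scripts\python.exe D:\ClaudeTools\mcp-servers\ollama-assistant\server.py
```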

### Port 11434 Already in Use

**Error:** `Port 11434 is already in use`

**Fix:**
```powershell
# Find what's using the port
netstat -ano | findstr :11434

# Kill the process (replace <PID>)
taskkill /F /PID <PID>

# Restart Ollama
ollama serve
```

---

## Performance Tips

### GPU Acceleration

**Ollama automatically uses your GPU if available (NVIDIA/AMD).**

**Check GPU usage:**
```powershell
# NVIDIA
nvidia-smi

# AMD
# Check Task Manager > Performance > GPU
```
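
Recent Ollama versions can also report, per loaded model, how much is running on the GPU versus the CPU; run this while a model is answering a prompt:

```powershell
# Show loaded models and their CPU/GPU split
ollama ps
```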

### CPU Performance

If using CPU only:
- Smaller models (7b-8b) work better
- Expect 2-5 tokens/second
- Close other applications for better performance

### Faster Response Times

```powershell
# Use smaller models for speed
ollama pull mistral:7b

# Or quantized versions (smaller, faster); check the Ollama library for the exact tag
ollama pull llama3.1:8b-q4_0
```
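
Another common tweak is keeping the model loaded in memory between requests so each call skips the load time. A sketch using Ollama's `OLLAMA_KEEP_ALIVE` environment variable, set before starting the server (value format and default may differ across versions):

```powershell
# Keep loaded models in memory for 30 minutes of inactivity
$env:OLLAMA_KEEP_ALIVE = "30m"
ollama serve
```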

---

## Usage Examples

### Example 1: Private Code Review

```
I have some proprietary code I don't want to send to external APIs.
Can you use the local Ollama model to review it for security issues?

[Paste code]
```

Claude will use `analyze_code_local` to review it locally.

### Example 2: Large File Summary

```
Summarize this 50,000 line log file using the local model to avoid API costs.

[Paste content]
```

Claude will use `summarize_large_file` locally.

### Example 3: Offline Development

```
I'm offline - can you still help with this code?
```

Claude will delegate to the local Ollama model automatically.

---

## What Models to Use When

| Task | Best Model | Why |
|------|-----------|-----|
| Code review | qwen2.5-coder:7b | Trained specifically for code |
| Code generation | codellama:13b | Best code completion |
| General questions | llama3.1:8b | Balanced performance |
| Speed priority | mistral:7b | Fastest responses |
| Quality priority | llama3.1:70b | Best reasoning (needs GPU) |

---

## Uninstall

To remove the Ollama MCP server (a note on reclaiming model disk space follows these steps):

1. **Remove from `.mcp.json`:**
   Delete the `ollama-assistant` entry

2. **Delete files:**
   ```powershell
   Remove-Item -Recurse D:\ClaudeTools\mcp-servers\ollama-assistant
   ```

3. **Uninstall Ollama (optional):**
   ```powershell
   winget uninstall Ollama.Ollama
   ```

4. **Restart Claude Code**
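
Downloaded models are stored separately from the MCP server files, so deleting the server directory does not reclaim their disk space. To get the ~5-10GB of model data back, remove the models you pulled before uninstalling Ollama itself (a sketch; list them first if unsure):

```powershell
# See which models are installed, then remove the ones pulled for this setup
ollama list
ollama rm llama3.1:8b
```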

---

## Next Steps

Once installed:
1. Try asking me to use local Ollama for tasks
2. I'll automatically delegate when appropriate:
   - Privacy-sensitive code
   - Large files
   - Offline work
   - Cost optimization

The integration is transparent: you can work normally and I'll decide when to use local vs. cloud AI.

---

**Status:** Ready to install

**Estimated Setup Time:** 10-15 minutes (including model download)

**Disk Space Required:** ~5-10GB (for models)