sync: Auto-sync from ACG-M-L5090 at 2026-01-22 19:22:24
Synced files:
- Grepai optimization documentation
- Ollama Assistant MCP server implementation
- Session logs and context updates

Machine: ACG-M-L5090
Timestamp: 2026-01-22 19:22:24

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

mcp-servers/ollama-assistant/INSTALL.md (new file, 345 lines)

# Ollama MCP Server Installation Guide

Follow these steps to set up local AI assistance for Claude Code.

---

## Step 1: Install Ollama

**Option A: Using winget (Recommended)**
```powershell
winget install Ollama.Ollama
```

**Option B: Manual Download**
1. Go to https://ollama.ai/download
2. Download the Windows installer
3. Run the installer

**Verify Installation:**
```powershell
ollama --version
```

Expected output: `ollama version is X.Y.Z`

---

## Step 2: Start Ollama Server

**Start the server:**
```powershell
ollama serve
```

Leave this terminal open - Ollama needs to run in the background.

**Tip:** Ollama usually starts automatically after installation. Check the system tray for the Ollama icon.

---

## Step 3: Pull a Model

**Open a NEW terminal** and pull a model:

**Recommended for most users:**
```powershell
ollama pull llama3.1:8b
```
Size: 4.7GB | Speed: Fast | Quality: Good

**Best for code:**
```powershell
ollama pull qwen2.5-coder:7b
```
Size: 4.7GB | Speed: Fast | Quality: Excellent for code

**Alternative options:**
```powershell
# Faster, smaller
ollama pull mistral:7b        # 4.1GB

# Better quality, larger
ollama pull llama3.1:70b      # 40GB (requires good GPU)

# Code-focused
ollama pull codellama:13b     # 7.4GB
```

**Verify model is available:**
```powershell
ollama list
```
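
If you prefer to check from code, the same model list is available from Ollama's HTTP API at `/api/tags`, the endpoint the MCP server's `ollama_status` tool queries. A minimal sketch, assuming Ollama is on its default port and `httpx` is installed:

```python
# Sketch: list locally installed models via Ollama's HTTP API.
# Assumes Ollama is running on the default port (11434) and httpx is installed.
import httpx

resp = httpx.get("http://localhost:11434/api/tags", timeout=10.0)
resp.raise_for_status()
for model in resp.json().get("models", []):
    size_gb = model.get("size", 0) / (1024 ** 3)
    print(f"{model['name']} ({size_gb:.1f} GB)")
```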

---

## Step 4: Test Ollama

```powershell
ollama run llama3.1:8b "Explain what MCP is in one sentence"
```

Expected: You should get a response from the model.

Press `Ctrl+D` or type `/bye` to exit the chat.
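
The CLI test above exercises the same HTTP endpoint the MCP server will use. If you want to hit that API directly, a minimal sketch (assuming `httpx` is installed and `llama3.1:8b` has been pulled) mirrors what the `ask_ollama` tool does internally:

```python
# Sketch: call Ollama's /api/generate endpoint directly,
# mirroring what the MCP server's ask_ollama tool does under the hood.
import httpx

payload = {
    "model": "llama3.1:8b",
    "prompt": "Explain what MCP is in one sentence",
    "system": "You are a helpful coding assistant.",
    "stream": False,
}
resp = httpx.post("http://localhost:11434/api/generate", json=payload, timeout=120.0)
resp.raise_for_status()
print(resp.json()["response"])
```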

---

## Step 5: Setup MCP Server

**Run the setup script:**
```powershell
cd D:\ClaudeTools\mcp-servers\ollama-assistant
.\setup.ps1
```

This will:
- Create Python virtual environment
- Install MCP dependencies (mcp, httpx)
- Check Ollama installation
- Verify everything is configured

**Expected output:**
```
[OK] Python installed
[OK] Virtual environment created
[OK] Dependencies installed
[OK] Ollama installed
[OK] Ollama server is running
[OK] Found compatible models
Setup Complete!
```

---

## Step 6: Configure Claude Code

The `.mcp.json` file has already been updated with the Ollama configuration.

**Verify configuration:**
```powershell
cat D:\ClaudeTools\.mcp.json
```

You should see an `ollama-assistant` entry.
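
If you'd rather verify this from code than by eye, a small sketch like the following (assuming the `D:\ClaudeTools\.mcp.json` path used throughout this guide) checks for the entry:

```python
# Sketch: confirm that .mcp.json contains an ollama-assistant server entry.
# The path below matches the one used in this guide; adjust if yours differs.
import json
from pathlib import Path

config = json.loads(Path(r"D:\ClaudeTools\.mcp.json").read_text(encoding="utf-8"))
servers = config.get("mcpServers", {})
if "ollama-assistant" in servers:
    print("ollama-assistant entry found:", servers["ollama-assistant"].get("command"))
else:
    print("ollama-assistant entry is missing - rerun setup or edit .mcp.json")
```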

---

## Step 7: Restart Claude Code

**IMPORTANT:** You must completely restart Claude Code for MCP changes to take effect.

1. Close Claude Code completely
2. Reopen Claude Code
3. Navigate to the D:\ClaudeTools directory

---

## Step 8: Test Integration

Try these commands in Claude Code:

**Test 1: Check status**
```
Use the ollama_status tool to check if Ollama is running
```

**Test 2: Ask a question**
```
Use ask_ollama to ask: "What is the fastest sorting algorithm?"
```

**Test 3: Analyze code**
```
Use analyze_code_local to review this Python function for bugs:

def divide(a, b):
    return a / b
```

---

## Troubleshooting

### Ollama Not Running

**Error:** `Cannot connect to Ollama at http://localhost:11434`

**Fix:**
```powershell
# Start Ollama
ollama serve

# Or check if it's already running
netstat -ano | findstr :11434
```

### Model Not Found

**Error:** `Model 'llama3.1:8b' not found`

**Fix:**
```powershell
# Pull the model
ollama pull llama3.1:8b

# Verify it's installed
ollama list
```

### Python Virtual Environment Issues

**Error:** `python: command not found`

**Fix:**
1. Install Python 3.8+ from python.org
2. Add Python to PATH
3. Rerun setup.ps1

### MCP Server Not Loading

**Check Claude Code logs:**
```powershell
# Look for MCP-related errors
# Logs are typically in: %APPDATA%\Claude\logs\
```

**Verify Python path:**
```powershell
D:\ClaudeTools\mcp-servers\ollama-assistant\venv\Scripts\python.exe --version
```

### Port 11434 Already in Use

**Error:** `Port 11434 is already in use`

**Fix:**
```powershell
# Find what's using the port
netstat -ano | findstr :11434

# Kill the process (replace PID)
taskkill /F /PID <PID>

# Restart Ollama
ollama serve
```
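
If the `netstat` output is hard to read, a quick Python check (a generic sketch, nothing Ollama-specific) tells you whether anything is listening on port 11434:

```python
# Sketch: check whether something is already listening on Ollama's default port.
import socket

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
    sock.settimeout(1.0)
    in_use = sock.connect_ex(("localhost", 11434)) == 0

print("Port 11434 is in use" if in_use else "Port 11434 is free")
```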

---

## Performance Tips

### GPU Acceleration

**Ollama automatically uses your GPU if available (NVIDIA/AMD).**

**Check GPU usage:**
```powershell
# NVIDIA
nvidia-smi

# AMD
# Check Task Manager > Performance > GPU
```

### CPU Performance

If using CPU only:
- Smaller models (7b-8b) work better
- Expect 2-5 tokens/second
- Close other applications for better performance

### Faster Response Times

```powershell
# Use smaller models for speed
ollama pull mistral:7b

# Or quantized versions (smaller, faster)
ollama pull llama3.1:8b-q4_0
```

---

## Usage Examples

### Example 1: Private Code Review

```
I have some proprietary code I don't want to send to external APIs.
Can you use the local Ollama model to review it for security issues?

[Paste code]
```

Claude will use `analyze_code_local` to review it locally.
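
Under the hood, `analyze_code_local` builds a language- and focus-specific prompt and prefers a code model, falling back to the default model if it isn't installed. A simplified sketch of that flow (the real implementation is in `server.py`; the standalone `analyze` helper here is only for illustration):

```python
# Simplified sketch of the analyze_code_local flow from server.py:
# build a focused prompt, prefer a code model, fall back to the default.
import httpx

OLLAMA_HOST = "http://localhost:11434"

def analyze(code: str, language: str, analysis_type: str = "security") -> str:
    system = f"You are a {language} code analyzer. Focus on {analysis_type} analysis."
    prompt = f"Analyze this {language} code:\n\n```{language}\n{code}\n```"
    for model in ("qwen2.5-coder:7b", "codellama:13b", "llama3.1:8b"):
        try:
            resp = httpx.post(
                f"{OLLAMA_HOST}/api/generate",
                json={"model": model, "prompt": prompt, "system": system, "stream": False},
                timeout=120.0,
            )
            resp.raise_for_status()
            return resp.json()["response"]
        except httpx.HTTPError:
            continue  # model missing or request failed; try the next one
    raise RuntimeError("No usable Ollama model found")

print(analyze("def divide(a, b):\n    return a / b", "python"))
```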

### Example 2: Large File Summary

```
Summarize this 50,000 line log file using the local model to avoid API costs.

[Paste content]
```

Claude will use `summarize_large_file` locally.
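
For reference, `summarize_large_file` keeps the local request bounded by truncating the content to the first 50,000 characters and steering the summary style through the system prompt. A condensed sketch (the `build_summary_request` helper is illustrative, not part of the server):

```python
# Condensed sketch of summarize_large_file from server.py:
# truncate the input and steer the summary style via the system prompt.
LENGTH_INSTRUCTIONS = {
    "brief": "Create a concise 2-3 sentence summary.",
    "detailed": "Create a comprehensive paragraph summary covering main points.",
    "technical": "Create a technical summary highlighting key functions, classes, and architecture.",
}

def build_summary_request(content: str, summary_length: str = "brief") -> dict:
    return {
        "model": "llama3.1:8b",
        "system": f"You are a file summarizer. {LENGTH_INSTRUCTIONS[summary_length]}",
        "prompt": f"Summarize this content:\n\n{content[:50000]}",  # first 50k chars only
        "stream": False,
    }

print(build_summary_request("line 1\nline 2\n" * 10000)["prompt"][:80])
```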

### Example 3: Offline Development

```
I'm offline - can you still help with this code?
```

Claude will delegate to the local Ollama model automatically.

---

## What Models to Use When

| Task | Best Model | Why |
|------|-----------|-----|
| Code review | qwen2.5-coder:7b | Trained specifically for code |
| Code generation | codellama:13b | Best code completion |
| General questions | llama3.1:8b | Balanced performance |
| Speed priority | mistral:7b | Fastest responses |
| Quality priority | llama3.1:70b | Best reasoning (needs GPU) |

---

## Uninstall

To remove the Ollama MCP server:

1. **Remove from `.mcp.json`:**
   Delete the `ollama-assistant` entry

2. **Delete files:**
   ```powershell
   Remove-Item -Recurse D:\ClaudeTools\mcp-servers\ollama-assistant
   ```

3. **Uninstall Ollama (optional):**
   ```powershell
   winget uninstall Ollama.Ollama
   ```

4. **Restart Claude Code**

---

## Next Steps

Once installed:
1. Try asking me to use local Ollama for tasks
2. I'll automatically delegate when appropriate:
   - Privacy-sensitive code
   - Large files
   - Offline work
   - Cost optimization

The integration is transparent - you can work normally and I'll decide when to use local vs. cloud AI.

---

**Status:** Ready to install
**Estimated Setup Time:** 10-15 minutes (including model download)
**Disk Space Required:** ~5-10GB (for models)

mcp-servers/ollama-assistant/README.md (new file, 413 lines)

# Ollama MCP Server - Local AI Assistant

**Purpose:** Integrate Ollama local models with Claude Code via MCP, allowing Claude to delegate tasks to a local model that has computer access.

## Use Cases

- **Code Analysis:** Delegate code review to local model for privacy-sensitive code
- **Data Processing:** Process large local datasets without API costs
- **Offline Work:** Continue working when internet/API is unavailable
- **Cost Optimization:** Use local model for simple tasks, Claude for complex reasoning

---

## Architecture

```
┌─────────────────┐
│   Claude Code   │  (Coordinator)
└────────┬────────┘
         │
         │ MCP Protocol
         ↓
┌─────────────────────────────┐
│  Ollama MCP Server          │
│  - Exposes tools:           │
│    • ask_ollama()           │
│    • analyze_code()         │
│    • process_data()         │
└────────┬────────────────────┘
         │
         │ HTTP API
         ↓
┌─────────────────────────────┐
│  Ollama                     │
│  - Model: llama3.1:8b       │
│  - Local execution          │
└─────────────────────────────┘
```

---

## Installation

### 1. Install Ollama

**Windows:**
```powershell
# Download from https://ollama.ai/download
# Or use winget
winget install Ollama.Ollama
```

**Verify Installation:**
```bash
ollama --version
```

### 2. Pull a Model

```bash
# Recommended models:
ollama pull llama3.1:8b        # Best balance (4.7GB)
ollama pull codellama:13b      # Code-focused (7.4GB)
ollama pull mistral:7b         # Fast, good reasoning (4.1GB)
ollama pull qwen2.5-coder:7b   # Excellent for code (4.7GB)
```

### 3. Test Ollama

```bash
ollama run llama3.1:8b "What is MCP?"
```

### 4. Create MCP Server

**File:** `mcp-servers/ollama-assistant/server.py`

```python
#!/usr/bin/env python3
"""
Ollama MCP Server
Provides local AI assistance to Claude Code via MCP protocol
"""

import asyncio
import json
from typing import Any

import httpx
from mcp.server import Server
from mcp.types import Tool, TextContent

# Configuration
OLLAMA_HOST = "http://localhost:11434"
DEFAULT_MODEL = "llama3.1:8b"

# Create MCP server
app = Server("ollama-assistant")


@app.list_tools()
async def list_tools() -> list[Tool]:
    """List available Ollama tools"""
    return [
        Tool(
            name="ask_ollama",
            description="Ask the local Ollama model a question. Use for simple queries, code review, or when you want a second opinion. The model has no context of the conversation.",
            inputSchema={
                "type": "object",
                "properties": {
                    "prompt": {
                        "type": "string",
                        "description": "The question or task for Ollama"
                    },
                    "model": {
                        "type": "string",
                        "description": "Model to use (default: llama3.1:8b)",
                        "default": DEFAULT_MODEL
                    },
                    "system": {
                        "type": "string",
                        "description": "System prompt to set context/role",
                        "default": "You are a helpful coding assistant."
                    }
                },
                "required": ["prompt"]
            }
        ),
        Tool(
            name="analyze_code_local",
            description="Analyze code using local Ollama model. Good for privacy-sensitive code or large codebases. Returns analysis without sending code to external APIs.",
            inputSchema={
                "type": "object",
                "properties": {
                    "code": {
                        "type": "string",
                        "description": "Code to analyze"
                    },
                    "language": {
                        "type": "string",
                        "description": "Programming language"
                    },
                    "analysis_type": {
                        "type": "string",
                        "enum": ["security", "performance", "quality", "bugs", "general"],
                        "description": "Type of analysis to perform"
                    }
                },
                "required": ["code", "language"]
            }
        ),
        Tool(
            name="summarize_large_file",
            description="Summarize large files using local model. No size limits or API costs.",
            inputSchema={
                "type": "object",
                "properties": {
                    "content": {
                        "type": "string",
                        "description": "File content to summarize"
                    },
                    "summary_length": {
                        "type": "string",
                        "enum": ["brief", "detailed", "technical"],
                        "default": "brief"
                    }
                },
                "required": ["content"]
            }
        )
    ]


@app.call_tool()
async def call_tool(name: str, arguments: Any) -> list[TextContent]:
    """Execute Ollama tool"""

    if name == "ask_ollama":
        prompt = arguments["prompt"]
        model = arguments.get("model", DEFAULT_MODEL)
        system = arguments.get("system", "You are a helpful coding assistant.")

        response = await query_ollama(prompt, model, system)
        return [TextContent(type="text", text=response)]

    elif name == "analyze_code_local":
        code = arguments["code"]
        language = arguments["language"]
        analysis_type = arguments.get("analysis_type", "general")

        system = f"You are a {language} code analyzer. Focus on {analysis_type} analysis."
        prompt = f"Analyze this {language} code:\n\n```{language}\n{code}\n```\n\nProvide a {analysis_type} analysis."

        response = await query_ollama(prompt, "codellama:13b", system)
        return [TextContent(type="text", text=response)]

    elif name == "summarize_large_file":
        content = arguments["content"]
        summary_length = arguments.get("summary_length", "brief")

        system = f"You are a file summarizer. Create {summary_length} summaries."
        prompt = f"Summarize this file content:\n\n{content}"

        response = await query_ollama(prompt, DEFAULT_MODEL, system)
        return [TextContent(type="text", text=response)]

    else:
        raise ValueError(f"Unknown tool: {name}")


async def query_ollama(prompt: str, model: str, system: str) -> str:
    """Query Ollama API"""
    async with httpx.AsyncClient(timeout=120.0) as client:
        response = await client.post(
            f"{OLLAMA_HOST}/api/generate",
            json={
                "model": model,
                "prompt": prompt,
                "system": system,
                "stream": False
            }
        )
        response.raise_for_status()
        result = response.json()
        return result["response"]


async def main():
    """Run MCP server"""
    from mcp.server.stdio import stdio_server

    async with stdio_server() as (read_stream, write_stream):
        await app.run(
            read_stream,
            write_stream,
            app.create_initialization_options()
        )


if __name__ == "__main__":
    asyncio.run(main())
```

### 5. Install MCP Server Dependencies

```bash
cd D:\ClaudeTools\mcp-servers\ollama-assistant
python -m venv venv
venv\Scripts\activate
pip install mcp httpx
```

### 6. Configure in Claude Code

**Edit:** `.mcp.json` (in D:\ClaudeTools)

```json
{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "your-token-here"
      }
    },
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "D:\\ClaudeTools"]
    },
    "sequential-thinking": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-sequential-thinking"]
    },
    "ollama-assistant": {
      "command": "D:\\ClaudeTools\\mcp-servers\\ollama-assistant\\venv\\Scripts\\python.exe",
      "args": [
        "D:\\ClaudeTools\\mcp-servers\\ollama-assistant\\server.py"
      ]
    }
  }
}
```

---

## Usage Examples

### Example 1: Ask Ollama for a Second Opinion

```
User: "Review this authentication code for security issues"

Claude: Let me delegate this to the local Ollama model for a privacy-focused review.

[Uses ask_ollama tool]
Ollama: "Found potential issues: 1. Password not hashed... 2. No rate limiting..."

Claude: Based on the local analysis, here are the security concerns...
```

### Example 2: Analyze Large Codebase Locally

```
User: "Analyze this 10,000 line file for performance issues"

Claude: This is large - I'll use the local model to avoid API costs.

[Uses summarize_large_file tool]
Ollama: "Main performance bottlenecks: 1. N+1 queries... 2. Missing indexes..."

Claude: Here's the performance analysis from the local model...
```

### Example 3: Offline Development

```
User: "Help me debug this code" (while offline)

Claude: API unavailable, using local Ollama model...

[Uses analyze_code_local tool]
Ollama: "Bug found on line 42: null reference..."

Claude: The local model identified the issue...
```

---

## Option 2: Standalone Ollama with MCP Tools

Run Ollama as a separate agent with its own MCP server access.

**Architecture:**
```
┌─────────────────┐     ┌─────────────────────┐
│   Claude Code   │     │   Ollama + MCP      │
│  (Main Agent)   │────▶│   (Helper Agent)    │
└─────────────────┘     └──────────┬──────────┘
                                   │
                                   │ MCP Protocol
                                   ↓
                        ┌──────────────────────┐
                        │   MCP Servers        │
                        │   - Filesystem       │
                        │   - Bash             │
                        │   - Custom tools     │
                        └──────────────────────┘
```

**Tool:** Use `ollama-mcp` or a similar wrapper that gives Ollama access to MCP servers.

---

## Option 3: Hybrid Task Distribution

Use Claude as the coordinator and Ollama for execution.

**When to use Ollama:**
- Privacy-sensitive code review
- Large file processing (no token limits)
- Offline work
- Cost optimization (simple tasks)
- Repetitive analysis

**When to use Claude:**
- Complex reasoning
- Multi-step planning
- API integrations
- Final decision-making
- User communication
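
As a rough illustration of this split, here is a hypothetical routing helper; `choose_backend` and its thresholds are illustrative only and not part of the MCP server:

```python
# Hypothetical routing helper illustrating the hybrid policy above.
# The heuristics and thresholds are illustrative, not prescriptive.
def choose_backend(task: str, *, sensitive: bool = False, offline: bool = False,
                   content_chars: int = 0, needs_planning: bool = False) -> str:
    if offline or sensitive:
        return "ollama"          # privacy-sensitive or no connectivity: stay local
    if content_chars > 100_000:
        return "ollama"          # very large inputs: avoid API token costs
    if needs_planning:
        return "claude"          # multi-step reasoning and coordination
    return "claude" if task == "user_communication" else "ollama"

print(choose_backend("code_review", sensitive=True))          # -> ollama
print(choose_backend("refactor_plan", needs_planning=True))   # -> claude
```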

---

## Recommended Models for Different Tasks

| Task Type | Recommended Model | Size | Reason |
|-----------|------------------|------|--------|
| Code Review | qwen2.5-coder:7b | 4.7GB | Best code understanding |
| Code Generation | codellama:13b | 7.4GB | Trained on code |
| General Queries | llama3.1:8b | 4.7GB | Balanced performance |
| Fast Responses | mistral:7b | 4.1GB | Speed optimized |
| Large Context | llama3.1:70b | 40GB | 128k context (needs GPU) |

---

## Performance Considerations

**CPU Only:**
- llama3.1:8b: ~2-5 tokens/sec
- Usable for short queries

**GPU (NVIDIA):**
- llama3.1:8b: ~30-100 tokens/sec
- codellama:13b: ~20-50 tokens/sec
- Much faster, recommended

**Enable GPU in Ollama:**
```bash
# Ollama auto-detects GPU
# Verify: check Ollama logs for "CUDA" or "Metal"
```
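
The throughput figures above can be measured rather than guessed: Ollama's non-streaming `/api/generate` response includes `eval_count` and `eval_duration` (nanoseconds) metadata, so tokens/sec follows directly. A small sketch, assuming those fields are present in your Ollama version:

```python
# Sketch: measure generation throughput from Ollama's response metadata.
# Assumes the response includes eval_count / eval_duration (nanoseconds);
# adjust if your Ollama version reports different fields.
import httpx

resp = httpx.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3.1:8b", "prompt": "Write a haiku about GPUs", "stream": False},
    timeout=120.0,
)
resp.raise_for_status()
data = resp.json()
tokens_per_sec = data["eval_count"] / (data["eval_duration"] / 1e9)
print(f"{tokens_per_sec:.1f} tokens/sec")
```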

---

## Next Steps

1. Install Ollama
2. Pull a model (llama3.1:8b recommended)
3. Create MCP server (use code above)
4. Configure `.mcp.json`
5. Restart Claude Code
6. Test: "Use the local Ollama model to analyze this code"

---

**Status:** Design phase - ready to implement
**Created:** 2026-01-22

mcp-servers/ollama-assistant/requirements.txt (new file, 7 lines)

# Ollama MCP Server Dependencies

# MCP SDK
mcp>=0.1.0

# HTTP client for Ollama API
httpx>=0.25.0

mcp-servers/ollama-assistant/server.py (new file, 238 lines)

#!/usr/bin/env python3
"""
Ollama MCP Server
Provides local AI assistance to Claude Code via MCP protocol
"""

import asyncio
import json
import sys
from typing import Any

import httpx

# MCP imports
try:
    from mcp.server import Server
    from mcp.types import Tool, TextContent
except ImportError:
    print("[ERROR] MCP package not installed. Run: pip install mcp", file=sys.stderr)
    sys.exit(1)

# Configuration
OLLAMA_HOST = "http://localhost:11434"
DEFAULT_MODEL = "llama3.1:8b"

# Create MCP server
app = Server("ollama-assistant")


@app.list_tools()
async def list_tools() -> list[Tool]:
    """List available Ollama tools"""
    return [
        Tool(
            name="ask_ollama",
            description="Ask the local Ollama model a question. Use for simple queries, code review, or when you want a second opinion. The model has no context of the conversation.",
            inputSchema={
                "type": "object",
                "properties": {
                    "prompt": {
                        "type": "string",
                        "description": "The question or task for Ollama"
                    },
                    "model": {
                        "type": "string",
                        "description": "Model to use (default: llama3.1:8b)",
                        "default": DEFAULT_MODEL
                    },
                    "system": {
                        "type": "string",
                        "description": "System prompt to set context/role",
                        "default": "You are a helpful coding assistant."
                    }
                },
                "required": ["prompt"]
            }
        ),
        Tool(
            name="analyze_code_local",
            description="Analyze code using local Ollama model. Good for privacy-sensitive code or large codebases. Returns analysis without sending code to external APIs.",
            inputSchema={
                "type": "object",
                "properties": {
                    "code": {
                        "type": "string",
                        "description": "Code to analyze"
                    },
                    "language": {
                        "type": "string",
                        "description": "Programming language"
                    },
                    "analysis_type": {
                        "type": "string",
                        "enum": ["security", "performance", "quality", "bugs", "general"],
                        "description": "Type of analysis to perform",
                        "default": "general"
                    }
                },
                "required": ["code", "language"]
            }
        ),
        Tool(
            name="summarize_large_file",
            description="Summarize large files using local model. No size limits or API costs.",
            inputSchema={
                "type": "object",
                "properties": {
                    "content": {
                        "type": "string",
                        "description": "File content to summarize"
                    },
                    "summary_length": {
                        "type": "string",
                        "enum": ["brief", "detailed", "technical"],
                        "default": "brief"
                    }
                },
                "required": ["content"]
            }
        ),
        Tool(
            name="ollama_status",
            description="Check Ollama server status and list available models",
            inputSchema={
                "type": "object",
                "properties": {}
            }
        )
    ]


@app.call_tool()
async def call_tool(name: str, arguments: Any) -> list[TextContent]:
    """Execute Ollama tool"""

    if name == "ask_ollama":
        prompt = arguments["prompt"]
        model = arguments.get("model", DEFAULT_MODEL)
        system = arguments.get("system", "You are a helpful coding assistant.")

        try:
            response = await query_ollama(prompt, model, system)
            return [TextContent(type="text", text=response)]
        except Exception as e:
            return [TextContent(type="text", text=f"[ERROR] Ollama query failed: {str(e)}")]

    elif name == "analyze_code_local":
        code = arguments["code"]
        language = arguments["language"]
        analysis_type = arguments.get("analysis_type", "general")

        system = f"You are a {language} code analyzer. Focus on {analysis_type} analysis. Be concise and specific."
        prompt = f"Analyze this {language} code for {analysis_type} issues:\n\n```{language}\n{code}\n```\n\nProvide specific findings with line references where possible."

        # Try to use a code-specific model if available, fall back to the default
        try:
            response = await query_ollama(prompt, "qwen2.5-coder:7b", system)
        except Exception:
            try:
                response = await query_ollama(prompt, "codellama:13b", system)
            except Exception:
                response = await query_ollama(prompt, DEFAULT_MODEL, system)

        return [TextContent(type="text", text=response)]

    elif name == "summarize_large_file":
        content = arguments["content"]
        summary_length = arguments.get("summary_length", "brief")

        length_instructions = {
            "brief": "Create a concise 2-3 sentence summary.",
            "detailed": "Create a comprehensive paragraph summary covering main points.",
            "technical": "Create a technical summary highlighting key functions, classes, and architecture."
        }

        system = f"You are a file summarizer. {length_instructions[summary_length]}"
        prompt = f"Summarize this content:\n\n{content[:50000]}"  # Limit to first 50k chars

        response = await query_ollama(prompt, DEFAULT_MODEL, system)
        return [TextContent(type="text", text=response)]

    elif name == "ollama_status":
        try:
            status = await check_ollama_status()
            return [TextContent(type="text", text=status)]
        except Exception as e:
            return [TextContent(type="text", text=f"[ERROR] Failed to check Ollama status: {str(e)}")]

    else:
        raise ValueError(f"Unknown tool: {name}")


async def query_ollama(prompt: str, model: str, system: str) -> str:
    """Query Ollama API"""
    async with httpx.AsyncClient(timeout=120.0) as client:
        try:
            response = await client.post(
                f"{OLLAMA_HOST}/api/generate",
                json={
                    "model": model,
                    "prompt": prompt,
                    "system": system,
                    "stream": False,
                    "options": {
                        "temperature": 0.7,
                        "top_p": 0.9
                    }
                }
            )
            response.raise_for_status()
            result = response.json()
            return result["response"]
        except httpx.ConnectError:
            raise Exception(f"Cannot connect to Ollama at {OLLAMA_HOST}. Is Ollama running? Try: ollama serve")
        except httpx.HTTPStatusError as e:
            if e.response.status_code == 404:
                raise Exception(f"Model '{model}' not found. Pull it with: ollama pull {model}")
            raise Exception(f"Ollama API error: {e.response.status_code} - {e.response.text}")


async def check_ollama_status() -> str:
    """Check Ollama server status and list models"""
    async with httpx.AsyncClient(timeout=10.0) as client:
        try:
            # Check server
            await client.get(f"{OLLAMA_HOST}/")

            # List models
            response = await client.get(f"{OLLAMA_HOST}/api/tags")
            response.raise_for_status()
            models = response.json().get("models", [])

            if not models:
                return "[WARNING] Ollama is running but no models are installed. Pull a model with: ollama pull llama3.1:8b"

            status = "[OK] Ollama is running\n\nAvailable models:\n"
            for model in models:
                name = model["name"]
                size = model.get("size", 0) / (1024**3)  # Convert to GB
                status += f"  - {name} ({size:.1f} GB)\n"

            return status

        except httpx.ConnectError:
            return "[ERROR] Ollama is not running. Start it with: ollama serve\nOr install from: https://ollama.ai/download"


async def main():
    """Run MCP server"""
    try:
        from mcp.server.stdio import stdio_server

        async with stdio_server() as (read_stream, write_stream):
            await app.run(
                read_stream,
                write_stream,
                app.create_initialization_options()
            )
    except Exception as e:
        print(f"[ERROR] MCP server failed: {e}", file=sys.stderr)
        sys.exit(1)


if __name__ == "__main__":
    asyncio.run(main())

mcp-servers/ollama-assistant/setup.ps1 (new file, 84 lines)

# Setup Ollama MCP Server
# Run this script to install dependencies

$ErrorActionPreference = "Stop"

Write-Host ("=" * 80) -ForegroundColor Cyan
Write-Host "Ollama MCP Server Setup" -ForegroundColor Cyan
Write-Host ("=" * 80) -ForegroundColor Cyan
Write-Host ""

# Check if Python is available
Write-Host "[INFO] Checking Python..." -ForegroundColor Cyan
try {
    $pythonVersion = python --version 2>&1
    Write-Host "[OK] $pythonVersion" -ForegroundColor Green
}
catch {
    Write-Host "[ERROR] Python not found. Install Python 3.8+ from python.org" -ForegroundColor Red
    exit 1
}

# Create virtual environment
Write-Host "[INFO] Creating virtual environment..." -ForegroundColor Cyan
if (Test-Path "venv") {
    Write-Host "[SKIP] Virtual environment already exists" -ForegroundColor Yellow
}
else {
    python -m venv venv
    Write-Host "[OK] Virtual environment created" -ForegroundColor Green
}

# Activate and install dependencies
Write-Host "[INFO] Installing dependencies..." -ForegroundColor Cyan
& "venv\Scripts\activate.ps1"
python -m pip install --upgrade pip -q
pip install -r requirements.txt

Write-Host "[OK] Dependencies installed" -ForegroundColor Green
Write-Host ""

# Check Ollama installation
Write-Host "[INFO] Checking Ollama installation..." -ForegroundColor Cyan
try {
    $ollamaVersion = ollama --version 2>&1
    Write-Host "[OK] Ollama installed: $ollamaVersion" -ForegroundColor Green

    # Check if Ollama is running
    try {
        $response = Invoke-WebRequest -Uri "http://localhost:11434" -Method GET -TimeoutSec 2 -ErrorAction Stop
        Write-Host "[OK] Ollama server is running" -ForegroundColor Green
    }
    catch {
        Write-Host "[WARNING] Ollama is installed but not running" -ForegroundColor Yellow
        Write-Host "[INFO] Start Ollama with: ollama serve" -ForegroundColor Cyan
    }

    # Check for models
    Write-Host "[INFO] Checking for installed models..." -ForegroundColor Cyan
    $models = ollama list 2>&1
    if ($models -match "llama3.1:8b|qwen2.5-coder|codellama") {
        Write-Host "[OK] Found compatible models" -ForegroundColor Green
    }
    else {
        Write-Host "[WARNING] No recommended models found" -ForegroundColor Yellow
        Write-Host "[INFO] Pull a model with: ollama pull llama3.1:8b" -ForegroundColor Cyan
    }
}
catch {
    Write-Host "[WARNING] Ollama not installed" -ForegroundColor Yellow
    Write-Host "[INFO] Install from: https://ollama.ai/download" -ForegroundColor Cyan
    Write-Host "[INFO] Or run: winget install Ollama.Ollama" -ForegroundColor Cyan
}

Write-Host ""
Write-Host ("=" * 80) -ForegroundColor Cyan
Write-Host "Setup Complete!" -ForegroundColor Green
Write-Host ("=" * 80) -ForegroundColor Cyan
Write-Host ""
Write-Host "Next steps:" -ForegroundColor Cyan
Write-Host "1. Install Ollama if not already installed: winget install Ollama.Ollama"
Write-Host "2. Pull a model: ollama pull llama3.1:8b"
Write-Host "3. Start Ollama: ollama serve"
Write-Host "4. Add to .mcp.json and restart Claude Code"
Write-Host ""