# Ollama MCP Server Installation Guide

Follow these steps to set up local AI assistance for Claude Code.

---

## Step 1: Install Ollama

**Option A: Using winget (Recommended)**

```powershell
winget install Ollama.Ollama
```

**Option B: Manual Download**

1. Go to https://ollama.ai/download
2. Download the Windows installer
3. Run the installer

**Verify Installation:**

```powershell
ollama --version
```

Expected output: `ollama version is X.Y.Z`

---

## Step 2: Start Ollama Server

**Start the server:**

```powershell
ollama serve
```

Leave this terminal open; Ollama needs to run in the background.

**Tip:** Ollama usually starts automatically after installation. Check the system tray for the Ollama icon.
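
If you're not sure whether the server is already up, you can query Ollama's local HTTP API (it listens on port 11434 by default); a successful call returns a JSON list of your installed models:

```powershell
# Returns {"models": [...]} when the Ollama server is reachable
Invoke-RestMethod http://localhost:11434/api/tags
```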

---

## Step 3: Pull a Model

**Open a NEW terminal** and pull a model:

**Recommended for most users:**

```powershell
ollama pull llama3.1:8b
```

Size: 4.7GB | Speed: Fast | Quality: Good

**Best for code:**

```powershell
ollama pull qwen2.5-coder:7b
```

Size: 4.7GB | Speed: Fast | Quality: Excellent for code

**Alternative options:**

```powershell
# Faster, smaller
ollama pull mistral:7b        # 4.1GB

# Better quality, larger
ollama pull llama3.1:70b      # 40GB (requires good GPU)

# Code-focused
ollama pull codellama:13b     # 7.4GB
```

**Verify model is available:**

```powershell
ollama list
```

---

## Step 4: Test Ollama

```powershell
ollama run llama3.1:8b "Explain what MCP is in one sentence"
```

Expected: You should get a response from the model.

Press `Ctrl+D` or type `/bye` to exit the chat.
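
You can also run the same test non-interactively against Ollama's REST API, which is presumably the same HTTP interface the MCP server's httpx client talks to:

```powershell
# One-shot generation via the local REST API; stream is disabled so the full
# answer comes back as a single JSON object with a "response" field
$body = @{
    model  = "llama3.1:8b"
    prompt = "Explain what MCP is in one sentence"
    stream = $false
} | ConvertTo-Json

Invoke-RestMethod -Method Post -Uri http://localhost:11434/api/generate -Body $body -ContentType "application/json"
```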

---

## Step 5: Setup MCP Server

**Run the setup script:**

```powershell
cd D:\ClaudeTools\mcp-servers\ollama-assistant
.\setup.ps1
```

This will:

- Create Python virtual environment
- Install MCP dependencies (mcp, httpx)
- Check Ollama installation
- Verify everything is configured
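
If `setup.ps1` fails or you prefer to set things up by hand, the manual equivalent is roughly the following (a sketch based on the dependency list above; the actual script may do more):

```powershell
cd D:\ClaudeTools\mcp-servers\ollama-assistant

# Create a project-local virtual environment
python -m venv venv

# Install the MCP server's dependencies into it
.\venv\Scripts\pip.exe install mcp httpx

# Confirm Ollama is installed and has at least one model available
ollama list
```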

**Expected output:**

```
[OK] Python installed
[OK] Virtual environment created
[OK] Dependencies installed
[OK] Ollama installed
[OK] Ollama server is running
[OK] Found compatible models

Setup Complete!
```

---

## Step 6: Configure Claude Code

The `.mcp.json` file has already been updated with the Ollama configuration.

**Verify configuration:**

```powershell
cat D:\ClaudeTools\.mcp.json
```

You should see an `ollama-assistant` entry.
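
For a stricter check than eyeballing the file, you can parse it; this assumes the standard `mcpServers` top-level key used in Claude Code's `.mcp.json`:

```powershell
# Lists the configured MCP server names; "ollama-assistant" should be among them
(Get-Content D:\ClaudeTools\.mcp.json -Raw | ConvertFrom-Json).mcpServers.PSObject.Properties.Name
```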

---

## Step 7: Restart Claude Code

**IMPORTANT:** You must completely restart Claude Code for MCP changes to take effect.

1. Close Claude Code completely
2. Reopen Claude Code
3. Navigate to the D:\ClaudeTools directory

---

## Step 8: Test Integration

Try these commands in Claude Code:

**Test 1: Check status**

```
Use the ollama_status tool to check if Ollama is running
```

**Test 2: Ask a question**

```
Use ask_ollama to ask: "What is the fastest sorting algorithm?"
```

**Test 3: Analyze code**

```
Use analyze_code_local to review this Python function for bugs:

def divide(a, b):
    return a / b
```

---

## Troubleshooting

### Ollama Not Running

**Error:** `Cannot connect to Ollama at http://localhost:11434`

**Fix:**

```powershell
# Start Ollama
ollama serve

# Or check if it's already running
netstat -ano | findstr :11434
```

### Model Not Found

**Error:** `Model 'llama3.1:8b' not found`

**Fix:**

```powershell
# Pull the model
ollama pull llama3.1:8b

# Verify it's installed
ollama list
```

### Python Virtual Environment Issues

**Error:** `python: command not found`

**Fix:**

1. Install Python 3.8+ from python.org
2. Add Python to PATH
3. Rerun setup.ps1

### MCP Server Not Loading

**Check Claude Code logs:**

```powershell
# Look for MCP-related errors
# Logs are typically in: %APPDATA%\Claude\logs\
```

**Verify the Python path:**

```powershell
D:\ClaudeTools\mcp-servers\ollama-assistant\venv\Scripts\python.exe --version
```

### Port 11434 Already in Use

**Error:** `Port 11434 is already in use`

**Fix:**

```powershell
# Find what's using the port
netstat -ano | findstr :11434

# Kill the process (replace <PID> with the process ID from the previous command)
taskkill /F /PID <PID>

# Restart Ollama
ollama serve
```

---

## Performance Tips

### GPU Acceleration

**Ollama automatically uses your GPU if available (NVIDIA/AMD).**

**Check GPU usage:**

```powershell
# NVIDIA
nvidia-smi

# AMD
# Check Task Manager > Performance > GPU
```
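
Recent Ollama releases can also report this directly: `ollama ps` lists the currently loaded models along with whether each is running on the GPU or the CPU.

```powershell
# Shows loaded models and their processor placement (GPU vs. CPU)
ollama ps
```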

### CPU Performance

If using CPU only:

- Smaller models (7b-8b) work better
- Expect 2-5 tokens/second
- Close other applications for better performance

### Faster Response Times

```powershell
# Use smaller models for speed
ollama pull mistral:7b

# Or quantized versions (smaller, faster)
ollama pull llama3.1:8b-q4_0
```
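
If the first request after a pause is slow because the model has been unloaded, Ollama's `OLLAMA_KEEP_ALIVE` environment variable controls how long models stay resident in memory (the default is around 5 minutes); one option is to set it before starting the server:

```powershell
# Keep loaded models in memory for 30 minutes before unloading
$env:OLLAMA_KEEP_ALIVE = "30m"
ollama serve
```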

---

## Usage Examples

### Example 1: Private Code Review

```
I have some proprietary code I don't want to send to external APIs.
Can you use the local Ollama model to review it for security issues?

[Paste code]
```

Claude will use `analyze_code_local` to review it locally.

### Example 2: Large File Summary

```
Summarize this 50,000 line log file using the local model to avoid API costs.

[Paste content]
```

Claude will use `summarize_large_file` locally.

### Example 3: Offline Development

```
I'm offline - can you still help with this code?
```

Claude will automatically delegate to the local Ollama model.

---

## What Models to Use When

| Task | Best Model | Why |
|------|-----------|-----|
| Code review | qwen2.5-coder:7b | Trained specifically for code |
| Code generation | codellama:13b | Best code completion |
| General questions | llama3.1:8b | Balanced performance |
| Speed priority | mistral:7b | Fastest responses |
| Quality priority | llama3.1:70b | Best reasoning (needs GPU) |
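
To inspect an installed model's details (parameter count, context length, quantization) before settling on one, recent Ollama releases include a `show` command:

```powershell
# Prints metadata for an installed model
ollama show llama3.1:8b
```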

---

## Uninstall

To remove the Ollama MCP server:

1. **Remove from `.mcp.json`:**
   Delete the `ollama-assistant` entry

2. **Delete files:**
   ```powershell
   Remove-Item -Recurse D:\ClaudeTools\mcp-servers\ollama-assistant
   ```

3. **Uninstall Ollama (optional):**
   ```powershell
   winget uninstall Ollama.Ollama
   ```

4. **Restart Claude Code**

---

## Next Steps

Once installed:

1. Try asking me to use local Ollama for tasks
2. I'll automatically delegate when appropriate:
   - Privacy-sensitive code
   - Large files
   - Offline work
   - Cost optimization

The integration is transparent - you can work normally and I'll decide when to use local vs. cloud AI.

---

**Status:** Ready to install

**Estimated Setup Time:** 10-15 minutes (including model download)

**Disk Space Required:** ~5-10GB (for models)