# Ollama MCP Server Installation Guide

Follow these steps to set up local AI assistance for Claude Code.

---

## Step 1: Install Ollama

**Option A: Using winget (Recommended)**

```powershell
winget install Ollama.Ollama
```

**Option B: Manual Download**

1. Go to https://ollama.ai/download
2. Download the Windows installer
3. Run the installer

**Verify Installation:**

```powershell
ollama --version
```

Expected output: `ollama version is X.Y.Z`

---

## Step 2: Start Ollama Server

**Start the server:**

```powershell
ollama serve
```

Leave this terminal open; Ollama needs to run in the background.

**Tip:** Ollama usually starts automatically after installation. Check the system tray for the Ollama icon.
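
If you're not sure whether the server is already up, you can query Ollama's local HTTP API (it listens on port 11434 by default); a successful call returns a JSON list of your installed models:

```powershell
# Returns {"models": [...]} when the Ollama server is reachable
Invoke-RestMethod http://localhost:11434/api/tags
```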

---

## Step 3: Pull a Model

**Open a NEW terminal** and pull a model:

**Recommended for most users:**

```powershell
ollama pull llama3.1:8b
```

Size: 4.7GB | Speed: Fast | Quality: Good

**Best for code:**

```powershell
ollama pull qwen2.5-coder:7b
```

Size: 4.7GB | Speed: Fast | Quality: Excellent for code

**Alternative options:**

```powershell
# Faster, smaller
ollama pull mistral:7b        # 4.1GB

# Better quality, larger
ollama pull llama3.1:70b      # 40GB (requires good GPU)

# Code-focused
ollama pull codellama:13b     # 7.4GB
```

**Verify model is available:**

```powershell
ollama list
```

---

## Step 4: Test Ollama

```powershell
ollama run llama3.1:8b "Explain what MCP is in one sentence"
```

Expected: You should get a response from the model.

Press `Ctrl+D` or type `/bye` to exit the chat.
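
You can also run the same test non-interactively against Ollama's REST API, which is presumably the same HTTP interface the MCP server's httpx client talks to:

```powershell
# One-shot generation via the local REST API; stream is disabled so the full
# answer comes back as a single JSON object with a "response" field
$body = @{
    model  = "llama3.1:8b"
    prompt = "Explain what MCP is in one sentence"
    stream = $false
} | ConvertTo-Json

Invoke-RestMethod -Method Post -Uri http://localhost:11434/api/generate -Body $body -ContentType "application/json"
```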

---

## Step 5: Setup MCP Server

**Run the setup script:**

```powershell
cd D:\ClaudeTools\mcp-servers\ollama-assistant
.\setup.ps1
```

This will:

- Create Python virtual environment
- Install MCP dependencies (mcp, httpx)
- Check Ollama installation
- Verify everything is configured
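
If `setup.ps1` fails or you prefer to set things up by hand, the manual equivalent is roughly the following (a sketch based on the dependency list above; the actual script may do more):

```powershell
cd D:\ClaudeTools\mcp-servers\ollama-assistant

# Create a project-local virtual environment
python -m venv venv

# Install the MCP server's dependencies into it
.\venv\Scripts\pip.exe install mcp httpx

# Confirm Ollama is installed and has at least one model available
ollama list
```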

**Expected output:**

```
[OK] Python installed
[OK] Virtual environment created
[OK] Dependencies installed
[OK] Ollama installed
[OK] Ollama server is running
[OK] Found compatible models

Setup Complete!
```

---

## Step 6: Configure Claude Code

The `.mcp.json` file has already been updated with the Ollama configuration.

**Verify configuration:**

```powershell
cat D:\ClaudeTools\.mcp.json
```

You should see an `ollama-assistant` entry.
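
For a stricter check than eyeballing the file, you can parse it; this assumes the standard `mcpServers` top-level key used in Claude Code's `.mcp.json`:

```powershell
# Lists the configured MCP server names; "ollama-assistant" should be among them
(Get-Content D:\ClaudeTools\.mcp.json -Raw | ConvertFrom-Json).mcpServers.PSObject.Properties.Name
```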

---

## Step 7: Restart Claude Code

**IMPORTANT:** You must completely restart Claude Code for MCP changes to take effect.

1. Close Claude Code completely
2. Reopen Claude Code
3. Navigate to the D:\ClaudeTools directory

---

## Step 8: Test Integration

Try these commands in Claude Code:

**Test 1: Check status**

```
Use the ollama_status tool to check if Ollama is running
```

**Test 2: Ask a question**

```
Use ask_ollama to ask: "What is the fastest sorting algorithm?"
```

**Test 3: Analyze code**

```
Use analyze_code_local to review this Python function for bugs:

def divide(a, b):
    return a / b
```

---

## Troubleshooting

### Ollama Not Running

**Error:** `Cannot connect to Ollama at http://localhost:11434`

**Fix:**

```powershell
# Start Ollama
ollama serve

# Or check if it's already running
netstat -ano | findstr :11434
```

### Model Not Found

**Error:** `Model 'llama3.1:8b' not found`

**Fix:**

```powershell
# Pull the model
ollama pull llama3.1:8b

# Verify it's installed
ollama list
```

### Python Virtual Environment Issues

**Error:** `python: command not found`

**Fix:**

1. Install Python 3.8+ from python.org
2. Add Python to PATH
3. Rerun setup.ps1

### MCP Server Not Loading

**Check Claude Code logs:**

```powershell
# Look for MCP-related errors
# Logs are typically in: %APPDATA%\Claude\logs\
```

**Verify the Python path:**

```powershell
D:\ClaudeTools\mcp-servers\ollama-assistant\venv\Scripts\python.exe --version
```

### Port 11434 Already in Use

**Error:** `Port 11434 is already in use`

**Fix:**

```powershell
# Find what's using the port
netstat -ano | findstr :11434

# Kill the process (replace <PID> with the process ID from the previous command)
taskkill /F /PID <PID>

# Restart Ollama
ollama serve
```

---

## Performance Tips

### GPU Acceleration

**Ollama automatically uses your GPU if available (NVIDIA/AMD).**

**Check GPU usage:**

```powershell
# NVIDIA
nvidia-smi

# AMD
# Check Task Manager > Performance > GPU
```
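
Recent Ollama releases can also report this directly: `ollama ps` lists the currently loaded models along with whether each is running on the GPU or the CPU.

```powershell
# Shows loaded models and their processor placement (GPU vs. CPU)
ollama ps
```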

### CPU Performance

If using CPU only:

- Smaller models (7b-8b) work better
- Expect 2-5 tokens/second
- Close other applications for better performance

### Faster Response Times

```powershell
# Use smaller models for speed
ollama pull mistral:7b

# Or quantized versions (smaller, faster)
ollama pull llama3.1:8b-q4_0
```
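
If the first request after a pause is slow because the model has been unloaded, Ollama's `OLLAMA_KEEP_ALIVE` environment variable controls how long models stay resident in memory (the default is around 5 minutes); one option is to set it before starting the server:

```powershell
# Keep loaded models in memory for 30 minutes before unloading
$env:OLLAMA_KEEP_ALIVE = "30m"
ollama serve
```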

---

## Usage Examples

### Example 1: Private Code Review

```
I have some proprietary code I don't want to send to external APIs.
Can you use the local Ollama model to review it for security issues?

[Paste code]
```

Claude will use `analyze_code_local` to review it locally.

### Example 2: Large File Summary

```
Summarize this 50,000 line log file using the local model to avoid API costs.

[Paste content]
```

Claude will use `summarize_large_file` locally.

### Example 3: Offline Development

```
I'm offline - can you still help with this code?
```

Claude will automatically delegate to the local Ollama model.

---

## What Models to Use When

| Task | Best Model | Why |
|------|-----------|-----|
| Code review | qwen2.5-coder:7b | Trained specifically for code |
| Code generation | codellama:13b | Best code completion |
| General questions | llama3.1:8b | Balanced performance |
| Speed priority | mistral:7b | Fastest responses |
| Quality priority | llama3.1:70b | Best reasoning (needs GPU) |
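
To inspect an installed model's details (parameter count, context length, quantization) before settling on one, recent Ollama releases include a `show` command:

```powershell
# Prints metadata for an installed model
ollama show llama3.1:8b
```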

---

## Uninstall

To remove the Ollama MCP server:

1. **Remove from `.mcp.json`:**
   Delete the `ollama-assistant` entry

2. **Delete files:**
   ```powershell
   Remove-Item -Recurse D:\ClaudeTools\mcp-servers\ollama-assistant
   ```

3. **Uninstall Ollama (optional):**
   ```powershell
   winget uninstall Ollama.Ollama
   ```

4. **Restart Claude Code**

---

## Next Steps

Once installed:

1. Try asking me to use local Ollama for tasks
2. I'll automatically delegate when appropriate:
   - Privacy-sensitive code
   - Large files
   - Offline work
   - Cost optimization

The integration is transparent - you can work normally and I'll decide when to use local vs. cloud AI.

---

**Status:** Ready to install

**Estimated Setup Time:** 10-15 minutes (including model download)

**Disk Space Required:** ~5-10GB (for models)