sync: Auto-sync from ACG-M-L5090 at 2026-01-22 19:22:24
Synced files:
- Grepai optimization documentation
- Ollama Assistant MCP server implementation
- Session logs and context updates

Machine: ACG-M-L5090
Timestamp: 2026-01-22 19:22:24
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
New file: mcp-servers/ollama-assistant/INSTALL.md (345 lines)

# Ollama MCP Server Installation Guide

Follow these steps to set up local AI assistance for Claude Code.

---

## Step 1: Install Ollama

**Option A: Using winget (Recommended)**
```powershell
winget install Ollama.Ollama
```

**Option B: Manual Download**
1. Go to https://ollama.ai/download
2. Download the Windows installer
3. Run the installer

**Verify Installation:**
```powershell
ollama --version
```

Expected output: `ollama version is X.Y.Z`
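
If `ollama` is not recognized, the installer's PATH change may not have reached the current shell; open a new terminal and try again. A minimal check of where the command resolves from (assuming a standard install):

```powershell
# Show the executable the shell resolves 'ollama' to
Get-Command ollama | Select-Object -ExpandProperty Source
```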

---

## Step 2: Start Ollama Server

**Start the server:**
```powershell
ollama serve
```

Leave this terminal open; Ollama needs to keep running in the background.

**Tip:** Ollama usually starts automatically after installation. Check the system tray for the Ollama icon.
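
To confirm the server is actually listening (whether you started it manually or it is running from the tray), you can query its local HTTP API. A minimal check against the default port:

```powershell
# Should return a JSON object listing locally installed models
Invoke-RestMethod http://localhost:11434/api/tags
```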

---

## Step 3: Pull a Model

**Open a NEW terminal** and pull a model:

**Recommended for most users:**
```powershell
ollama pull llama3.1:8b
```
Size: 4.7GB | Speed: Fast | Quality: Good

**Best for code:**
```powershell
ollama pull qwen2.5-coder:7b
```
Size: 4.7GB | Speed: Fast | Quality: Excellent for code

**Alternative options:**
```powershell
# Faster, smaller
ollama pull mistral:7b       # 4.1GB

# Better quality, larger
ollama pull llama3.1:70b     # 40GB (requires a good GPU)

# Code-focused
ollama pull codellama:13b    # 7.4GB
```

**Verify the model is available:**
```powershell
ollama list
```
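
To inspect a pulled model's details (parameter count, quantization, context length), recent Ollama versions can print its metadata with `ollama show`; for example:

```powershell
# Print model metadata such as parameters and context length
ollama show llama3.1:8b
```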

---

## Step 4: Test Ollama

```powershell
ollama run llama3.1:8b "Explain what MCP is in one sentence"
```

Expected: You should get a response from the model.

Press `Ctrl+D` or type `/bye` to exit the chat.
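
The MCP server talks to Ollama over its HTTP API (via httpx), so it is also worth confirming the API path works end to end. A minimal sketch using the standard `/api/generate` endpoint; the model name assumes you pulled `llama3.1:8b` above:

```powershell
# One-off, non-streaming generation request against the local Ollama API
$body = @{ model = "llama3.1:8b"; prompt = "Say hello in five words"; stream = $false } | ConvertTo-Json
Invoke-RestMethod -Uri http://localhost:11434/api/generate -Method Post -Body $body -ContentType "application/json"
```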

---

## Step 5: Setup MCP Server

**Run the setup script:**
```powershell
cd D:\ClaudeTools\mcp-servers\ollama-assistant
.\setup.ps1
```

This will:
- Create a Python virtual environment
- Install MCP dependencies (mcp, httpx)
- Check the Ollama installation
- Verify everything is configured

**Expected output:**
```
[OK] Python installed
[OK] Virtual environment created
[OK] Dependencies installed
[OK] Ollama installed
[OK] Ollama server is running
[OK] Found compatible models

Setup Complete!
```
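
If `setup.ps1` fails partway, roughly equivalent steps can be run by hand. This is a sketch based on the list above (a project-local virtual environment plus the mcp and httpx packages), not a substitute for the script's own checks:

```powershell
cd D:\ClaudeTools\mcp-servers\ollama-assistant

# Create a project-local virtual environment
python -m venv venv

# Install the MCP server dependencies into it
.\venv\Scripts\pip.exe install mcp httpx
```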

---

## Step 6: Configure Claude Code

The `.mcp.json` file has already been updated with the Ollama configuration.

**Verify configuration:**
```powershell
cat D:\ClaudeTools\.mcp.json
```

You should see an `ollama-assistant` entry.
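
If the file is long, a quick filter confirms the entry is present (a minimal check; it only looks for the name, not whether the command path inside it is valid):

```powershell
# Print any lines mentioning the ollama-assistant server entry
Select-String -Path D:\ClaudeTools\.mcp.json -Pattern "ollama-assistant"
```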

---

## Step 7: Restart Claude Code

**IMPORTANT:** You must completely restart Claude Code for MCP changes to take effect.

1. Close Claude Code completely
2. Reopen Claude Code
3. Navigate to the D:\ClaudeTools directory

---

## Step 8: Test Integration

Try these commands in Claude Code:

**Test 1: Check status**
```
Use the ollama_status tool to check if Ollama is running
```

**Test 2: Ask a question**
```
Use ask_ollama to ask: "What is the fastest sorting algorithm?"
```

**Test 3: Analyze code**
```
Use analyze_code_local to review this Python function for bugs:

def divide(a, b):
    return a / b
```

---

## Troubleshooting

### Ollama Not Running

**Error:** `Cannot connect to Ollama at http://localhost:11434`

**Fix:**
```powershell
# Start Ollama
ollama serve

# Or check if it's already running
netstat -ano | findstr :11434
```
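
If the port check shows nothing, you can also look for the Ollama process directly; a small check (process names can vary slightly between installer versions):

```powershell
# List any running processes whose name starts with 'ollama'
Get-Process ollama* -ErrorAction SilentlyContinue
```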

### Model Not Found

**Error:** `Model 'llama3.1:8b' not found`

**Fix:**
```powershell
# Pull the model
ollama pull llama3.1:8b

# Verify it's installed
ollama list
```

### Python Virtual Environment Issues

**Error:** `python: command not found`

**Fix:**
1. Install Python 3.8+ from python.org
2. Add Python to PATH
3. Rerun setup.ps1
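
On Windows, Python is sometimes installed with only the `py` launcher on PATH. A quick way to see what is actually available before rerunning the script (assuming the launcher was installed alongside Python):

```powershell
# Check the launcher and the default interpreter it points at
py --version
py -3 -c "import sys; print(sys.executable)"
```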

### MCP Server Not Loading

**Check Claude Code logs:**
```powershell
# Look for MCP-related errors
# Logs are typically in: %APPDATA%\Claude\logs\
```

**Verify Python path:**
```powershell
D:\ClaudeTools\mcp-servers\ollama-assistant\venv\Scripts\python.exe --version
```
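
You can also confirm the dependencies import cleanly inside the venv, and run the server script by hand to surface any startup errors. The `server.py` filename below is an assumption; use the actual entry-point file in the directory:

```powershell
# Verify the MCP dependencies are importable from the venv's interpreter
D:\ClaudeTools\mcp-servers\ollama-assistant\venv\Scripts\python.exe -c "import mcp, httpx; print('deps ok')"

# Run the server directly to see any startup errors (filename is an assumption)
D:\ClaudeTools\mcp-servers\ollama-assistant\venv\Scripts\python.exe D:\ClaudeTools\mcp-servers\ollama-assistant\server.py
```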

### Port 11434 Already in Use

**Error:** `Port 11434 is already in use`

**Fix:**
```powershell
# Find what's using the port
netstat -ano | findstr :11434

# Kill the process (replace <PID>)
taskkill /F /PID <PID>

# Restart Ollama
ollama serve
```

---

## Performance Tips

### GPU Acceleration

**Ollama automatically uses your GPU if available (NVIDIA/AMD).**

**Check GPU usage:**
```powershell
# NVIDIA
nvidia-smi

# AMD
# Check Task Manager > Performance > GPU
```
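
Recent Ollama versions can also report, per loaded model, how much is running on the GPU versus the CPU; run this while a model is answering a prompt:

```powershell
# Show loaded models and their CPU/GPU split
ollama ps
```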

### CPU Performance

If using CPU only:
- Smaller models (7b-8b) work better
- Expect 2-5 tokens/second
- Close other applications for better performance

### Faster Response Times

```powershell
# Use smaller models for speed
ollama pull mistral:7b

# Or quantized versions (smaller, faster); check the Ollama library for the exact tag
ollama pull llama3.1:8b-q4_0
```
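
Another common tweak is keeping the model loaded in memory between requests so each call skips the load time. A sketch using Ollama's `OLLAMA_KEEP_ALIVE` environment variable, set before starting the server (value format and default may differ across versions):

```powershell
# Keep loaded models in memory for 30 minutes of inactivity
$env:OLLAMA_KEEP_ALIVE = "30m"
ollama serve
```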

---

## Usage Examples

### Example 1: Private Code Review

```
I have some proprietary code I don't want to send to external APIs.
Can you use the local Ollama model to review it for security issues?

[Paste code]
```

Claude will use `analyze_code_local` to review it locally.

### Example 2: Large File Summary

```
Summarize this 50,000 line log file using the local model to avoid API costs.

[Paste content]
```

Claude will use `summarize_large_file` locally.

### Example 3: Offline Development

```
I'm offline - can you still help with this code?
```

Claude will delegate to the local Ollama model automatically.

---

## What Models to Use When

| Task | Best Model | Why |
|------|-----------|-----|
| Code review | qwen2.5-coder:7b | Trained specifically for code |
| Code generation | codellama:13b | Best code completion |
| General questions | llama3.1:8b | Balanced performance |
| Speed priority | mistral:7b | Fastest responses |
| Quality priority | llama3.1:70b | Best reasoning (needs GPU) |

---

## Uninstall

To remove the Ollama MCP server (a note on reclaiming model disk space follows these steps):

1. **Remove from `.mcp.json`:**
   Delete the `ollama-assistant` entry

2. **Delete files:**
   ```powershell
   Remove-Item -Recurse D:\ClaudeTools\mcp-servers\ollama-assistant
   ```

3. **Uninstall Ollama (optional):**
   ```powershell
   winget uninstall Ollama.Ollama
   ```

4. **Restart Claude Code**
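
Downloaded models are stored separately from the MCP server files, so deleting the server directory does not reclaim their disk space. To get the ~5-10GB of model data back, remove the models you pulled before uninstalling Ollama itself (a sketch; list them first if unsure):

```powershell
# See which models are installed, then remove the ones pulled for this setup
ollama list
ollama rm llama3.1:8b
```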

---

## Next Steps

Once installed:
1. Try asking me to use local Ollama for tasks
2. I'll automatically delegate when appropriate:
   - Privacy-sensitive code
   - Large files
   - Offline work
   - Cost optimization

The integration is transparent: you can work normally and I'll decide when to use local vs. cloud AI.

---

**Status:** Ready to install

**Estimated Setup Time:** 10-15 minutes (including model download)

**Disk Space Required:** ~5-10GB (for models)