
Ollama MCP Server Installation Guide

Follow these steps to set up local AI assistance for Claude Code.


Step 1: Install Ollama

Option A: Using winget (Recommended)

winget install Ollama.Ollama

Option B: Manual Download

  1. Go to https://ollama.ai/download
  2. Download the Windows installer
  3. Run the installer

Verify Installation:

ollama --version

Expected output: ollama version is X.Y.Z


Step 2: Start Ollama Server

Start the server:

ollama serve

Leave this terminal open - Ollama needs to run in the background.

Tip: Ollama usually starts automatically after installation. Check the system tray for the Ollama icon.
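
If you want to confirm the server is reachable without opening a browser, the root endpoint at http://localhost:11434 answers with the plain-text banner "Ollama is running". A minimal Python check (standard library only, default port assumed):

# check_ollama.py - confirm the Ollama server answers on its default port
from urllib.request import urlopen
from urllib.error import URLError

try:
    with urlopen("http://localhost:11434/", timeout=5) as resp:
        # Root endpoint returns the banner "Ollama is running"
        print(resp.read().decode().strip())
except URLError as exc:
    print(f"Ollama is not reachable: {exc}")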


Step 3: Pull a Model

Open a NEW terminal and pull a model:

Recommended for most users:

ollama pull llama3.1:8b

Size: 4.7GB | Speed: Fast | Quality: Good

Best for code:

ollama pull qwen2.5-coder:7b

Size: 4.7GB | Speed: Fast | Quality: Excellent for code

Alternative options:

# Faster, smaller
ollama pull mistral:7b        # 4.1GB

# Better quality, larger
ollama pull llama3.1:70b      # 40GB (requires a capable GPU)

# Code-focused
ollama pull codellama:13b     # 7.4GB

Verify model is available:

ollama list
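
The same list is available from Ollama's REST API at /api/tags, which is handy if you want to check from a script rather than a terminal. A minimal standard-library sketch:

# list_models.py - list locally installed Ollama models via the REST API
import json
from urllib.request import urlopen

with urlopen("http://localhost:11434/api/tags", timeout=5) as resp:
    data = json.load(resp)

for model in data.get("models", []):
    size_gb = model["size"] / 1e9
    print(f"{model['name']:<25} {size_gb:.1f} GB")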

Step 4: Test Ollama

ollama run llama3.1:8b "Explain what MCP is in one sentence"

Expected: You should get a response from the model.

If you start an interactive session instead (ollama run llama3.1:8b with no prompt), press Ctrl+D or type /bye to exit the chat.
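
The same test can be run against Ollama's HTTP API on port 11434, which is the interface the MCP server talks to. A minimal sketch using httpx (run pip install httpx if you want to try it before Step 5; swap in whichever model you pulled):

# generate_test.py - one-shot, non-streaming completion via Ollama's REST API
import httpx

response = httpx.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3.1:8b",
        "prompt": "Explain what MCP is in one sentence",
        "stream": False,  # return a single JSON object instead of a stream
    },
    timeout=120.0,  # local generation can be slow on CPU
)
response.raise_for_status()
print(response.json()["response"])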


Step 5: Setup MCP Server

Run the setup script:

cd D:\ClaudeTools\mcp-servers\ollama-assistant
.\setup.ps1

This will:

  • Create Python virtual environment
  • Install MCP dependencies (mcp, httpx)
  • Check Ollama installation
  • Verify everything is configured

Expected output:

[OK] Python installed
[OK] Virtual environment created
[OK] Dependencies installed
[OK] Ollama installed
[OK] Ollama server is running
[OK] Found compatible models
Setup Complete!
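
For reference, the heart of an MCP server like this is just a set of tool functions exposed over the MCP protocol. The shipped implementation in the ollama-assistant folder may differ in detail; the following is only a sketch of the pattern, using the official mcp Python SDK and httpx (the two dependencies installed above):

# Illustrative sketch of an Ollama-backed MCP tool (not the shipped server)
import httpx
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("ollama-assistant")

@mcp.tool()
async def ask_ollama(prompt: str, model: str = "llama3.1:8b") -> str:
    """Send a prompt to the local Ollama server and return its reply."""
    async with httpx.AsyncClient(timeout=120.0) as client:
        resp = await client.post(
            "http://localhost:11434/api/generate",
            json={"model": model, "prompt": prompt, "stream": False},
        )
        resp.raise_for_status()
        return resp.json()["response"]

if __name__ == "__main__":
    mcp.run()  # serve the tool over stdio for Claude Code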

Step 6: Configure Claude Code

The .mcp.json file has already been updated with the Ollama configuration.

Verify configuration:

cat D:\ClaudeTools\.mcp.json

You should see an ollama-assistant entry.
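
If you prefer to check programmatically rather than eyeballing the file, MCP server entries in a project .mcp.json live under the mcpServers key. A minimal sketch (path as used throughout this guide):

# verify_mcp_config.py - confirm the ollama-assistant entry is registered
import json
from pathlib import Path

config = json.loads(Path(r"D:\ClaudeTools\.mcp.json").read_text())
servers = config.get("mcpServers", {})

if "ollama-assistant" in servers:
    print("ollama-assistant is configured:", servers["ollama-assistant"].get("command"))
else:
    print("ollama-assistant entry is missing from .mcp.json")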


Step 7: Restart Claude Code

IMPORTANT: You must completely restart Claude Code for MCP changes to take effect.

  1. Close Claude Code completely
  2. Reopen Claude Code
  3. Navigate to D:\ClaudeTools directory

Step 8: Test Integration

Try these commands in Claude Code:

Test 1: Check status

Use the ollama_status tool to check if Ollama is running

Test 2: Ask a question

Use ask_ollama to ask: "What is the fastest sorting algorithm?"

Test 3: Analyze code

Use analyze_code_local to review this Python function for bugs:
def divide(a, b):
    return a / b
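
The point of this test snippet is that divide raises ZeroDivisionError when b is 0, and the review from the local model should flag exactly that. One possible hardened version (illustrative only, not what the tool will necessarily suggest):

def divide(a: float, b: float) -> float:
    """Divide a by b, raising a clear error instead of a bare ZeroDivisionError."""
    if b == 0:
        raise ValueError("divisor must be non-zero")
    return a / b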

Troubleshooting

Ollama Not Running

Error: Cannot connect to Ollama at http://localhost:11434

Fix:

# Start Ollama
ollama serve

# Or check if it's already running
netstat -ano | findstr :11434

Model Not Found

Error: Model 'llama3.1:8b' not found

Fix:

# Pull the model
ollama pull llama3.1:8b

# Verify it's installed
ollama list

Python Virtual Environment Issues

Error: python: command not found

Fix:

  1. Install Python 3.8+ from python.org
  2. Add Python to PATH
  3. Rerun setup.ps1

MCP Server Not Loading

Check Claude Code logs:

# Look for MCP-related errors
# Logs are typically in: %APPDATA%\Claude\logs\

Verify Python path:

D:\ClaudeTools\mcp-servers\ollama-assistant\venv\Scripts\python.exe --version
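
You can also confirm that the virtual environment actually has the MCP dependencies importable. A minimal check, run from any Python, using the venv path shown above:

# check_venv.py - verify the server's venv can import its dependencies
import subprocess

venv_python = r"D:\ClaudeTools\mcp-servers\ollama-assistant\venv\Scripts\python.exe"
result = subprocess.run(
    [venv_python, "-c", "import mcp, httpx; print('mcp and httpx import OK')"],
    capture_output=True,
    text=True,
)
print(result.stdout or result.stderr)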

Port 11434 Already in Use

Error: Port 11434 is already in use

Fix:

# Find what's using the port
netstat -ano | findstr :11434

# Kill the process (replace PID)
taskkill /F /PID <PID>

# Restart Ollama
ollama serve
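
If the netstat output is ambiguous, a quick Python check tells you whether anything is currently accepting connections on the port (it does not tell you whether that something is Ollama):

# port_check.py - see whether anything is listening on 11434
import socket

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
    sock.settimeout(2)
    in_use = sock.connect_ex(("127.0.0.1", 11434)) == 0

print("port 11434 is in use" if in_use else "port 11434 is free")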

Performance Tips

GPU Acceleration

Ollama automatically uses your GPU if available (NVIDIA/AMD).

Check GPU usage:

# NVIDIA
nvidia-smi

# AMD
# Check Task Manager > Performance > GPU

CPU Performance

If using CPU only:

  • Smaller models (7b-8b) work better
  • Expect 2-5 tokens/second
  • Close other applications for better performance

Faster Response Times

# Use smaller models for speed
ollama pull mistral:7b

# Or quantized versions (smaller, faster)
ollama pull llama3.1:8b-q4_0

Usage Examples

Example 1: Private Code Review

I have some proprietary code I don't want to send to external APIs.
Can you use the local Ollama model to review it for security issues?

[Paste code]

Claude will use analyze_code_local to review locally.

Example 2: Large File Summary

Summarize this 50,000 line log file using the local model to avoid API costs.

[Paste content]

Claude will use summarize_large_file locally.

Example 3: Offline Development

I'm offline - can you still help with this code?

Claude will delegate to the local Ollama model automatically.


What Models to Use When

Task                 Best Model          Why
Code review          qwen2.5-coder:7b    Trained specifically for code
Code generation      codellama:13b       Best code completion
General questions    llama3.1:8b         Balanced performance
Speed priority       mistral:7b          Fastest responses
Quality priority     llama3.1:70b        Best reasoning (needs GPU)

Uninstall

To remove the Ollama MCP server:

  1. Remove from .mcp.json: Delete the ollama-assistant entry

  2. Delete files:

    Remove-Item -Recurse D:\ClaudeTools\mcp-servers\ollama-assistant
    
  3. Uninstall Ollama (optional):

    winget uninstall Ollama.Ollama
    
  4. Restart Claude Code


Next Steps

Once installed:

  1. Try asking me to use local Ollama for tasks
  2. I'll automatically delegate when appropriate:
    • Privacy-sensitive code
    • Large files
    • Offline work
    • Cost optimization

The integration is transparent - you can work normally and I'll decide when to use local vs. cloud AI.


Status: Ready to install
Estimated Setup Time: 10-15 minutes (including model download)
Disk Space Required: ~5-10GB (for models)