Session log: Ollama + GrepAI setup, coordinator review policy

Installed Ollama with GPU support (qwen3:14b, codestral:22b, nomic-embed-text), configured GrepAI semantic code search with optimized 256-token chunks and context file boosting, added MCP server integration and deep-explore agent. Updated claude.md with local AI usage guidelines and 4-tier output review policy.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 16:42:01 -07:00
parent 481b02ed46
commit 98ea867d2c
3 changed files with 265 additions and 2 deletions
--- a/.claude/agents/deep-explore.md
+++ b/.claude/agents/deep-explore.md
@@ -0,0 +1,59 @@
+---
+name: deep-explore
+description: Deep codebase exploration using grepai semantic search and call graph tracing. Use this agent for understanding code architecture, finding implementations by intent, analyzing function relationships, and exploring unfamiliar code areas.
+tools: Read, Grep, Glob, Bash
+model: inherit
+---
+
+## Instructions
+
+You are a specialized code exploration agent with access to grepai semantic search and call graph tracing.
+
+### Primary Tools
+
+#### 1. Semantic Search: `grepai search`
+
+Use this to find code by intent and meaning:
+
+```bash
+# Use English queries for best results (--compact saves ~80% tokens)
+grepai search "authentication flow" --json --compact
+grepai search "error handling middleware" --json --compact
+grepai search "database connection management" --json --compact
+```
+
+#### 2. Call Graph Tracing: `grepai trace`
+
+Use this to understand function relationships and code flow:
+
+```bash
+# Find all functions that call a symbol
+grepai trace callers "HandleRequest" --json
+
+# Find all functions called by a symbol
+grepai trace callees "ProcessOrder" --json
+
+# Build complete call graph
+grepai trace graph "ValidateToken" --depth 3 --json
+```
+
+Use `grepai trace` when you need to:
+- Find all callers of a function
+- Understand the call hierarchy
+- Analyze the impact of changes to a function
+- Map dependencies between components
+
+### When to use standard tools
+
+Only fall back to Grep/Glob when:
+- You need exact text matching (variable names, imports)
+- grepai is not available or returns errors
+- You need file path patterns
+
+### Workflow
+
+1. Start with `grepai search` to find relevant code semantically
+2. Use `grepai trace` to understand function relationships and call graphs
+3. Use `Read` to examine promising files in detail
+4. Use Grep only for exact string searches if needed
+5. Synthesize findings into a clear summary
--- a/.claude/claude.md
+++ b/.claude/claude.md
@@ -15,6 +15,7 @@ You are NOT an executor. You coordinate specialized agents and preserve your con
 | Git commits/push/branch | Gitea Agent |
 | Backups/restore | Backup Agent |
 | File exploration (broad) | Explore Agent |
+| Semantic code search | deep-explore Agent (uses GrepAI) |
 | Complex reasoning | General-purpose + Sequential Thinking |

 **Do yourself:** Simple responses, reading 1-2 files, presenting results, planning, decisions.
@@ -39,7 +40,7 @@ You are NOT an executor. You coordinate specialized agents and preserve your con

 - **NO EMOJIS** - Use ASCII markers: `[OK]`, `[ERROR]`, `[WARNING]`, `[SUCCESS]`, `[INFO]`
 - **No hardcoded credentials** - Use encrypted storage
- **SSH:** Use system OpenSSH (`C:\Windows\System32\OpenSSH\ssh.exe`), never Git for Windows SSH
+- **SSH:** Use system OpenSSH (on Windows: `C:\Windows\System32\OpenSSH\ssh.exe`, never Git for Windows SSH)
 - **Data integrity:** Never use placeholder/fake data. Check credentials.md or ask user.
 - **Full coding standards:** `.claude/CODING_GUIDELINES.md` (agents read on-demand, not every session)

@@ -85,6 +86,87 @@ When user references previous work, use `/context` command. Never ask user for i

 ---

+## Local AI (Ollama)
+
+Ollama runs locally with GPU acceleration. Use it for tasks that don't need Claude-level reasoning.
+
+### Available Models
+
+| Model | Size | Use For |
+|-------|------|---------|
+| `qwen3:14b` | 9.3 GB | General sub-tasks: summarization, classification, data extraction, drafting |
+| `codestral:22b` | 12 GB | Code-specific sub-tasks: code generation, refactoring suggestions, docstring generation |
+| `nomic-embed-text` | 274 MB | Embeddings only (used by GrepAI, not for direct use) |
+
+### GrepAI (Semantic Code Search)
+
+GrepAI indexes the codebase using `nomic-embed-text` embeddings and provides semantic search via MCP server.
+
+**When to use GrepAI instead of Grep/Glob:**
+- Finding code by intent ("how does authentication work") rather than exact text
+- Exploring unfamiliar areas of the codebase
+- Finding related implementations across files
+- Context recovery — searching session logs and credentials by meaning
+
+**How to use:**
+- **MCP tool:** Use the `grepai` MCP server tools directly (available after MCP loads)
+- **deep-explore agent:** Delegate to the `deep-explore` agent for thorough semantic exploration
+- **CLI fallback:** `grepai search "your query" --json --compact`
+
+**Maintenance:** The watcher daemon runs in the background and auto-indexes file changes. If search results seem stale, run `grepai watch --stop && grepai watch --background` to restart it.
+
+### Using Ollama for Sub-Tasks
+
+For bulk or repetitive work that doesn't require Claude's full reasoning, offload to local models via Ollama's API:
+
+**When to use Ollama:**
+- Processing many items in a loop (e.g., summarizing 50 session logs)
+- Generating boilerplate or repetitive code patterns
+- Data extraction/classification from structured text
+- Draft content that Claude will review/refine
+- Any task where speed > quality and results will be verified
+
+**When NOT to use Ollama (use Claude instead):**
+- Architectural decisions or complex reasoning
+- Security-sensitive code review
+- Tasks requiring tool use or multi-step planning
+- Final output that goes directly to production
+
+**How to call Ollama:**
+```bash
+# Simple prompt
+curl -s http://localhost:11434/api/generate -d '{"model":"qwen3:14b","prompt":"Summarize this: ...","stream":false}' | jq -r '.response'
+
+# Chat format
+curl -s http://localhost:11434/api/chat -d '{"model":"codestral:22b","messages":[{"role":"user","content":"Refactor this function: ..."}],"stream":false}' | jq -r '.message.content'
+```
+
+### Ollama Output Review Policy
+
+The coordinator (Claude) must review Ollama outputs based on impact level. Local models are useful but unreliable — they hallucinate, miss edge cases, and produce subtly wrong code.
+
+**Impact levels and review requirements:**
+
+| Level | Review | Examples |
+|-------|--------|----------|
+| **Critical** | ALWAYS review, verify against source | Code touching auth/security/encryption, credential handling, database migrations, production config, anything user-facing |
+| **High** | Review for correctness, spot-check details | API endpoint logic, business rules, infrastructure scripts, client-specific work |
+| **Medium** | Skim for obvious errors, trust if reasonable | Internal documentation drafts, session log summaries, data extraction from structured input, boilerplate code |
+| **Low** | Trust without review | Classification/tagging of items, reformatting text, generating placeholder content for later editing |
+
+**Review process for Critical/High:**
+1. Read Ollama's full output — don't just check if it "looks right"
+2. Verify claims against actual files/data (e.g., if it says a function exists, confirm it does)
+3. Check for: hallucinated function names, wrong parameter types, missing error handling, security gaps
+4. If output is wrong or uncertain, redo the task yourself rather than patching Ollama's attempt
+
+**Batch processing pattern:**
+When using Ollama for bulk tasks (e.g., processing N items), review the first 2-3 results fully before trusting the rest. If any are wrong, switch to doing it yourself or fix the prompt and reprocess.
+
+**Flag to user:** If Ollama produces output for a Critical task and you are not confident in your review, tell the user explicitly: "This was generated by a local model and I'm not fully confident in [specific concern]."
+
+---
+
 ## Reference (read on-demand, not every session)

 - **Project structure, endpoints, workflows, troubleshooting:** `.claude/REFERENCE.md`
@@ -94,4 +176,4 @@ When user references previous work, use `/context` command. Never ask user for i

 ---

-**Last Updated:** 2026-02-17
+**Last Updated:** 2026-03-20
--- a/session-logs/2026-03-20-session.md
+++ b/session-logs/2026-03-20-session.md
@@ -419,3 +419,125 @@ sudo chown -R azcomputerguru:staff /Users/azcomputerguru/ClaudeTools/.git/object
 1. **KVOI bio** — Ready to publish, may need similar for radio.azcomputerguru.com
 2. **VMware scan at VWP** — Need VPN access to scan 192.168.0.x and 192.168.3.x
 3. **Install nmap on MacBook Air** — Would improve network scanning: `brew install nmap`
+
+---
+
+## Update: 15:45 — Ollama + GrepAI Setup on CachyOS Workstation
+
+### Session Summary
+
+Set up local AI infrastructure on acg-guru-5070 (CachyOS workstation). Installed Ollama with NVIDIA GPU support, pulled three models, installed and configured GrepAI for semantic code search, configured MCP server integration for Claude Code, and updated coordinator directives in `.claude/claude.md` with Ollama usage policies and review thresholds.
+
+### Work Completed
+
+#### 1. Ollama Installation
+- **Install method:** Official install script (`curl -fsSL https://ollama.com/install.sh | sh`)
+- **Location:** `/usr/local/bin/ollama`
+- **Service:** systemd (`ollama.service`), enabled on boot, auto-starts
+- **GPU:** NVIDIA RTX 5070 Ti Mobile detected automatically
+
+#### 2. Models Pulled
+
+| Model | Size | Purpose |
+|-------|------|---------|
+| `qwen3:14b` | 9.3 GB | General sub-tasks: summarization, classification, data extraction, drafting |
+| `codestral:22b` | 12 GB | Code-specific sub-tasks: code generation, refactoring suggestions |
+| `nomic-embed-text` | 274 MB | Embeddings for GrepAI semantic search |
+
+#### 3. GrepAI Installation & Configuration
+- **Version:** v0.35.0
+- **Install:** Official install script (`curl -sSL https://raw.githubusercontent.com/yoanbernabeu/grepai/main/install.sh | sh`)
+- **Location:** `/usr/local/bin/grepai`
+- **Config:** `/home/guru/ClaudeTools/.grepai/config.yaml`
+- **Index stats:** 1,437 files / 20,945 chunks / 118.3 MB
+- **Chunk size:** 256 tokens (optimized from default 512, matching previous Windows setup)
+- **Watcher:** Running as background daemon (PID 2665677)
+- **Watcher log:** `/home/guru/.local/state/grepai/logs/grepai-worktree-37becac32343.log`
+
+**Search boost config applied:**
+- `credentials.md` — 1.5x boost
+- `directives.md` — 1.5x boost
+- `/session-logs/` — 1.4x boost
+- `/.claude/` — 1.3x boost
+- `.md` penalty removed (was 0.6x default, now neutral)
+
+**Verified working:** `grepai search "SSH credentials"` correctly ranked `credentials.md` first (score 1.08)
+
+#### 4. MCP Server Integration
+- **Config file:** `/home/guru/.claude/projects/-home-guru-ClaudeTools/settings.json`
+- **Server:** `grepai mcp-serve` with cwd `/home/guru/ClaudeTools`
+- **Requires:** Claude Code restart to load
+
+#### 5. deep-explore Agent
+- Created by `grepai agent-setup --with-subagent`
+- **File:** `.claude/agents/deep-explore.md`
+- Provides semantic search + call graph tracing via Bash commands to grepai CLI
+
+#### 6. claude.md Updates (Coordinator Directives)
+
+Added to `.claude/claude.md` (syncs to all stations via Gitea):
+
+**a) Delegation table:** Added `deep-explore` agent for semantic code search
+
+**b) Local AI (Ollama) section:**
+- Available models table
+- GrepAI usage guidance (when to use vs Grep/Glob, how to use via MCP/agent/CLI)
+- Ollama sub-task guidance (when to offload vs use Claude, API examples)
+
+**c) Ollama Output Review Policy — 4 impact tiers:**
+
+| Level | Review Required | Examples |
+|-------|----------------|----------|
+| Critical | ALWAYS review + verify against source | Auth/security code, credentials, DB migrations, production config, user-facing output |
+| High | Review for correctness, spot-check | API logic, business rules, infra scripts, client work |
+| Medium | Skim for obvious errors | Internal docs, session summaries, boilerplate |
+| Low | Trust without review | Classification, reformatting, placeholders |
+
+- Batch processing rule: review first 2-3 items before trusting the rest
+- Flag-to-user rule: if local model output is Critical and review is uncertain, explicitly tell user
+
+**d) Cross-platform fix:** SSH directive now labels the `C:\Windows\System32\OpenSSH\ssh.exe` path as Windows-only, so the rule also reads correctly on Linux
+
+### Problems Encountered & Solutions
+
+| Problem | Solution |
+|---------|----------|
+| `grepai index --force` command not found | v0.35.0 removed standalone `index` command — indexing is handled by `grepai watch` |
+| GrepAI watcher log directory missing | Created `/home/guru/.local/state/grepai/logs/` manually |
+| Both model pulls interrupted by wifi change | Ollama handles reconnection automatically — pulls resumed fine |
+
+### Files Created
+- `/home/guru/.claude/projects/-home-guru-ClaudeTools/settings.json` — MCP server config for GrepAI
+- `/home/guru/ClaudeTools/.grepai/config.yaml` — GrepAI config (customized)
+- `/home/guru/ClaudeTools/.claude/agents/deep-explore.md` — GrepAI exploration subagent
+
+### Files Modified
+- `/home/guru/ClaudeTools/.claude/claude.md` — Added Ollama section, review policy, delegation update, date bump
+
+### Key Commands Reference
+```bash
+# Ollama
+ollama list                             # Show installed models
+ollama run qwen3:14b                    # Interactive general chat
+ollama run codestral:22b                # Interactive code chat
+systemctl status ollama                 # Check service
+
+# Ollama API
+curl -s http://localhost:11434/api/generate -d '{"model":"qwen3:14b","prompt":"...","stream":false}' | jq -r '.response'
+curl -s http://localhost:11434/api/chat -d '{"model":"codestral:22b","messages":[{"role":"user","content":"..."}],"stream":false}' | jq -r '.message.content'
+
+# GrepAI
+grepai status                           # Index health
+grepai search "query" --json --compact  # Semantic search
+grepai watch --status                   # Watcher status
+grepai watch --stop                     # Stop watcher
+grepai watch --background               # Start watcher daemon
+grepai trace callers "FuncName"         # Call graph
+```
+
+### Pending/Incomplete
+1. **Restart Claude Code** — Required to load GrepAI MCP server
+2. **Verify MCP integration** — Test `grepai` tools work after restart
+3. **Commit and push** — `.claude/claude.md` changes need to sync to Gitea for other stations
+4. **GrepAI watcher auto-start** — Currently a backgrounded process, not a systemd service. Consider creating `~/.config/systemd/user/grepai-watcher.service` for persistence across reboots
+5. **Java 8 still default JRE** — Switch back if needed: `sudo archlinux-java set java-25-openjdk`