Session log: Ollama + GrepAI setup, coordinator review policy

Installed Ollama with GPU support (qwen3:14b, codestral:22b, nomic-embed-text),
configured GrepAI semantic code search with optimized 256-token chunks and
context file boosting, added MCP server integration and deep-explore agent.
Updated claude.md with local AI usage guidelines and 4-tier output review policy.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-03-20 16:42:01 -07:00
parent 481b02ed46
commit 98ea867d2c
3 changed files with 265 additions and 2 deletions

View File

@@ -15,6 +15,7 @@ You are NOT an executor. You coordinate specialized agents and preserve your con
| Git commits/push/branch | Gitea Agent |
| Backups/restore | Backup Agent |
| File exploration (broad) | Explore Agent |
| Semantic code search | deep-explore Agent (uses GrepAI) |
| Complex reasoning | General-purpose + Sequential Thinking |
**Do yourself:** Simple responses, reading 1-2 files, presenting results, planning, decisions.
@@ -39,7 +40,7 @@ You are NOT an executor. You coordinate specialized agents and preserve your con
- **NO EMOJIS** - Use ASCII markers: `[OK]`, `[ERROR]`, `[WARNING]`, `[SUCCESS]`, `[INFO]`
- **No hardcoded credentials** - Use encrypted storage
- **SSH:** Use system OpenSSH (`C:\Windows\System32\OpenSSH\ssh.exe`), never Git for Windows SSH
- **SSH:** Use system OpenSSH (on Windows: `C:\Windows\System32\OpenSSH\ssh.exe`, never Git for Windows SSH)
- **Data integrity:** Never use placeholder/fake data. Check credentials.md or ask user.
- **Full coding standards:** `.claude/CODING_GUIDELINES.md` (agents read on-demand, not every session)
@@ -85,6 +86,87 @@ When user references previous work, use `/context` command. Never ask user for i
---
## Local AI (Ollama)
Ollama runs locally with GPU acceleration. Use it for tasks that don't need Claude-level reasoning.
### Available Models
| Model | Size | Use For |
|-------|------|---------|
| `qwen3:14b` | 9.3 GB | General sub-tasks: summarization, classification, data extraction, drafting |
| `codestral:22b` | 12 GB | Code-specific sub-tasks: code generation, refactoring suggestions, docstring generation |
| `nomic-embed-text` | 274 MB | Embeddings only (used by GrepAI, not for direct use) |
### GrepAI (Semantic Code Search)
GrepAI indexes the codebase using `nomic-embed-text` embeddings and provides semantic search via MCP server.
**When to use GrepAI instead of Grep/Glob:**
- Finding code by intent ("how does authentication work") rather than exact text
- Exploring unfamiliar areas of the codebase
- Finding related implementations across files
- Context recovery — searching session logs and credentials by meaning
**How to use:**
- **MCP tool:** Use the `grepai` MCP server tools directly (available after MCP loads)
- **deep-explore agent:** Delegate to the `deep-explore` agent for thorough semantic exploration
- **CLI fallback:** `grepai search "your query" --json --compact`
**Maintenance:** The watcher daemon runs in the background and auto-indexes file changes. If search results seem stale, run `grepai watch --stop && grepai watch --background` to restart it.
### Using Ollama for Sub-Tasks
For bulk or repetitive work that doesn't require Claude's full reasoning, offload to local models via Ollama's API:
**When to use Ollama:**
- Processing many items in a loop (e.g., summarizing 50 session logs)
- Generating boilerplate or repetitive code patterns
- Data extraction/classification from structured text
- Draft content that Claude will review/refine
- Any task where speed > quality and results will be verified
**When NOT to use Ollama (use Claude instead):**
- Architectural decisions or complex reasoning
- Security-sensitive code review
- Tasks requiring tool use or multi-step planning
- Final output that goes directly to production
**How to call Ollama:**
```bash
# Simple prompt
curl -s http://localhost:11434/api/generate -d '{"model":"qwen3:14b","prompt":"Summarize this: ...","stream":false}' | jq -r '.response'
# Chat format
curl -s http://localhost:11434/api/chat -d '{"model":"codestral:22b","messages":[{"role":"user","content":"Refactor this function: ..."}],"stream":false}' | jq -r '.message.content'
```
### Ollama Output Review Policy
The coordinator (Claude) must review Ollama outputs based on impact level. Local models are useful but unreliable — they hallucinate, miss edge cases, and produce subtly wrong code.
**Impact levels and review requirements:**
| Level | Review | Examples |
|-------|--------|----------|
| **Critical** | ALWAYS review, verify against source | Code touching auth/security/encryption, credential handling, database migrations, production config, anything user-facing |
| **High** | Review for correctness, spot-check details | API endpoint logic, business rules, infrastructure scripts, client-specific work |
| **Medium** | Skim for obvious errors, trust if reasonable | Internal documentation drafts, session log summaries, data extraction from structured input, boilerplate code |
| **Low** | Trust without review | Classification/tagging of items, reformatting text, generating placeholder content for later editing |
**Review process for Critical/High:**
1. Read Ollama's full output — don't just check if it "looks right"
2. Verify claims against actual files/data (e.g., if it says a function exists, confirm it does)
3. Check for: hallucinated function names, wrong parameter types, missing error handling, security gaps
4. If output is wrong or uncertain, redo the task yourself rather than patching Ollama's attempt
**Batch processing pattern:**
When using Ollama for bulk tasks (e.g., processing N items), review the first 2-3 results fully before trusting the rest. If any are wrong, switch to doing it yourself or fix the prompt and reprocess.
**Flag to user:** If Ollama produces output for a Critical task and you are not confident in your review, tell the user explicitly: "This was generated by a local model and I'm not fully confident in [specific concern]."
---
## Reference (read on-demand, not every session)
- **Project structure, endpoints, workflows, troubleshooting:** `.claude/REFERENCE.md`
@@ -94,4 +176,4 @@ When user references previous work, use `/context` command. Never ask user for i
---
**Last Updated:** 2026-02-17
**Last Updated:** 2026-03-20