docs: apply vix-inspired token efficiency optimizations

- CLAUDE.md: trim ~45 lines — compress Live State Tracking, Automatic
  Context Loading, File Placement, Ollama sections; add single-agent
  guidance for coupled explore→implement tasks
- CODING_GUIDELINES.md: add GrepAI-first rule with token cost rationale;
  add GuruRMM platform parity matrix and cross-platform coding standards
- OLLAMA.md: expand tier-0 scope to include diff summarization, error
  categorization, agent phase handoff summaries, client email drafts,
  ticket classification with priority

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 15:50:29 -07:00
parent 31088cb8de
commit ee900fd103
3 changed files with 147 additions and 97 deletions


@@ -92,6 +92,8 @@ You are NOT an executor. You coordinate specialized agents and preserve your con
**DO NOT** query databases directly. **DO NOT** write production code. **DO NOT** run tests. **DO NOT** commit/push.
**Single-agent for coupled tasks:** For explore → implement or explore → implement → review flows where the context is the same throughout, use one agent across all phases rather than spawning three. Each agent boundary is a cache miss and a context-handoff cost. Spawn separate agents only when tasks are genuinely independent or run in parallel.
### Model Routing (Complexity-Based)
| Tier | Model | When |
@@ -109,34 +111,16 @@ Pass `model: "haiku"` or `model: "opus"` explicitly. Omit for Tier 2. Tier 0 is
## Automatic Context Loading (CRITICAL)
**BEFORE responding to the first message or when switching projects, AUTOMATICALLY load context:** Load context **before responding** when any trigger fires. Never ask for info that's already in CONTEXT.md.
### Trigger 1: Project Keywords Detected | Trigger | Action |
If user mentions **GuruRMM**, **Dataforth**, **tunnel**, **VASLOG**, **AD2**, **testdatadb**, etc: |---------|--------|
1. Read the matching project CONTEXT.md: | GuruRMM / Dataforth / project keywords | Read `projects/<project>/CONTEXT.md`, query coord API status + components |
- GuruRMM keywords → `projects/msp-tools/guru-rmm/CONTEXT.md` | "continue", "resume", "back to", "finish" | Read project CONTEXT.md, check coord API for locks + unread messages |
- Dataforth keywords → `projects/dataforth-dos/CONTEXT.md` | Servers, IPs, credentials, deploy questions | Read CONTEXT.md — answer from it, never ask |
- General → `CONTEXT.md` (root) | Uncertainty >5% about infra or recent work | Read CONTEXT.md before asking the user |
2. Query the coordination API for current state: `GET http://172.16.3.30:8001/api/coord/status` (no auth needed for status) and `GET /api/coord/components?project_key=<key>`.
3. THEN respond with full context.
### Trigger 2: Continuation/Resume Words CONTEXT.md locations: `projects/msp-tools/guru-rmm/CONTEXT.md`, `projects/dataforth-dos/CONTEXT.md`, `CONTEXT.md` (root).
If user says "continue", "let's work on", "back to", "resume", "finish":
1. Detect project from message, read project CONTEXT.md.
2. Query coordination API: `GET /api/coord/status` for active locks and in-progress workflows; `GET /api/coord/messages/unread-count?session_id=<this-session>` for pending messages.
3. Check for unread messages and display them before proceeding.
### Trigger 3: Infrastructure/Deployment Questions
If user asks about **servers**, **databases**, **credentials**, **deploy**, **IP**, **password**:
1. Check current directory for CONTEXT.md, then `projects/*/CONTEXT.md`.
2. Answer from CONTEXT.md — never ask for info that's already there.
### Trigger 4: Uncertainty >5%
If you're <95% certain about infrastructure, recent work, or next steps: read CONTEXT.md before asking the user.
### Anti-Pattern
Never ask "What did we do last time?" or "What's the server IP?" — read the CONTEXT.md first. If it's not there, then ask.
---
@@ -167,53 +151,34 @@ Never ask "What did we do last time?" or "What's the server IP?" — read the CO
## Live State Tracking (ALL Projects)
**The ClaudeTools coordination API is the live source of truth for ALL projects.** Every agent session MUST use it — not PROJECT_STATE.md files (those are archived). **Coord API is the live source of truth.** API base: `http://172.16.3.30:8001/api/coord` (no auth).
API base: `http://172.16.3.30:8001/api/coord` | No auth required for coord endpoints.
### Session Start Protocol (MANDATORY)
Run these at the beginning of every session:
### Session start
```bash
# 1. Check for messages addressed to this session or broadcast
curl -s "http://172.16.3.30:8001/api/coord/messages?to_session=<SESSION_ID>&unread_only=true"
# 2. Check overall live status
curl -s "http://172.16.3.30:8001/api/coord/status"
# 3. Check active locks on any project you plan to touch
curl -s "http://172.16.3.30:8001/api/coord/locks?project_key=<KEY>"
```
Display unread messages before any work. Mark read: `PUT /api/coord/messages/<id>/read`
Display any unread messages prominently before any other work. Mark them read: ### Before significant work — claim a lock
```bash
curl -s -X PUT "http://172.16.3.30:8001/api/coord/messages/<id>/read"
```
### Before Significant Work (MANDATORY)
Claim a lock before editing code, running migrations, deploying, or touching shared resources:
```bash ```bash
curl -s -X POST http://172.16.3.30:8001/api/coord/locks \ curl -s -X POST http://172.16.3.30:8001/api/coord/locks \
-H "Content-Type: application/json" \ -H "Content-Type: application/json" \
-d '{"project_key":"gururmm","session_id":"DESKTOP-0O8A1RL/claude-main","resource":"server/src","description":"Adding credential endpoints","ttl_hours":2}' -d '{"project_key":"gururmm","session_id":"DESKTOP-0O8A1RL/claude-main","resource":"server/src","description":"...","ttl_hours":2}'
# Save the returned "id" for release
``` ```
### After Work Completes (or Fails) — MANDATORY ### After work — release lock + update component
```bash ```bash
# Release lock curl -s -X DELETE "http://172.16.3.30:8001/api/coord/locks/<id>?session_id=<SESSION_ID>"
curl -s -X DELETE "http://172.16.3.30:8001/api/coord/locks/<lock_id>?session_id=<SESSION_ID>"
# Update component state
curl -s -X PUT "http://172.16.3.30:8001/api/coord/components/gururmm/server" \ curl -s -X PUT "http://172.16.3.30:8001/api/coord/components/gururmm/server" \
-H "Content-Type: application/json" \ -H "Content-Type: application/json" \
-d '{"state":"deployed","version":"0.3.0","notes":"Credential store live","updated_by":"DESKTOP-0O8A1RL/claude-main"}' -d '{"state":"deployed","version":"0.3.0","notes":"...","updated_by":"DESKTOP-0O8A1RL/claude-main"}'
``` ```
### Project Keys and Components to Track **Softfail:** If API unreachable, continue work and log failed calls to `.claude/coord-queue.jsonl`. Drain on next `/sync`.
### Project keys
| project_key | Components | States |
|-------------|------------|--------|
@@ -222,32 +187,7 @@ curl -s -X PUT "http://172.16.3.30:8001/api/coord/components/gururmm/server" \
| `dataforth-dos` | `app`, `db` | `active`, `idle`, `degraded` |
| `clients/<name>` | `(free-form)` | `(free-form)` |
### Softfail When Coordination API Is Unavailable Full protocol + inter-session messaging: `.claude/COORDINATION_PROTOCOL.md`
If the coord API is unreachable (connection refused, timeout, or 5xx):
1. **Do not block work.** Continue with the task.
2. Log the failed call to `.claude/coord-queue.jsonl` (one JSON object per line):
```json
{"ts":"2026-05-12T15:30:00Z","method":"PUT","path":"/api/coord/components/gururmm/server","body":{...}}
```
3. On the next session start or `/sync`, drain the queue:
```bash
# For each line in coord-queue.jsonl, replay the call, then remove the file if all succeed
```
If coord API returns 503 with `Retry-After`, wait that many seconds and retry once before queuing locally.
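The queue-and-drain steps above could be sketched roughly as follows. This is a minimal illustration using only the Rust standard library, not the project's actual implementation: the function names `queue_failed_call` and `drain_queue` are hypothetical, the caller is assumed to pass an already-serialized JSON object, and rewriting the lines that still fail (instead of keeping the whole file) is an assumption added for the sketch.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

/// Append one already-serialized JSON object as a single line of the queue file.
/// (Hypothetical helper; the caller builds the JSON.)
fn queue_failed_call(queue: &Path, json_line: &str) -> io::Result<()> {
    let mut file = OpenOptions::new().create(true).append(true).open(queue)?;
    writeln!(file, "{}", json_line.trim_end())
}

/// Replay every queued line. `replay` returns true when the call succeeded.
/// Lines that still fail are written back; the file is removed once none remain.
/// Returns the number of lines left in the queue.
fn drain_queue<F>(queue: &Path, mut replay: F) -> io::Result<usize>
where
    F: FnMut(&str) -> bool,
{
    let contents = match fs::read_to_string(queue) {
        Ok(text) => text,
        // No queue file means nothing to drain.
        Err(err) if err.kind() == io::ErrorKind::NotFound => return Ok(0),
        Err(err) => return Err(err),
    };

    let mut remaining: Vec<&str> = Vec::new();
    for line in contents.lines().filter(|l| !l.trim().is_empty()) {
        if !replay(line) {
            remaining.push(line);
        }
    }

    if remaining.is_empty() {
        fs::remove_file(queue)?;
    } else {
        fs::write(queue, remaining.join("\n") + "\n")?;
    }
    Ok(remaining.len())
}
```

In this sketch the `replay` closure is where an HTTP client would reissue the stored method, path, and body; keeping it as a closure leaves the queue logic testable without a live coord API.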
### Inter-Session Messages
Send messages to specific sessions or broadcast to a project:
```bash
curl -s -X POST http://172.16.3.30:8001/api/coord/messages \
-H "Content-Type: application/json" \
-d '{"from_session":"DESKTOP-0O8A1RL/claude-main","to_session":"HOWARD-HOME/claude-main","project_key":"gururmm","subject":"macOS build ready","body":"build-agents.sh marked TODO-MACOS."}'
# Omit to_session for a broadcast to everyone watching the project
```
Full protocol reference: `.claude/COORDINATION_PROTOCOL.md`
---
@@ -313,25 +253,24 @@ Vault structure: `infrastructure/`, `clients/`, `services/`, `projects/`, `msp-t
## File Placement
- **Dataforth DOS work** → `projects/dataforth-dos/` - GuruRMM work → `projects/msp-tools/guru-rmm/` (submodule, stale reference copy of `azcomputerguru/gururmm`)
- **ClaudeTools API code** → `api/`, `migrations/` - GuruRMM session logs → root `session-logs/` (NOT the submodule)
- **GuruRMM work** → `projects/msp-tools/guru-rmm/` (code reference only — submodule, stale copy of `azcomputerguru/gururmm`) - Client work → `clients/[client-name]/`
- **GuruRMM session logs** → `session-logs/` (root, in claudetools — NOT committed to the gururmm submodule) - Session logs → project/client `session-logs/` subfolder; general work → root `session-logs/`
- **Client work** → `clients/[client-name]/` - Full guide: `.claude/FILE_PLACEMENT_GUIDE.md`
- **Session logs** → project or client `session-logs/` subfolder; general → root `session-logs/`
- **Full guide:** `.claude/FILE_PLACEMENT_GUIDE.md`
---
## Local AI (Ollama)
Tier 0 — **Ollama is the documentation engine.** Route prose generation through it: commit messages, ticket comments, client notes, code docs. Claude reviews output, owns credentials/facts/execution. Session log narratives are written directly by Claude (Ollama too slow for /save). Tier 0 — **Ollama is the documentation and classification engine.** Route prose, summaries, and classification through it; Claude reviews before writing or posting.
- **DESKTOP-0O8A1RL:** `http://localhost:11434` | Machine | Endpoint |
- **Other machines:** `http://100.92.127.64:11434` (Tailscale required) |---------|----------|
- **Models:** `qwen3:14b` (all documentation/prose), `codestral:22b` (code suggestions — always review) | DESKTOP-0O8A1RL | `http://localhost:11434` |
- **Warm-start:** GrepAI keeps the Ollama service running; qwen3 VRAM swap is ~5s worst case, not 50s | Other | `http://100.92.127.64:11434` (Tailscale) |
- **Full reference:** `.claude/OLLAMA.md` (documentation engine scope, model selection, review policy)
Models: `qwen3:14b` (docs, prose, classification, summarization), `codestral:22b` (code suggestions — always review). Full reference: `.claude/OLLAMA.md`
### GrepAI (Semantic Code Search)


@@ -65,6 +65,23 @@ powershell.exe -Command '$x = 5; Write-Host $x'
---
## Context Lookup — GrepAI First
Before reading any file for context, search with GrepAI or Grep. Only open a file when you need its full content for editing or line-by-line review.
| Goal | Tool |
|------|------|
| Find where a function is defined | `grepai_search` or `Grep` |
| Understand how a feature works | `grepai_search` |
| Find all callers of a function | `grepai_trace_callers` |
| Full file content needed (edit, review) | `Read` |
| Recent changes | `git log`, then `Read` specific file |
Reading a 500-line file to find one function costs ~3000 tokens. A targeted search costs ~100.
Never open a large file to scan for context. Search first, read only if the search is insufficient.
---
## Security
- Never hardcode credentials -- use SOPS vault or environment variables
@@ -104,4 +121,89 @@ All scripts and tools use ASCII status markers:
---
**Last Updated:** 2026-05-12 ## GuruRMM Agent — Platform Parity
All agent features that are not inherently platform-specific must ship on Windows, Linux, and macOS.
A feature that silently no-ops on one platform is a gap, not a cross-platform implementation.
### The rule
> If you add or change a feature in the agent and the change is not blocked by OS-level APIs,
> you must implement or stub it on all three platforms in the same PR.
> If a real implementation is not feasible, add a `// TODO(platform): <os> — <reason>` comment
> and open a tracking item.
### cfg gating — choose the right target
| Condition | Attribute | When to use |
|-----------|-----------|-------------|
| Windows only | `#[cfg(windows)]` | Windows API (Win32, WMI, SCM, OpenSSH registry) |
| Linux + macOS | `#[cfg(unix)]` | POSIX: nix crate, signals, sockets |
| Linux only | `#[cfg(target_os = "linux")]` | `/sys/class/thermal`, systemd, procfs, D-Bus |
| macOS only | `#[cfg(target_os = "macos")]` | CoreFoundation, IOKit, launchd, NSStatusBar |
| Build flag | `#[cfg(feature = "native-service")]` | Service harness (Windows only in Cargo.toml) |
Never use `#[cfg(not(windows))]` as a proxy for "Linux + macOS works the same" without verifying
the macOS codepath. Linux and macOS diverge on `/sys`, D-Bus, and GUI IPC.
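As an illustration of the gating table above, here is a small sketch of per-OS variants of one function plus a shared `cfg(unix)` variant. The function names `service_manager` and `default_shell` are hypothetical and do not come from the agent codebase; the returned strings only echo mechanisms named elsewhere in this guide.

```rust
// Hypothetical helper names; the agent's real functions may differ.

/// Name of the OS-native supervisor that restarts the agent.
#[cfg(windows)]
fn service_manager() -> &'static str {
    "Service Control Manager"
}

#[cfg(target_os = "linux")]
fn service_manager() -> &'static str {
    "systemd"
}

#[cfg(target_os = "macos")]
fn service_manager() -> &'static str {
    "launchd"
}

// Explicit fallback so other targets still compile instead of failing
// with a missing-function error.
#[cfg(not(any(windows, target_os = "linux", target_os = "macos")))]
fn service_manager() -> &'static str {
    "unsupported"
}

/// Shared POSIX variant: appropriate only when the behavior really is the
/// same on Linux and macOS.
#[cfg(unix)]
fn default_shell() -> &'static str {
    "sh"
}

#[cfg(windows)]
fn default_shell() -> &'static str {
    "cmd"
}

#[cfg(not(any(unix, windows)))]
fn default_shell() -> &'static str {
    "none"
}
```

Writing one variant per `target_os` for anything that diverges, and reserving `cfg(unix)` for genuinely shared behavior, keeps a macOS gap visible at compile time rather than hiding it behind `cfg(not(windows))`.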
### Current parity matrix (as of 2026-05-15)
| Feature | Windows | Linux | macOS |
|---------|---------|-------|-------|
| CPU / memory / disk / network metrics | [OK] | [OK] | [OK] |
| Temperature via sysinfo | [OK] fallback | [WARN] empty if no hwmon | [WARN] empty if no sensors |
| Temperature via LibreHardwareMonitor | [OK] primary | N/A | N/A |
| Temperature via /sys/class/thermal | N/A | [GAP] not implemented | N/A |
| User detection (logged-in user) | [OK] | [OK] nix crate | [OK] nix crate |
| User idle time | [OK] GetLastInputInfo | [GAP] returns None | [GAP] returns None |
| IPC / tray | [OK] named pipe + WinTray | [GAP] stub no-op | [GAP] stub no-op |
| Watchdog (process monitor) | [OK] native-service | [GAP] stub no-op | [GAP] stub no-op |
| Script execution | [OK] cmd / PowerShell | [OK] bash / sh | [OK] bash / sh |
| Hardware inventory | [OK] WMI | [OK] /proc + lshw | [OK] system_profiler |
| Auto-updater | [OK] full | [OK] simpler | [OK] simpler |
| Checks (AV, updates, firewall) | [OK] full | [WARN] partial stub | [WARN] partial stub |
| Network discovery | [OK] | [OK] | [OK] |
### Known gaps — priority order
**1. Linux temperature collection** (`agent/src/metrics/mod.rs`)
- sysinfo `Components` returns empty on most Linux systems (requires kernel hwmon driver exposure).
- Correct approach: read `/sys/class/thermal/thermal_zone*/temp` directly (present on most Linux hosts; may be absent in containers and some VMs, so handle the empty case).
- Pattern:
```rust
#[cfg(target_os = "linux")]
fn collect_temps_linux() -> (Option<f32>, Option<f32>, Vec<TemperatureReading>) {
// read /sys/class/thermal/thermal_zone*/temp
// parse millidegrees, classify by type label in /sys/class/thermal/thermal_zone*/type
}
```
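One possible shape for the sysfs read is sketched below, using only the standard library. It is an assumption-laden illustration: `parse_millidegrees` and `read_thermal_zones` are hypothetical names, the return type is a plain list of `(type label, degrees Celsius)` pairs rather than the agent's `TemperatureReading`, and the base directory is a parameter so the logic can be exercised against a temporary directory. In the agent it would sit behind `#[cfg(target_os = "linux")]` and be called with `/sys/class/thermal`.

```rust
use std::fs;
use std::path::Path;

/// Convert a sysfs thermal reading (millidegrees Celsius) to degrees Celsius.
fn parse_millidegrees(raw: &str) -> Option<f32> {
    raw.trim().parse::<i64>().ok().map(|m| m as f32 / 1000.0)
}

/// Read every `thermal_zone*` directory under `base` and return
/// (type label, degrees Celsius) pairs, sorted by label.
/// Zones whose `temp` file is missing or unparsable are skipped.
fn read_thermal_zones(base: &Path) -> Vec<(String, f32)> {
    let mut readings = Vec::new();
    let entries = match fs::read_dir(base) {
        Ok(entries) => entries,
        // No thermal sysfs tree: return an empty list instead of failing.
        Err(_) => return readings,
    };

    for entry in entries.flatten() {
        let name = entry.file_name();
        if !name.to_string_lossy().starts_with("thermal_zone") {
            continue;
        }
        let zone = entry.path();
        let temp = fs::read_to_string(zone.join("temp"))
            .ok()
            .and_then(|raw| parse_millidegrees(&raw));
        let label = fs::read_to_string(zone.join("type"))
            .map(|s| s.trim().to_string())
            .unwrap_or_else(|_| "unknown".to_string());
        if let Some(celsius) = temp {
            readings.push((label, celsius));
        }
    }

    readings.sort_by(|a, b| a.0.cmp(&b.0));
    readings
}
```

Classifying zones into CPU or GPU readings by their `type` label would build on the returned pairs; that mapping is left out here because the label values vary by hardware.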
**2. Linux / macOS user idle time** (`agent/src/metrics/mod.rs` — `get_user_idle_time()`)
- Linux: use X11 `XScreenSaverQueryInfo` (display sessions) or parse `/proc/interrupts` delta (headless).
- macOS: use `CGEventSourceSecondsSinceLastEventType` (CoreGraphics, part of the standard macOS frameworks).
- Stub is acceptable short-term; mark with `// TODO(platform): linux/macos idle time`.
**3. Watchdog on Linux / macOS** (`agent/src/watchdog/`)
- Windows: Windows Service Control Manager restarts the agent.
- Linux: systemd `Restart=on-failure` in the unit file is the correct equivalent — no in-process watchdog needed.
- macOS: launchd `KeepAlive` key in the plist.
- Document the OS-native mechanism in `build-agents.sh` / installer rather than porting the Rust watchdog.
**4. Checks on Linux / macOS** (`agent/src/checks.rs`)
- Windows-specific checks (Windows Update pending, Windows Defender status, Windows Firewall) have no
direct equivalents; that is expected.
- Cross-platform checks (disk SMART, certificate expiry, open ports) should run on all platforms.
- Add `// TODO(platform): linux/macos — <check name>` for each unimplemented cross-platform check.
### Cargo.toml dependency discipline
- Platform-specific crates go in `[target.'cfg(...)'.dependencies]`, never in `[dependencies]`.
- Keep `lhm` (LibreHardwareMonitor) and `windows-service` under `cfg(windows)`.
- Keep `nix` under `cfg(unix)`.
- When adding a new crate, verify it compiles on all three targets before merging. Use the build server
  for Windows; CI covers Linux. For macOS, cross-compile with `--target aarch64-apple-darwin` on Linux
  (requires the `osxcross` toolchain; see build-agents.sh TODO-MACOS).
---
**Last Updated:** 2026-05-15


@@ -85,6 +85,11 @@ This keeps Claude tokens focused on reasoning, decisions, and execution. Ollama
| Syncro comment bodies + billing descriptions | qwen3:14b | Review checklist + post via API |
| Ticket initial issue / description text | qwen3:14b | Review + post |
| Client-facing notes and summaries | qwen3:14b | Review for accuracy |
| Ticket / issue classification (priority, type, category) | qwen3:14b | Review + apply label |
| Diff summarization before commit | qwen3:14b | Review + use in commit message |
| Error message categorization (transient / config / bug) | qwen3:14b | Review + act on classification |
| Agent phase handoff summaries (explore → plan, plan → implement) | qwen3:14b | Review + include in agent brief |
| Client email drafts | qwen3:14b | Review for accuracy + tone before sending |
| Code comments and docstrings | codestral:22b | Review before applying |
| Refactor suggestions | codestral:22b | Review before applying |
@@ -149,8 +154,12 @@ print('warm')
| Commit message body | qwen3:14b |
| Ticket / client comment drafting | qwen3:14b |
| Summarize logs, diffs, incident notes | qwen3:14b |
| Classify bug type, severity, category | qwen3:14b | | Classify bug type, severity, category, priority | qwen3:14b |
| Extract structured data from text | qwen3:14b |
| Diff summarization before commit | qwen3:14b |
| Error categorization (transient / config / bug) | qwen3:14b |
| Agent phase handoff summaries | qwen3:14b |
| Client email drafts | qwen3:14b |
| Code comment / docstring generation | codestral:22b |
| Refactor suggestions | codestral:22b |