Commit Graph

8 Commits

Author SHA1 Message Date
4aadf16a9f feat: add qwen3:8b for DESKTOP-0O8A1RL, update Ollama routing
Benchmarked 2026-05-16 on DESKTOP-0O8A1RL (RTX 5070 Ti Laptop, 12 GB VRAM):
- qwen3:8b: 100% VRAM fit (10.9/10.9 GB) -> 74-86 tok/s
- qwen3:14b: 73% VRAM (11.3/15.6 GB split) -> 17-18 tok/s (4.8x slower)
- qwen3.6:  41% VRAM (11.3/27.5 GB split) -> 17-19 tok/s

qwen3:14b overflows the 12 GB of VRAM at runtime (the 9.3 GB GGUF occupies
15.6 GB once loaded). qwen3:8b fits entirely in VRAM and matches the
reference machine's speed.
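
The VRAM percentages above can be read as the share of the loaded model that sits in GPU memory. A minimal sketch of that arithmetic, assuming each "x/y GB" pair means "GB resident in VRAM / GB loaded in total"; the function and dictionary names are hypothetical and not part of the repository:

```python
def gpu_fraction(in_vram_gb: float, loaded_gb: float) -> float:
    """Return the share of the loaded model that resides in VRAM (0.0 to 1.0)."""
    if loaded_gb <= 0:
        raise ValueError("loaded_gb must be positive")
    return in_vram_gb / loaded_gb


# (GB in VRAM, GB loaded) copied from the benchmark lines in the commit message.
BENCHMARKS = {
    "qwen3:8b": (10.9, 10.9),
    "qwen3:14b": (11.3, 15.6),
    "qwen3.6": (11.3, 27.5),
}

if __name__ == "__main__":
    for model, (in_vram, loaded) in BENCHMARKS.items():
        print(f"{model}: {gpu_fraction(in_vram, loaded):.1%} of the model in VRAM")
```

Because the gigabyte figures in the message are already rounded, recomputing from them may differ by about a point from the percentages quoted above.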

Updated OLLAMA.md: added qwen3:8b to models table, per-machine routing
table, benchmark results. Updated CLAUDE.md model one-liner.
Routing: qwen3:8b for prose on DESKTOP-0O8A1RL, qwen3:14b everywhere else,
qwen3.6 for strict-format tasks on all machines.
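
The routing rule described above can be sketched as a small lookup. This is an illustration only, assuming a task is classified as either strict-format or prose before routing; `pick_model` and the constant names are hypothetical, and OLLAMA.md may express the rule differently:

```python
# Strict-format tasks use the same model on every machine.
STRICT_FORMAT_MODEL = "qwen3.6"

# Per-machine overrides for prose; machines not listed use the default.
PROSE_MODEL_BY_HOST = {"DESKTOP-0O8A1RL": "qwen3:8b"}
DEFAULT_PROSE_MODEL = "qwen3:14b"


def pick_model(hostname: str, strict_format: bool) -> str:
    """Choose an Ollama model tag from the machine name and the task type."""
    if strict_format:
        return STRICT_FORMAT_MODEL
    # Hostnames are compared case-insensitively (an assumption of this sketch).
    return PROSE_MODEL_BY_HOST.get(hostname.upper(), DEFAULT_PROSE_MODEL)
```

For example, `pick_model("HOWARD-HOME", strict_format=False)` falls through to the default prose model, while any host with `strict_format=True` gets qwen3.6.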

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-16 16:25:57 -07:00
2c5f10faaa Session log: qwen3.6 benchmark, route strict-format to 3.6
Benchmarked qwen3.6 (36B MoE) against qwen3:14b and qwen3:32b on 16
representative prompts. qwen3.6 scored 15/16, versus 11/16 for qwen3:14b
and 12/16 for qwen3:32b, winning every strict-format/adherence test
(multi-step rules, weekend-aware scheduling, prompt-injection
resistance, word-limit summary). A single reasoning regression was noted
for re-check at qwen3.7.

Updated .claude/OLLAMA.md (Models, Documentation Engine, and
When-to-Use tables) and .claude/CLAUDE.md one-line model summary to
route strict-format work to qwen3.6 and keep bulk prose on qwen3:14b
(2x faster). Also removed openclaw npm package + ~/.openclaw data dir
earlier in the session.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 16:03:07 -07:00
ee900fd103 docs: apply vix-inspired token efficiency optimizations
- CLAUDE.md: trim ~45 lines — compress Live State Tracking, Automatic
  Context Loading, File Placement, Ollama sections; add single-agent
  guidance for coupled explore→implement tasks
- CODING_GUIDELINES.md: add GrepAI-first rule with token cost rationale;
  add GuruRMM platform parity matrix and cross-platform coding standards
- OLLAMA.md: expand tier-0 scope to include diff summarization, error
  categorization, agent phase handoff summaries, client email drafts,
  ticket classification with priority

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 15:50:29 -07:00
4bec31e226 grepai: fix index staleness, mandate usage, document config for new machines
Index was dead since 2026-04-19 (watcher not running). Fixes:
- Watcher restarted; scheduled task registered for login persistence
- Removed .md 0.6x penalty — markdown is primary content in this repo
- Added session-logs/ 1.3x, .claude/ 1.2x, /clients/ 1.1x relevance bonuses
- CLAUDE.md: grepai_search is now the first step for any context lookup
- OLLAMA.md: documents config overrides + watcher setup for new machines
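
To illustrate what the relevance bonuses above do to search ranking, here is a minimal sketch that multiplies a base score by a per-path factor. It is not grepai's implementation: the substring matching, the stacking of multiple bonuses, the function name, and the example paths in the usage note are all assumptions made for illustration.

```python
# (path pattern, multiplier) pairs taken from the commit message.
BONUSES = [
    ("session-logs/", 1.3),
    (".claude/", 1.2),
    ("/clients/", 1.1),
]


def adjusted_score(path: str, base_score: float) -> float:
    """Apply every matching path bonus to a base relevance score."""
    # Normalize separators and add a leading slash so "/clients/" can match
    # a top-level directory as well as a nested one.
    normalized = "/" + path.replace("\\", "/").lstrip("/")
    score = base_score
    for pattern, factor in BONUSES:
        if pattern in normalized:
            score *= factor  # assumption: matching bonuses stack multiplicatively
    return score
```

With the 0.6x markdown penalty removed, a hypothetical `session-logs/notes.md` would be boosted by the 1.3x bonus alone, and a file outside the listed directories keeps its base score.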

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-24 07:42:01 -07:00
88bdc3d4c9 docs: establish Ollama as the documentation engine
Route all prose generation (session logs, commit messages, Syncro
comments, client notes, code docs) through Ollama qwen3:14b by default.
Claude reviews output and owns verbatim-accuracy sections (credentials,
IPs, command outputs). GrepAI context lookups keep the Ollama service
warm, eliminating the 30-50s cold-start in normal workflow.

Updates: OLLAMA.md (documentation engine scope + warm-start note),
CLAUDE.md (Ollama section), save.md (narrative drafting), checkpoint.md
(commit message body drafting).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-24 07:37:45 -07:00
7e2e3a5882 sync: auto-sync from HOWARD-HOME at 2026-04-23 06:21:23
Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-04-23 06:21:23
2026-04-23 06:21:24 -07:00
936ea49b33 fix: replace python3 with py/jq throughout scripts and docs
Windows Store python3 stub returns exit 49 instead of running Python.
Replace with: py (Windows launcher) for actual Python code, jq for
simple JSON extraction. Reorder fallback loops to try py first.
Add Bash(py:*) to settings.local.json allowlist.
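
The fallback order described above can be sketched as follows. The repository's scripts are presumably shell; this sketch expresses the same logic in Python for illustration. The candidate list after `py` and the `--version` probe are assumptions, and `find_python` is a hypothetical name:

```python
import shutil
import subprocess

# "py" comes first, as in the commit; the remaining order is an assumption.
CANDIDATES = ["py", "python", "python3"]


def find_python(candidates=CANDIDATES):
    """Return the path of the first candidate that actually runs, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path is None:
            continue  # not on PATH at all
        try:
            result = subprocess.run(
                [path, "--version"], capture_output=True, timeout=10
            )
        except (OSError, subprocess.TimeoutExpired):
            continue  # present but not runnable
        # A stub that exits non-zero (such as the exit 49 case described
        # above) is skipped instead of being treated as a working interpreter.
        if result.returncode == 0:
            return path
    return None
```

Checking the exit code rather than mere presence on PATH is the point of the sketch: the Windows Store stub exists on PATH but does not run Python.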

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-20 12:14:43 -07:00
056e36aeac refactor: optimize CLAUDE.md context footprint (-49%)
Extract Ollama docs and PROJECT_STATE locking protocol to on-demand
reference files. Trim Work Mode to detection table only. Remove verbose
anti-pattern examples and credential encryption details.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-20 12:09:17 -07:00