Commit Graph

9 Commits

Author SHA1 Message Date
97f93dd6d7 docs: fix broken markdown tables in OLLAMA.md
The qwen3:8b routing update inserted footnote lines mid-table in both
the "What Ollama owns" and "When to Use Which Model" sections, splitting
each table in half so renderers treated the qwen3.6 rows as paragraph
text. Moved footnotes below the closing table row in both places.

Also updated the bottom "Rule of thumb" line: previously named qwen3:14b
with a "2x faster" claim that's now stale on DESKTOP-0O8A1RL where 8b is
the prose model. Generalized to "the per-machine prose model".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 16:54:20 -07:00
4aadf16a9f feat: add qwen3:8b for DESKTOP-0O8A1RL, update Ollama routing
Benchmarked 2026-05-16 on DESKTOP-0O8A1RL (RTX 5070 Ti Laptop, 12 GB VRAM):
- qwen3:8b: 100% VRAM fit (10.9/10.9 GB) -> 74-86 tok/s
- qwen3:14b: 73% VRAM (11.3/15.6 GB split) -> 17-18 tok/s (4.8x slower)
- qwen3.6:  41% VRAM (11.3/27.5 GB split) -> 17-19 tok/s

qwen3:14b overflows 12 GB VRAM at runtime (9.3 GB GGUF = 15.6 GB loaded).
qwen3:8b fits entirely in VRAM and matches the reference machine speed.

Updated OLLAMA.md: added qwen3:8b to models table, per-machine routing
table, benchmark results. Updated CLAUDE.md model one-liner.
Routing: qwen3:8b for prose on DESKTOP-0O8A1RL, qwen3:14b everywhere else,
qwen3.6 for strict-format tasks on all machines.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-16 16:25:57 -07:00
2c5f10faaa Session log: qwen3.6 benchmark, route strict-format to 3.6
Benchmarked qwen3.6 (36B MoE) vs qwen3:14b and qwen3:32b on 16
representative prompts. qwen3.6 scored 15/16 vs 14b 11/16 and 32b
12/16, winning every strict-format/adherence test (multi-step rules,
weekend-aware scheduling, prompt-injection resistance, word-limit
summary). Single reasoning regression noted for re-check at qwen3.7.

Updated .claude/OLLAMA.md (Models, Documentation Engine, and
When-to-Use tables) and .claude/CLAUDE.md one-line model summary to
route strict-format work to qwen3.6 and keep bulk prose on qwen3:14b
(2x faster). Also removed openclaw npm package + ~/.openclaw data dir
earlier in the session.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 16:03:07 -07:00
ee900fd103 docs: apply vix-inspired token efficiency optimizations
- CLAUDE.md: trim ~45 lines — compress Live State Tracking, Automatic
  Context Loading, File Placement, Ollama sections; add single-agent
  guidance for coupled explore→implement tasks
- CODING_GUIDELINES.md: add GrepAI-first rule with token cost rationale;
  add GuruRMM platform parity matrix and cross-platform coding standards
- OLLAMA.md: expand tier-0 scope to include diff summarization, error
  categorization, agent phase handoff summaries, client email drafts,
  ticket classification with priority

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 15:50:29 -07:00
4bec31e226 grepai: fix index staleness, mandate usage, document config for new machines
Index was dead since 2026-04-19 (watcher not running). Fixes:
- Watcher restarted; scheduled task registered for login persistence
- Removed .md 0.6x penalty — markdown is primary content in this repo
- Added session-logs/ 1.3x, .claude/ 1.2x, /clients/ 1.1x relevance bonuses
- CLAUDE.md: grepai_search is now the first step for any context lookup
- OLLAMA.md: documents config overrides + watcher setup for new machines

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-24 07:42:01 -07:00
88bdc3d4c9 docs: establish Ollama as the documentation engine
Route all prose generation (session logs, commit messages, Syncro
comments, client notes, code docs) through Ollama qwen3:14b by default.
Claude reviews output and owns verbatim-accuracy sections (credentials,
IPs, command outputs). GrepAI context lookups keep the Ollama service
warm, eliminating the 30-50s cold-start in normal workflow.

Updates: OLLAMA.md (documentation engine scope + warm-start note),
CLAUDE.md (Ollama section), save.md (narrative drafting), checkpoint.md
(commit message body drafting).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-24 07:37:45 -07:00
7e2e3a5882 sync: auto-sync from HOWARD-HOME at 2026-04-23 06:21:23
Author: Howard Enos
Machine: HOWARD-HOME
Timestamp: 2026-04-23 06:21:23
2026-04-23 06:21:24 -07:00
936ea49b33 fix: replace python3 with py/jq throughout scripts and docs
Windows Store python3 stub returns exit 49 instead of running Python.
Replace with: py (Windows launcher) for actual Python code, jq for
simple JSON extraction. Reorder fallback loops to try py first.
Add Bash(py:*) to settings.local.json allowlist.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-20 12:14:43 -07:00
056e36aeac refactor: optimize CLAUDE.md context footprint (-49%)
Extract Ollama docs and PROJECT_STATE locking protocol to on-demand
reference files. Trim Work Mode to detection table only. Remove verbose
anti-pattern examples and credential encryption details.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-20 12:09:17 -07:00