From 97f93dd6d79b744ac89636435324e9fb14898fb3 Mon Sep 17 00:00:00 2001
From: Mike Swanson
Date: Sat, 16 May 2026 16:54:20 -0700
Subject: [PATCH] docs: fix broken markdown tables in OLLAMA.md

The qwen3:8b routing update inserted footnote lines mid-table in both
the "What Ollama owns" and "When to Use Which Model" sections, splitting
each table in half so renderers treated the qwen3.6 rows as paragraph
text. Moved footnotes below the closing table row in both places.

Also updated the bottom "Rule of thumb" line: previously named qwen3:14b
with a "2x faster" claim that's now stale on DESKTOP-0O8A1RL where 8b is
the prose model. Generalized to "the per-machine prose model".

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 .claude/OLLAMA.md | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/.claude/OLLAMA.md b/.claude/OLLAMA.md
index 827ebf6..7b2cba1 100644
--- a/.claude/OLLAMA.md
+++ b/.claude/OLLAMA.md
@@ -112,8 +112,6 @@ This keeps Claude tokens focused on reasoning, decisions, and execution. Ollama
 | Client-facing notes and summaries | qwen3:14b / qwen3:8b* | Review for accuracy |
 | Agent phase handoff summaries (explore → plan, plan → implement) | qwen3:14b / qwen3:8b* | Review + include in agent brief |
 | Client email drafts | qwen3:14b / qwen3:8b* | Review for accuracy + tone before sending |
-
-*Use `qwen3:8b` on DESKTOP-0O8A1RL — 4.8x faster due to full VRAM fit. Use `qwen3:14b` everywhere else.
 | Ticket / issue classification (priority, type, category) | qwen3.6 | Review + apply label |
 | Diff summarization before commit | qwen3.6 | Review + use in commit message |
 | Error message categorization (transient / config / bug) | qwen3.6 | Review + act on classification |
@@ -124,6 +122,8 @@ This keeps Claude tokens focused on reasoning, decisions, and execution. Ollama
 | Code comments and docstrings | codestral:22b | Review before applying |
 | Refactor suggestions | codestral:22b | Review before applying |
 
+\* Use `qwen3:8b` on DESKTOP-0O8A1RL — 4.8x faster than 14b there due to full VRAM fit. Use `qwen3:14b` on all other machines.
+
 ### What Claude always owns (never Ollama)
 
 - Credentials, passwords, API keys — must be verbatim accurate
@@ -187,8 +187,6 @@ print('warm')
 | Summarize logs, diffs, incident notes (no length cap) | qwen3:8b* / qwen3:14b |
 | Agent phase handoff summaries | qwen3:8b* / qwen3:14b |
 | Client email drafts | qwen3:8b* / qwen3:14b |
-
-*On DESKTOP-0O8A1RL only — 4.8x faster (86 tok/s vs 18 tok/s). Use qwen3:14b on all other machines.
 | Classify bug type, severity, category, priority | qwen3.6 |
 | Extract structured data from text (JSON, fields) | qwen3.6 |
 | Diff summarization with strict format / fields | qwen3.6 |
@@ -200,7 +198,9 @@ print('warm')
 | Code comment / docstring generation | codestral:22b |
 | Refactor suggestions | codestral:22b |
 
-**Rule of thumb:** if the output is *prose someone will read*, use qwen3:14b (2x faster). If the output is *structured data something will parse* or *must obey a tight format*, use qwen3.6.
+\* On DESKTOP-0O8A1RL only — 4.8x faster (86 tok/s vs 18 tok/s). Use `qwen3:14b` on all other machines.
+
+**Rule of thumb:** if the output is *prose someone will read*, use the per-machine prose model (qwen3:8b on DESKTOP-0O8A1RL, qwen3:14b elsewhere). If the output is *structured data something will parse* or *must obey a tight format*, use qwen3.6.
 
 ## Review Policy
 