memory: record GuruRMM log-analysis cutover to Claude Haiku (root cause + deploy shape)

This commit is contained in:
2026-06-12 07:16:41 -07:00
parent 9587e91b15
commit 7f06e47f09
2 changed files with 43 additions and 1 deletions

View File

@@ -133,3 +133,4 @@
- [Use `python` not `python3` on GURU-5070](python3-shim-use-python.md) — `python3` in Git bash hits the flaky MS Store shim; real interpreters are `python` (3.12) / `py` (3.14). coord.py + wiki-compile work via `python`; the coord lock IS claimable here - [Use `python` not `python3` on GURU-5070](python3-shim-use-python.md) — `python3` in Git bash hits the flaky MS Store shim; real interpreters are `python` (3.12) / `py` (3.14). coord.py + wiki-compile work via `python`; the coord lock IS claimable here
- [Beast = primary GuruRMM Windows build host](gururmm-beast-windows-build-host.md) — GURU-BEAST-ROG (i9), reached from .30 via Tailscale-on-.30 at 100.101.122.4 as guru; Pluto is the fallback (`attempt_build beast || attempt_build pluto`). WiX must be 4.x (v6+ = OSMF); Beast NuGet needed nuget.org added - [Beast = primary GuruRMM Windows build host](gururmm-beast-windows-build-host.md) — GURU-BEAST-ROG (i9), reached from .30 via Tailscale-on-.30 at 100.101.122.4 as guru; Pluto is the fallback (`attempt_build beast || attempt_build pluto`). WiX must be 4.x (v6+ = OSMF); Beast NuGet needed nuget.org added
- [GuruRMM command_type gotcha](reference_gururmm_command_type.md) — only shell/powershell/python/script/claude_task (+cmd alias); unknown type silently dropped, looks like a black-hole - [GuruRMM command_type gotcha](reference_gururmm_command_type.md) — only shell/powershell/python/script/claude_task (+cmd alias); unknown type silently dropped, looks like a black-hole
- [GuruRMM log analysis -> Claude Haiku](gururmm-log-analysis-claude-cutover.md) — cut over from Ollama-on-Beast (timed out on fleet-sized prompts; "unreachable" was a mislabeled 120s timeout) to Anthropic API Haiku 4.5 w/ structured outputs; key at vault `projects/gururmm/anthropic-api`; ZDR pending; deploy needs root on .30 (.env + restart)

View File

@@ -0,0 +1,41 @@
---
name: gururmm-log-analysis-claude-cutover
description: GuruRMM log analysis cut over from Ollama-on-Beast to Claude Haiku 4.5; why, and the deploy shape
metadata:
type: project
---
GuruRMM server log analysis (`server/src/api/logs.rs`, `analyze_logs_with_*`) was
cut over from **Ollama (qwen3:14b on Beast, `100.101.122.4:11434`)** to the
**Anthropic API (Claude Haiku 4.5)** on 2026-06-12 (decision: Mike).
**Why — the "Ollama unreachable" error was a mislabeled timeout, not reachability.**
The server VM `.30` (gururmm, `172.16.3.30`) reaches Beast fine for `/api/tags` and
short warm `/api/chat` (warm "say OK" = 1.1s), but a fleet-sized `/api/chat`
(~1500 log lines / ~17KB) never completes — it hit the curl 300s ceiling even warm.
Cause is qwen3:14b's minutes-long inference on a big prompt over a flaky cross-LAN
tailnet (`.30` is behind symmetric NAT — `MappingVariesByDestIP:true`; Beast is on
Wi-Fi `10.2.51.228`). reqwest's 120s timeout surfaced as
`error sending request ... Check Tailscale`, which read as "unreachable." Beast
also had a **duplicate-Ollama bind conflict** (the desktop tray app's `ollama serve`
couldn't bind 11434; the older standalone PID 14144 held `0.0.0.0:11434` and served)
— noisy but not the cause. See [[gururmm-beast-windows-build-host]] for Beast.
**The fix.** `analyze_logs_with_claude()` POSTs `https://api.anthropic.com/v1/messages`
with `x-api-key` from env, reading `ANTHROPIC_API_KEY` (required) and `ANTHROPIC_MODEL`
(default `claude-haiku-4-5`). Uses **structured outputs** (`output_config.format` +
json_schema) so the reply is guaranteed-parseable findings JSON (no fence stripping).
Cloud over plain HTTPS — no tailnet, no Beast. Validated end-to-end against Haiku
(200, ~1-6s, correct findings). `cargo check` clean.
**Secrets / privacy.** Key vaulted at `projects/gururmm/anthropic-api` (vault convention:
per-project key, mint its own). **ZDR requested from Anthropic, pending** — org-level,
not a console toggle (email sales@anthropic.com). Test fleet OK to run before ZDR
confirms; don't point a production fleet at it until ZDR is live.
**Deploy shape.** Production server is a **native binary** `/opt/gururmm/gururmm-server`
via systemd, `EnvironmentFile=/opt/gururmm/.env` (both root-only; `guru` can't write
.env or restart). A CI pipeline builds/ships the binary on commit (`[ci-version-bump]`
commits). `.30` has no cargo. So deploy = commit/push (CI builds binary) **+ a root
action on `.30`** to add `ANTHROPIC_API_KEY` (and optional `ANTHROPIC_MODEL`) to
`/opt/gururmm/.env` and restart `gururmm-server`.