
Mike Swanson 9960da5f9a harness: fleet-wide functional-error + correction + friction logging

Add .claude/scripts/log-skill-error.sh — the canonical agent error log helper
(writes errorlog.md in DATE | MACHINE | skill | [type] error format, soft-fails).
Three categories: execution failures (default), user corrections (--correction),
and preventable self-inflicted friction (--friction; cite ref= when it repeats a
documented gotcha). Goal: stop paying tokens twice for the same avoidable mistake.

- CLAUDE.md: make logging mandatory for all skills + corrections + friction.
- skill-creator: new skills must wire in the helper (guidance + checklist).
- Retrofit every skill script's genuine failure branches to call the helper
  (b2/bitdefender/mailprotector/packetdial/coord python CLIs; remediation-tool
  + onboard365 bash; vault, rmm-auth, post-bot-alert, agy, grok, 1password,
  run-onboarding-diagnostic). Handled conditions + self-tests left alone.
- errorlog.md: broaden header to cover skills + harness + corrections; seed this
  session's corrections (INKY, Mail.Send token-audience, omnibox-strictness) and
  friction (git-bash /tmp, env-persistence, argv-limit, PowerShell var-case).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

2026-06-15 11:40:25 -07:00

6.1 KiB


Error Log

Brief records of preventable, pattern-worthy events across the fleet, used to improve skills, write better CLAUDE.md rules, and clean up stale or misleading memory. The aim: never pay tokens twice for the same avoidable mistake. Append newest at the top; keep entries to 1-2 lines. Always write via the helper, never by hand:

bash .claude/scripts/log-skill-error.sh "<skill/context>" "<brief>" [--correction|--friction] [--context "k=v"]

Format: YYYY-MM-DD | MACHINE | command/skill/context | [type] error (brief) [ctx: ...]

Categories (the [type] tag): (none) = skill/command execution failure · [correction] = user corrected an improper assumption I made · [friction] = preventable self-inflicted token-waste (harness/env/tool misuse; cite a ref= in ctx when it repeats a documented gotcha — that flags a rule/memory to strengthen).
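The format and category rules above can be sketched in Python as a formatter and parser pair. This is only an illustration of the line layout: `format_entry` and `parse_entry` are hypothetical names, and the real helper is the bash script `log-skill-error.sh`, whose internals are not shown here.

```python
import re
from datetime import date

# Assumption: these are the only two explicit tags; no tag means execution failure.
TYPES = {"correction", "friction"}

LINE_RE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) \| (?P<machine>[^|]+?) \| (?P<skill>[^|]+?) \| "
    r"(?:\[(?P<kind>correction|friction)\] )?(?P<brief>.*?)"
    r"(?: \[ctx: (?P<ctx>[^\]]*)\])?$"
)


def format_entry(day, machine, skill, brief, kind=None, ctx=None):
    """Build one log line in the documented pipe-separated layout."""
    if kind is not None and kind not in TYPES:
        raise ValueError(f"unknown type tag: {kind!r}")
    tag = f"[{kind}] " if kind else ""
    suffix = f" [ctx: {ctx}]" if ctx else ""
    return f"{day.isoformat()} | {machine} | {skill} | {tag}{brief}{suffix}"


def parse_entry(line):
    """Return the fields of a log line as a dict, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


example = format_entry(
    date(2026, 6, 15), "GURU-5070", "bash/tmp-path",
    "jq failed (No such file)", kind="friction",
    ctx="ref=feedback_tmp_path_windows",
)
```

Parsing `example` back yields `kind == "friction"` and `ctx == "ref=feedback_tmp_path_windows"`; an untagged line parses with `kind` set to `None`, which corresponds to an execution failure.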

2026-06-15 | GURU-5070 | powershell/var-case | [friction] PowerShell vars are case-INSENSITIVE: $gUid silently overwrote $guid (GPO id), Set-ADObject hit a bad DN and left GPT.ini/AD versionNumber inconsistent until fixed. Never rely on case to distinguish PS variables

2026-06-15 | GURU-5070 | python/argv-limit | [friction] passed full /api/agents JSON (248 agents) as a python CLI arg -> 'Argument list too long' on Windows. Pipe large payloads via stdin, not argv
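A minimal sketch of the stdin approach named in the entry above, assuming the payload is a JSON array of agent objects. The `hostname` field and both function names are hypothetical; the actual /api/agents schema is not given here.

```python
import io
import json
import sys


def load_agents(stream=None):
    """Read a JSON array from a text stream (stdin by default), not from argv."""
    stream = sys.stdin if stream is None else stream
    payload = json.load(stream)
    if not isinstance(payload, list):
        raise ValueError("expected a JSON array of agents")
    return payload


def hostnames(agents):
    # Assumption: each agent may carry a "hostname" key (illustrative only).
    return sorted(a["hostname"] for a in agents if "hostname" in a)


# Demonstration: io.StringIO stands in for a piped stdin.
demo_stream = io.StringIO('[{"hostname": "pc-02"}, {"hostname": "pc-01"}, {"id": 7}]')
demo_hosts = hostnames(load_agents(demo_stream))
```

In a real script, `load_agents()` would be called with no argument and the producer's output piped into the process, so the payload size is not bounded by the OS argument-length limit.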

2026-06-15 | GURU-5070 | bash/env-persist | [friction] re-derived RMM token every call after $TOKEN/$RMM vanished between Bash tool calls - shell env does NOT persist across calls; must re-eval auth (or chain) in the same command

2026-06-15 | GURU-5070 | bash/tmp-path | [friction] wrote curl -o /tmp/x.json then jq read it back and failed (No such file) - Git-Bash vs Write/tool /tmp resolve differently. Pipe directly or use repo-relative paths. REPEAT of documented gotcha [ctx: ref=feedback_tmp_path_windows]

2026-06-15 | GURU-5070 | DMARC / DNS | [correction] assumed ACG's own INKY rua convention (reports-sg.inkydmarc.com) applied to a client domain; only use the INKY rua if THAT client is onboarded to INKY - otherwise plain p=none or a real mailbox

2026-06-15 | GURU-5070 | remediation-tool (sendMail) | [correction] assumed none of the consented apps could send mail and started granting Graph Mail.Send; the Exchange Operator app ALREADY had Graph Mail.Send - I was decoding the EXO-audience token, not a Graph-audience token. Mint a Graph token for the app before concluding a permission is missing
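The audience check described in the entry above can be sketched as follows. It decodes the JWT payload without verifying the signature (inspection only) and refuses to draw a conclusion when the token was minted for a different audience. The function names, the demo tokens, and the audience and role strings in the demo are illustrative assumptions, not values taken from the tenant.

```python
import base64
import json


def jwt_claims(token):
    """Decode the payload segment of a JWT. Does NOT verify the signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected three dot-separated JWT segments")
    segment = parts[1]
    segment += "=" * (-len(segment) % 4)  # restore stripped base64url padding
    return json.loads(base64.urlsafe_b64decode(segment))


def check_permission(token, expected_aud, permission):
    """Return 'wrong-audience', 'granted', or 'missing'."""
    claims = jwt_claims(token)
    if claims.get("aud") != expected_aud:
        # A token for another resource says nothing about this permission.
        return "wrong-audience"
    granted = set(claims.get("roles", [])) | set(claims.get("scp", "").split())
    return "granted" if permission in granted else "missing"


def make_demo_token(claims):
    """Build an unsigned, throwaway token for the demonstration only."""
    def b64(obj):
        raw = json.dumps(obj).encode()
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()
    return ".".join([b64({"alg": "none"}), b64(claims), "sig"])


GRAPH = "https://graph.microsoft.com"  # illustrative audience value
exo_token = make_demo_token({"aud": "https://outlook.office365.com",
                             "roles": ["Exchange.ManageAsApp"]})
graph_token = make_demo_token({"aud": GRAPH, "roles": ["Mail.Send"]})
exo_result = check_permission(exo_token, GRAPH, "Mail.Send")
graph_result = check_permission(graph_token, GRAPH, "Mail.Send")
```

Returning a distinct `wrong-audience` result, rather than `missing`, keeps the mistake in the entry from recurring silently: a missing permission is only reported once the token's `aud` matches the resource being checked.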

2026-06-15 | GURU-5070 | rmm-search | [correction] assumed the CLI search must replicate the UI Omnibox scoreMatch exactly; user wants a FLEXIBLE forgiving multi-field search optimized for first-try correctness, not UI parity

2026-06-15 | GURU-BEAST-ROG | /syncro (comment edit) | Syncro API does not expose a comment-edit or comment-delete endpoint — once posted, comments can only be modified via the GUI. Bot posted an internal resolution note with an unwanted "Performed by: ClaudeTools Discord Bot" line and could not remove it programmatically. Remediation needed: either suppress bot-attribution lines from internal notes by default, or add a GUI-edit step to the workflow when the note needs correction.

2026-06-14 | GURU-5070 | mailbox skill (Graph token) | FABB app fabb3421 (Claude-MSP-Access / "Cloud MSP Access") token request returned AADSTS700016 — app/SP no longer present in azcomputerguru.com tenant (deleted; gotchas.md already marked it deprecated). Blocks /mailbox + the M365 contacts task. Verified the remediation suite (live, ACG tenant) carries NO Mail.Send/Mail.ReadWrite/Contacts scopes (investigator has Mail.Read only) — so a straight repoint can't restore mailbox-send/contacts. Pending Mike decision: stand up a single-tenant ACG-internal mailbox app vs. add scopes to a suite tier. [2026-06-15] Docs hardened — gotchas.md now marks fabb3421 DELETED with the Mail/Contacts-scope blast radius + flags the 3 legacy "old app only" tenants (Valleywide/Dataforth/Cascades) as now having NO working remediation app (migration URGENT); mailbox.md carries a BLOCKED/AADSTS700016 banner. DECISION 2026-06-15 (Mike): Mail.Send goes into the suite (Exchange Operator tier) since its real use is IR victim-notification during mailbox takeovers; add Mail.Send to the exchange-op manifest + consent, repoint mailbox.md to exchange-op. Implementation not yet executed (production app change, needs go).

2026-06-14 | GURU-KALI | coord skill (coord.py) | Documented invocation py .claude/skills/coord/scripts/coord.py ... failed exit 127 — py (the Windows py-launcher) does not exist on Linux. Worked around with python3. [RESOLVED 2026-06-14] Added .claude/scripts/py.sh (resolves the working interpreter: identity.json python.command -> py -> python3 -> python, skipping the MS Store shim) and repointed all skill/command DOC invocations from bare py to bash "$CLAUDETOOLS_ROOT/.claude/scripts/py.sh". The .sh skill scripts already resolved internally — left untouched. Broadcast to fleet.
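The resolution order described in the entry above (configured command, then py, python3, python, skipping the MS Store shim) could look roughly like this in Python. It is a sketch of the ordering only, not the contents of `py.sh`: `resolve_python` is a hypothetical name, and treating any path under a `WindowsApps` directory as the Store shim is an assumption.

```python
import shutil


def resolve_python(configured=None, which=shutil.which):
    """Return the path of the first usable interpreter, or None.

    `configured` stands in for the python.command value from identity.json.
    `which` is injectable so the lookup can be exercised without touching PATH.
    """
    for name in (configured, "py", "python3", "python"):
        if not name:
            continue
        path = which(name)
        if path is None:
            continue
        if "WindowsApps" in path:
            # Assumption: this location holds the Microsoft Store stub,
            # which is not a working interpreter.
            continue
        return path
    return None


# Demonstration with a fake lookup table instead of the real PATH.
linux_like = {"python3": "/usr/bin/python3"}.get
demo_linux = resolve_python(which=linux_like)
```

On the Linux-like table, `py` is absent, so the sketch falls through to `python3`, which matches the workaround recorded in the entry.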

2026-06-14 | GURU-BEAST-ROG | coord skill (coord.py msg send) | py "$CLAUDETOOLS_ROOT/.claude/skills/coord/scripts/coord.py" failed — $CLAUDETOOLS_ROOT is not exported in fresh Git-bash shells here, so the path resolved under C:\Program Files\Git\. [RESOLVED 2026-06-14] Added .claude/scripts/ensure-settings-env.py (seeds env.CLAUDETOOLS_ROOT in per-machine settings.local.json from identity.json); Claude Code injects it into every Bash call. Wired into ONBOARDING.md + broadcast to fleet. Effective next session start.

2026-06-14 | GURU-BEAST-ROG | /sync (sync.sh Phase 3, submodule update) | submodule projects/msp-tools/guru-rmm checkout of f38da05 aborted: untracked docs/RMM_THOUGHTS.md would be overwritten. Parent repo synced fine; submodule pointer left lagging. Recurring transient. [RESOLVED 2026-06-15] sync.sh now has resolve_submodule_collisions() — on the abort it moves only the untracked files the incoming commit tracks aside to <file>.synced-aside-<UTCstamp> (content preserved, NOT --force) then retries once. Verified live: guru-rmm advanced ed92097->f38da05; the aside copy held 94 lines of un-committed 2026-06-08 thoughts (rescued, not lost — needs manual merge into canonical RMM_THOUGHTS.md).
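The move-aside step from the entry above can be sketched in Python. This covers only the renaming of colliding untracked files, not the detection of which files collide or the single retry of the submodule update. `move_aside` is a hypothetical name, and the timestamp layout is an assumption, since the entry only specifies `<UTCstamp>`.

```python
import os
import tempfile
from datetime import datetime, timezone


def move_aside(repo_dir, colliding_paths, now=None):
    """Rename each colliding file to <file>.synced-aside-<UTCstamp>.

    Content is preserved; nothing is deleted or force-overwritten.
    Returns a mapping of relative path -> new absolute path.
    """
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")  # assumed stamp format
    moved = {}
    for rel in colliding_paths:
        src = os.path.join(repo_dir, rel)
        if not os.path.isfile(src):
            continue  # nothing to rescue for this path
        dst = f"{src}.synced-aside-{stamp}"
        if os.path.exists(dst):
            raise FileExistsError(dst)  # never clobber an earlier aside copy
        os.rename(src, dst)
        moved[rel] = dst
    return moved


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as repo:
    os.makedirs(os.path.join(repo, "docs"))
    target = os.path.join(repo, "docs", "RMM_THOUGHTS.md")
    with open(target, "w", encoding="utf-8") as fh:
        fh.write("notes\n")
    fixed_now = datetime(2026, 6, 15, 18, 40, 25, tzinfo=timezone.utc)
    demo_moved = move_aside(repo, ["docs/RMM_THOUGHTS.md", "docs/absent.md"], now=fixed_now)
    demo_name = os.path.basename(demo_moved["docs/RMM_THOUGHTS.md"])
    demo_original_gone = not os.path.exists(target)
    with open(demo_moved["docs/RMM_THOUGHTS.md"], encoding="utf-8") as fh:
        demo_content = fh.read()
```

Renaming instead of forcing the checkout is what keeps uncommitted notes recoverable; the aside copy still has to be merged by hand afterwards, as the entry notes.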
