harness: fleet-wide functional-error + correction + friction logging

Add .claude/scripts/log-skill-error.sh — the canonical agent error log helper
(writes errorlog.md in DATE | MACHINE | skill | [type] error format, soft-fails).
Three categories: execution failures (default), user corrections (--correction),
and preventable self-inflicted friction (--friction; cite ref= when it repeats a
documented gotcha). Goal: stop paying tokens twice for the same avoidable mistake.
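
For illustration only, the logging format and soft-fail behavior described above could be sketched roughly as follows. This is not the committed script: the function name `log_skill_error`, the `ERRORLOG_PATH` override, the literal type labels (`error`, `correction`, `friction`), and the way `--context` is appended are all assumptions made for the example.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a log-skill-error.sh-style helper (not the real script).
# Usage: log_skill_error <skill> <message> [--correction|--friction] [--context <text>]
log_skill_error() {
  local skill="${1:-unknown}" msg="${2:-}" type="error" context=""
  [ "$#" -ge 2 ] && shift 2 || shift "$#"

  # Parse the optional flags; anything unrecognized is ignored.
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --correction) type="correction" ;;
      --friction)   type="friction" ;;
      --context)    context="${2:-}"; [ "$#" -ge 2 ] && shift ;;
    esac
    shift
  done

  # Assumed: the log path can be overridden through ERRORLOG_PATH.
  local log="${ERRORLOG_PATH:-errorlog.md}"
  local machine line
  machine="$(hostname 2>/dev/null || echo unknown)"

  # DATE | MACHINE | skill | [type] error
  line="$(date +%Y-%m-%d) | ${machine} | ${skill} | [${type}] ${msg}"
  [ -n "$context" ] && line="${line} (${context})"

  # Soft-fail: a logging problem must never break the calling skill.
  { printf '%s\n' "$line" >>"$log"; } 2>/dev/null || true
  return 0
}
```

A friction entry that repeats a documented gotcha might then be recorded as `log_skill_error "vault" "used /tmp under git-bash" --friction --context "ref=git-bash-tmp"`, where the `ref=` value is a made-up placeholder.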

- CLAUDE.md: make logging mandatory for all skills + corrections + friction.
- skill-creator: new skills must wire in the helper (guidance + checklist).
- Retrofit every skill script's genuine failure branches to call the helper
  (b2/bitdefender/mailprotector/packetdial/coord python CLIs; remediation-tool
  + onboard365 bash; vault, rmm-auth, post-bot-alert, agy, grok, 1password,
  run-onboarding-diagnostic). Handled conditions + self-tests left alone.
- errorlog.md: broaden header to cover skills + harness + corrections; seed this
  session's corrections (INKY, Mail.Send token-audience, omnibox-strictness) and
  friction (git-bash /tmp, env-persistence, argv-limit, PowerShell var-case).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 11:39:43 -07:00
parent 927a06a0cf
commit 9960da5f9a
29 changed files with 388 additions and 36 deletions


@@ -86,4 +86,8 @@ if [ "$HTTP" = "200" ]; then
fi
echo "[WARNING] post-bot-alert: Discord returned ${HTTP:-no-response}${BODY}" >&2
# Log the Discord POST failure (non-200 / unreachable) once. Do NOT route this
# through post-bot-alert itself — that would recurse; log-skill-error.sh only
# writes to errorlog.md. Soft-fail preserved: this never changes the exit 0.
bash "$ROOT/.claude/scripts/log-skill-error.sh" "post-bot-alert" "Discord POST failed (non-200/unreachable)" --context "channel=${CHANNEL_NAME} http=${HTTP:-none} resp=${BODY:0:80}" >/dev/null 2>&1 || true
exit 0