harness: fleet-wide functional-error + correction + friction logging

Add .claude/scripts/log-skill-error.sh — the canonical agent error log helper
(writes errorlog.md in DATE | MACHINE | skill | [type] error format, soft-fails).
Three categories: execution failures (default), user corrections (--correction),
and preventable self-inflicted friction (--friction; cite ref= when it repeats a
documented gotcha). Goal: stop paying tokens twice for the same avoidable mistake.

- CLAUDE.md: make logging mandatory for all skills + corrections + friction.
- skill-creator: new skills must wire in the helper (guidance + checklist).
- Retrofit every skill script's genuine failure branches to call the helper
  (b2/bitdefender/mailprotector/packetdial/coord python CLIs; remediation-tool
  + onboard365 bash; vault, rmm-auth, post-bot-alert, agy, grok, 1password,
  run-onboarding-diagnostic). Handled conditions + self-tests left alone.
- errorlog.md: broaden header to cover skills + harness + corrections; seed this
  session's corrections (INKY, Mail.Send token-audience, omnibox-strictness) and
  friction (git-bash /tmp, env-persistence, argv-limit, PowerShell var-case).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-06-15 11:39:43 -07:00
parent 927a06a0cf
commit 9960da5f9a
29 changed files with 388 additions and 36 deletions

View File

@@ -98,6 +98,23 @@ After creating the files:
- Commands: Tell them to use `/{name}` or `/{name} arguments`
3. Remind them to update CLAUDE.md's Commands & Skills table if they want it documented there
## Mandatory: functional error logging
**Every skill MUST report genuine functional errors to `errorlog.md`** via the canonical
helper, so failures can be linted and fed back into skill improvements (CLAUDE.md core rule).
Bake this into the skill from the start:
- In each skill **script's failure branches** (API/auth failure, unexpected response,
validation error, unexpected non-zero exit), call:
```bash
bash "$ROOT/.claude/scripts/log-skill-error.sh" "<skill-name>" "<brief error>" --context "op=... http=..."
```
It stamps date+machine, inserts in the standard `YYYY-MM-DD | MACHINE | skill | error`
format, and soft-fails (never breaks the caller). Python skills shell out to the same helper.
- In the **SKILL.md**, add a line under workflow/guidelines: "On a functional error, log it via
`log-skill-error.sh` before surfacing it."
- Do NOT log expected/handled conditions (no results, no unread, user-declined) — only real failures.
## Quality Checklist
Before finalizing, verify:
@@ -107,6 +124,7 @@ Before finalizing, verify:
- [ ] File is in the correct location (`.claude/skills/` or `.claude/commands/`)
- [ ] Name uses kebab-case and is concise
- [ ] For skills with auto-triggers: triggers are specific enough to avoid false positives
- [ ] **Functional errors are logged** via `log-skill-error.sh` in the script's failure branches
## Tips for Good Skills/Commands