Files
claudetools/.claude/scripts/log-skill-error.sh
Mike Swanson 9960da5f9a harness: fleet-wide functional-error + correction + friction logging
Add .claude/scripts/log-skill-error.sh — the canonical agent error log helper
(writes errorlog.md in DATE | MACHINE | skill | [type] error format, soft-fails).
Three categories: execution failures (default), user corrections (--correction),
and preventable self-inflicted friction (--friction; cite ref= when it repeats a
documented gotcha). Goal: stop paying tokens twice for the same avoidable mistake.

- CLAUDE.md: make logging mandatory for all skills + corrections + friction.
- skill-creator: new skills must wire in the helper (guidance + checklist).
- Retrofit every skill script's genuine failure branches to call the helper
  (b2/bitdefender/mailprotector/packetdial/coord python CLIs; remediation-tool
  + onboard365 bash; vault, rmm-auth, post-bot-alert, agy, grok, 1password,
  run-onboarding-diagnostic). Handled conditions + self-tests left alone.
- errorlog.md: broaden header to cover skills + harness + corrections; seed this
  session's corrections (INKY, Mail.Send token-audience, omnibox-strictness) and
  friction (git-bash /tmp, env-persistence, argv-limit, PowerShell var-case).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 11:40:25 -07:00

90 lines
4.1 KiB
Bash

#!/usr/bin/env bash
# log-skill-error.sh — append an entry to errorlog.md in the canonical format,
# for later linting that feeds skill fixes, CLAUDE.md rules, and memory cleanup.
#
# Despite the name this is the GENERAL agent error/correction/friction log — it
# captures three things (see --type below):
# 1. skill/command FUNCTIONAL failures (API/auth/unexpected-response/bad-exit)
# 2. user CORRECTIONS of an improper assumption I made (--correction)
# 3. preventable self-inflicted FRICTION that wasted tokens (--friction) —
# harness/env/tool misuse, ESPECIALLY a repeat of an already-documented
# gotcha (that means a rule or memory isn't working and needs strengthening)
#
# Do NOT call it for expected/handled conditions (a search with no matches, a
# "no unread messages", a user declining a prompt) — only real, preventable,
# pattern-worthy events.
#
# Usage:
# bash log-skill-error.sh <skill-or-command> "<brief error>"
# echo "<brief error>" | bash log-skill-error.sh <skill-or-command>
# bash log-skill-error.sh <skill> "<error>" --context "op=send id=123 http=403"
# bash log-skill-error.sh <skill/context> "<what I wrongly assumed + the correction>" --correction
#
# Categories (all feed the lint that improves skills, CLAUDE.md, and memory):
# (default) execution failure — API/auth failure, unexpected response, bad exit.
# --correction — the USER corrected an improper assumption/approach I made.
# --friction — preventable self-inflicted error that wasted tokens (harness/env/
# tool misuse). If it repeats a documented gotcha, note it in
# --context (e.g. ref=feedback_tmp_path_windows) — that's the signal
# a rule/memory needs strengthening.
# (--type <other> also supported; tags the error column as [<type>].)
# bash log-skill-error.sh <context> "<what wasted tokens + the fix>" --friction --context "ref=<memory>"
#
# Writes: YYYY-MM-DD | MACHINE | <skill> | [<type>] <error> [ctx: <context>]
# (newest entry inserted at the top, just under the append marker).
#
# Soft-fail by design: this NEVER breaks the caller. Missing log, missing jq,
# empty message -> prints a [WARN] to stderr and exits 0.
set -u
ROOT="${CLAUDETOOLS_ROOT:-$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)}"
SKILL="${1:-unknown}"; shift || true
CONTEXT=""
ETYPE="" # "" / exec = execution failure; "correction" = user corrected a bad assumption
ARGS=()
while [ $# -gt 0 ]; do
case "$1" in
--context) CONTEXT="${2:-}"; shift 2;;
--type) ETYPE="${2:-}"; shift 2;;
--correction) ETYPE="correction"; shift;;
--friction) ETYPE="friction"; shift;;
*) ARGS+=("$1"); shift;;
esac
done
MSG="${ARGS[*]:-}"
if [ -z "$MSG" ] && [ ! -t 0 ]; then MSG="$(cat)"; fi
if [ -z "$MSG" ]; then echo "[WARN] log-skill-error: empty message, nothing logged" >&2; exit 0; fi
LOG="$ROOT/errorlog.md"
if [ ! -f "$LOG" ]; then echo "[WARN] log-skill-error: $LOG not found" >&2; exit 0; fi
DATE="$(date -u +%F)"
IDF="$ROOT/.claude/identity.json"
MACHINE=""
if command -v jq >/dev/null 2>&1 && [ -f "$IDF" ]; then
MACHINE="$(jq -r '.machine_name // .hostname // empty' "$IDF" 2>/dev/null)"
fi
[ -z "$MACHINE" ] && MACHINE="$(hostname 2>/dev/null || echo unknown)"
# normalize whitespace/newlines so each entry is one line
MSG="$(printf '%s' "$MSG" | tr '\n' ' ' | sed 's/[[:space:]]\{1,\}/ /g; s/^ //; s/ $//')"
[ -n "$CONTEXT" ] && MSG="$MSG [ctx: $CONTEXT]"
# Tag non-execution categories at the start of the error column for easy linting
# (e.g. grep "\[correction\]" errorlog.md to surface improper-assumption patterns).
if [ -n "$ETYPE" ] && [ "$ETYPE" != "exec" ]; then MSG="[$ETYPE] $MSG"; fi
ENTRY="$DATE | $MACHINE | $SKILL | $MSG"
MARK="<!-- Append entries below this line -->"
TMP="$LOG.tmp.$$"
if awk -v entry="$ENTRY" -v mark="$MARK" '
{ print }
($0==mark && !done) { print ""; print entry; done=1 }
END { if (!done) { print ""; print entry } } # marker missing -> append at end
' "$LOG" > "$TMP" 2>/dev/null && mv "$TMP" "$LOG" 2>/dev/null; then
echo "[OK] logged skill error to errorlog.md ($SKILL)"
else
rm -f "$TMP" 2>/dev/null
echo "[WARN] log-skill-error: could not write $LOG" >&2
fi
exit 0